.. SPDX-License-Identifier: GPL-2.0
.. Copyright 2021-2023 Collabora Ltd.

========================
Exchanging pixel buffers
========================

As originally designed, the Linux graphics subsystem had extremely limited
support for sharing pixel-buffer allocations between processes, devices, and
subsystems. Modern systems require extensive integration between all three
classes; this document details how applications and kernel subsystems should
approach this sharing for two-dimensional image data.

It is written with reference to the DRM subsystem for GPU and display devices,
V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace
support; however, any other subsystems should also follow this design and advice.

Glossary of terms
=================

.. glossary::

    image:
      Conceptually a two-dimensional array of pixels. The pixels may be stored
      in one or more memory buffers. Has width and height in pixels, pixel
      format and modifier (implicit or explicit).

    row:
      A span along a single y-axis value, e.g. from co-ordinates (0,100) to
      (200,100).

    scanline:
      Synonym for row.

    column:
      A span along a single x-axis value, e.g. from co-ordinates (100,0) to
      (100,100).

    memory buffer:
      A piece of memory for storing (parts of) pixel data. Has stride and size
      in bytes and at least one handle in some API. May contain one or more
      planes.

    plane:
      A two-dimensional array of some or all of an image's color and alpha
      channel values.

    pixel:
      A picture element. Has a single color value which is defined by one or
      more color channel values, e.g. R, G and B, or Y, Cb and Cr. May also
      have an alpha value as an additional channel.

    pixel data:
      Bytes or bits that represent some or all of the color/alpha channel values
      of a pixel or an image. The data for one pixel may be spread over several
      planes or memory buffers depending on format and modifier.

    color value:
      A tuple of numbers, representing a color. Each element in the tuple is a
      color channel value.

    color channel:
      One of the dimensions in a color model. For example, the RGB model has
      channels R, G, and B. The alpha channel is sometimes counted as a color
      channel as well.

    pixel format:
      A description of how pixel data represents the pixel's color and alpha
      values.

    modifier:
      A description of how pixel data is laid out in memory buffers.

    alpha:
      A value that denotes the color coverage in a pixel. Sometimes used for
      translucency instead.

    stride:
      A value that denotes the relationship between pixel-location co-ordinates
      and byte-offset values. Typically used as the byte offset between two
      pixels at the start of vertically-consecutive tiling blocks. For linear
      layouts, the byte offset between two vertically-adjacent pixels. For
      non-linear formats the stride must be computed in a consistent way, which
      is usually done as if the layout were linear.

    pitch:
      Synonym for stride.

Formats and modifiers
=====================

Each buffer must have an underlying format. This format describes the color
values provided for each pixel. Although each subsystem has its own format
descriptions (e.g. V4L2 and fbdev), the ``DRM_FORMAT_*`` tokens should be reused
wherever possible, as they are the standard descriptions used for interchange.
These tokens are described in the ``drm_fourcc.h`` file, which is a part of
DRM's uAPI.

Each ``DRM_FORMAT_*`` token describes the translation between a pixel
co-ordinate in an image, and the color values for that pixel contained within
its memory buffers. The number and type of color channels are described:
whether they are RGB or YUV, integer or floating-point, the size of each channel
and their locations within the pixel memory, and the relationship between color
planes.

For example, ``DRM_FORMAT_ARGB8888`` describes a format in which each pixel has
a single 32-bit value in memory. Alpha, red, green, and blue color channels are
available at 8-bit precision per channel, ordered respectively from most to
least significant bits in little-endian storage. ``DRM_FORMAT_*`` is not
affected by either CPU or device endianness; the byte pattern in memory is
always as described in the format definition, which is usually little-endian.
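
The byte pattern described above can be illustrated with a minimal sketch,
written in Python purely for brevity. ``fourcc_code`` mirrors the macro of the
same name in ``drm_fourcc.h``; ``pack_argb8888`` is a hypothetical helper
introduced only for this example and is not part of any API.

```python
import struct


def fourcc_code(a, b, c, d):
    """Build a 32-bit format token from four ASCII characters.

    The first character ends up in the least significant byte.
    """
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


DRM_FORMAT_ARGB8888 = fourcc_code("A", "R", "2", "4")


def pack_argb8888(alpha, red, green, blue):
    """Hypothetical helper: return the four bytes stored for one pixel."""
    value = (alpha << 24) | (red << 16) | (green << 8) | blue
    # "<I" forces little-endian storage regardless of the host CPU.
    return struct.pack("<I", value)


pixel = pack_argb8888(0xFF, 0x11, 0x22, 0x33)
print(hex(DRM_FORMAT_ARGB8888), pixel.hex())
```

In this sketch the blue channel occupies the lowest byte address and alpha the
highest, whichever CPU writes the value.
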

As a more complex example, ``DRM_FORMAT_NV12`` describes a format in which luma
and chroma YUV samples are stored in separate planes, where the chroma plane is
stored at half the resolution in both dimensions (i.e. one U/V chroma
sample is stored for each 2x2 pixel grouping).

Format modifiers describe a translation mechanism between these per-pixel memory
samples, and the actual memory storage for the buffer. The most straightforward
modifier is ``DRM_FORMAT_MOD_LINEAR``, describing a scheme in which each plane
is laid out row-sequentially, from the top-left to the bottom-right corner.
This is considered the baseline interchange format, and most convenient for CPU
access.

Modern hardware employs much more sophisticated access mechanisms, typically
making use of tiled access and possibly also compression. For example, the
``DRM_FORMAT_MOD_VIVANTE_TILED`` modifier describes memory storage where pixels
are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in
a plane stores pixels (0,0) to (3,3) inclusive, and the second tile in a plane
stores pixels (4,0) to (7,3) inclusive.
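
To make the difference between linear and tiled storage concrete, the
following sketch computes the byte offset of a pixel in a plane stored as
row-major 4x4 tiles. It is an illustration only: the function names are
hypothetical, the plane width is assumed to be a multiple of four, and the
pixels inside each tile are assumed to be stored row by row, which the text
above does not specify.

```python
TILE_W = TILE_H = 4  # tile size in pixels, as in the 4x4 example above


def tiled_4x4_offset(x, y, width, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a plane made of row-major 4x4 tiles.

    Assumptions: `width` is a multiple of 4, and pixels inside a tile are
    themselves stored row by row.
    """
    if width % TILE_W:
        raise ValueError("width must be a multiple of the tile width")
    tiles_per_row = width // TILE_W
    tile_index = (y // TILE_H) * tiles_per_row + (x // TILE_W)
    index_in_tile = (y % TILE_H) * TILE_W + (x % TILE_W)
    return (tile_index * TILE_W * TILE_H + index_in_tile) * bytes_per_pixel


def linear_offset(x, y, width, bytes_per_pixel):
    """Offset of the same pixel in a tightly packed linear plane."""
    return (y * width + x) * bytes_per_pixel


# Pixel (4,0) starts the second tile, so it lies one full tile (16 pixels)
# into the plane rather than 4 pixels in, as it would in a linear layout.
print(tiled_4x4_offset(4, 0, 8, 4), linear_offset(4, 0, 8, 4))
```

The same co-ordinate therefore maps to different bytes under the two layouts,
which is why a consumer that does not know the modifier cannot interpret the
buffer.
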

Some modifiers may modify the number of planes required for an image; for
example, the ``I915_FORMAT_MOD_Y_TILED_CCS`` modifier adds a second plane to RGB
formats in which it stores data about the status of every tile, notably
including whether the tile is fully populated with pixel data, or can be
expanded from a single solid color.

These extended layouts are highly vendor-specific, and even specific to
particular generations or configurations of devices per-vendor. For this reason,
support of modifiers must be explicitly enumerated and negotiated by all users
in order to ensure a compatible and optimal pipeline, as discussed below.

Dimensions and size
===================

Each pixel buffer must be accompanied by logical pixel dimensions. This refers
to the number of unique samples which can be extracted from, or stored to, the
underlying memory storage. For example, even though a 1920x1080
``DRM_FORMAT_NV12`` buffer has a luma plane containing 1920x1080 samples for the Y
component, and a chroma plane containing 960x540 samples for the U and V
components, the overall buffer is still described as having dimensions of
1920x1080.

The in-memory storage of a buffer is not guaranteed to begin immediately at the
base address of the underlying memory, nor is it guaranteed that the memory
storage is tightly clipped to either dimension.

Each plane must therefore be described with an ``offset`` in bytes, which will be
added to the base address of the memory storage before performing any per-pixel
calculations. This may be used to combine multiple planes into a single memory
buffer; for example, ``DRM_FORMAT_NV12`` may be stored in a single memory buffer
where the luma plane's storage begins immediately at the start of the buffer
with an offset of 0, and the chroma plane's storage follows within the same buffer
beginning from the byte offset for that plane.

Each plane must also have a ``stride`` in bytes, expressing the offset in memory
between two contiguous rows. For example, a ``DRM_FORMAT_MOD_LINEAR`` buffer
with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in
order to allow for aligned access patterns. In this case, the buffer will still
be described with a width of 1000; however, the stride will be ``1024 * bpp``
(where ``bpp`` is the number of bytes per pixel), indicating that there are 24
pixels at the positive extreme of the x axis whose values are not significant.
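
The relationship between width, stride, and offset in a linear plane can be
sketched as follows. The helper names are hypothetical, and both the 64-pixel
alignment and the 4 bytes per pixel are assumptions chosen to reproduce the
1000-to-1024 example above; real alignment requirements are device-specific.

```python
def align_up(value, alignment):
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def linear_pixel_offset(x, y, plane_offset, stride, bytes_per_pixel):
    """Byte offset of pixel (x, y) within the memory buffer, linear layout."""
    return plane_offset + y * stride + x * bytes_per_pixel


width, height = 1000, 1000
bytes_per_pixel = 4                                   # assumed 32-bit format
stride = align_up(width, 64) * bytes_per_pixel        # assumed 64-pixel alignment
padding_pixels = stride // bytes_per_pixel - width    # insignificant pixels per row

print(stride, padding_pixels)
print(linear_pixel_offset(0, 1, 0, stride, bytes_per_pixel))
```

Note that the second row begins one full stride into the plane, not
``width * bytes_per_pixel`` bytes in; code that assumes tightly packed rows
will read the padding as image data.
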

Buffers may also be padded further in the y dimension, simply by allocating a
larger area than would ordinarily be required. For example, many media decoders
are not able to natively output buffers of height 1080, but instead require an
effective height of 1088 pixels. In this case, the buffer continues to be
described as having a height of 1080, with the memory allocation for each buffer
being increased to account for the extra padding.
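
Putting offset, stride, and height padding together, the sketch below lays
out an 8-bit ``DRM_FORMAT_NV12`` image in a single memory buffer. The function
name and the alignment parameters are hypothetical; the 64-byte stride
alignment and 16-row height alignment are assumptions chosen to match the
1080-to-1088 example, and even image dimensions are assumed.

```python
def align_up(value, alignment):
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def nv12_single_buffer_layout(width, height, stride_align=1, height_align=1):
    """Describe an 8-bit NV12 image packed into one memory buffer.

    Assumes even width and height. The logical dimensions are reported
    unchanged; padding only shows up in stride, offset, and size.
    """
    stride = align_up(width, stride_align)          # one byte per luma sample
    padded_height = align_up(height, height_align)
    luma_size = stride * padded_height
    # The chroma plane holds width/2 interleaved Cb/Cr pairs (2 bytes each)
    # per row, so its stride equals the luma stride, with half as many rows.
    chroma_size = stride * (padded_height // 2)
    return {
        "width": width,
        "height": height,
        "planes": [
            {"offset": 0, "stride": stride},
            {"offset": luma_size, "stride": stride},
        ],
        "size": luma_size + chroma_size,
    }


layout = nv12_single_buffer_layout(1920, 1080, stride_align=64, height_align=16)
print(layout)
```

The reported height stays 1080 even though the chroma plane begins after 1088
rows of luma storage.
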

Enumeration
===========

Every user of pixel buffers must be able to enumerate a set of supported formats
and modifiers, described together. Within KMS, this is achieved with the
``IN_FORMATS`` property on each DRM plane, listing the supported DRM formats, and
the modifiers supported for each format. In userspace, this is supported through
the `EGL_EXT_image_dma_buf_import_modifiers`_ extension entrypoints for EGL, the
`VK_EXT_image_drm_format_modifier`_ extension for Vulkan, and the
`zwp_linux_dmabuf_v1`_ extension for Wayland.

Each of these interfaces allows users to query a set of supported
format+modifier combinations.
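
As an illustration of what such an enumeration yields, the sketch below
decodes an ``IN_FORMATS`` property blob into a mapping from format to
supported modifiers. The field layout follows ``struct drm_format_modifier_blob``
and ``struct drm_format_modifier`` from the DRM uAPI headers; the function
name, the sample blob, and the ``EXAMPLE_TILED`` modifier value are made up
for this example, and obtaining the blob from the kernel is not shown.

```python
import struct

# version, flags, count_formats, formats_offset, count_modifiers, modifiers_offset
BLOB_HEADER = struct.Struct("=6I")
# formats (bitmask), offset (index of the first format covered), pad, modifier
MOD_ENTRY = struct.Struct("=QIIQ")


def parse_in_formats(blob):
    """Return {format: set(modifiers)} decoded from an IN_FORMATS blob."""
    _version, _flags, n_formats, formats_off, n_mods, mods_off = \
        BLOB_HEADER.unpack_from(blob, 0)
    formats = struct.unpack_from("=%dI" % n_formats, blob, formats_off)
    supported = {fmt: set() for fmt in formats}
    for i in range(n_mods):
        mask, first, _pad, modifier = MOD_ENTRY.unpack_from(
            blob, mods_off + i * MOD_ENTRY.size)
        # Bit N of the mask refers to the format at index first + N.
        for bit in range(64):
            index = first + bit
            if mask & (1 << bit) and index < n_formats:
                supported[formats[index]].add(modifier)
    return supported


ARGB8888, NV12 = 0x34325241, 0x3231564E
LINEAR = 0
EXAMPLE_TILED = (0x7F << 56) | 1  # made-up modifier, not a real token

# Fabricated blob: two formats; LINEAR for both, EXAMPLE_TILED for ARGB8888 only.
sample = (BLOB_HEADER.pack(1, 0, 2, 24, 2, 32)
          + struct.pack("=2I", ARGB8888, NV12)
          + MOD_ENTRY.pack(0b11, 0, 0, LINEAR)
          + MOD_ENTRY.pack(0b01, 0, 0, EXAMPLE_TILED))
supported = parse_in_formats(sample)
print(supported)
```
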

Negotiation
===========

It is the responsibility of userspace to negotiate an acceptable format+modifier
combination for its usage. This is performed through a simple intersection of
lists. For example, if a user wants to use Vulkan to render an image to be
displayed on a KMS plane, it must:

- query KMS for the ``IN_FORMATS`` property for the given plane
- query Vulkan for the supported formats for its physical device, making sure
  to pass the ``VkImageUsageFlagBits`` and ``VkImageCreateFlagBits``
  corresponding to the intended rendering use
- intersect these formats to determine the most appropriate one
- for this format, intersect the lists of supported modifiers for both KMS and
  Vulkan, to obtain a final list of acceptable modifiers for that format

This intersection must be performed for all usages. For example, if the user
also wishes to encode the image to a video stream, it must query the media API
it intends to use for encoding for the set of modifiers it supports, and
additionally intersect against this list.

If the intersection of all lists is an empty list, it is not possible to share
buffers in this way, and an alternate strategy must be considered (e.g. using
CPU access routines to copy data between the different uses, with the
corresponding performance cost).

The resulting modifier list is unsorted; the order is not significant.
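
The intersection described above can be sketched as follows, assuming each
user's capabilities have already been queried into a mapping from format token
to a set of modifiers. The ``negotiate`` function and the ``TILED_A`` and
``TILED_B`` modifier values are hypothetical.

```python
def negotiate(*users):
    """Intersect per-user {format: set(modifiers)} mappings.

    Returns only the formats for which at least one modifier is acceptable
    to every user. An empty result means buffers cannot be shared directly.
    """
    if not users:
        return {}
    common_formats = set(users[0])
    for user in users[1:]:
        common_formats &= set(user)
    result = {}
    for fmt in common_formats:
        modifiers = set.intersection(*(set(user[fmt]) for user in users))
        if modifiers:
            result[fmt] = modifiers
    return result


ARGB8888, NV12 = 0x34325241, 0x3231564E
LINEAR = 0
TILED_A, TILED_B = (0x7F << 56) | 1, (0x7F << 56) | 2  # made-up modifiers

kms = {ARGB8888: {LINEAR, TILED_A}, NV12: {LINEAR}}
vulkan = {ARGB8888: {LINEAR, TILED_A, TILED_B}}
encoder = {ARGB8888: {LINEAR}, NV12: {LINEAR}}

print(negotiate(kms, vulkan))
print(negotiate(kms, vulkan, encoder))
```

Using sets reflects the fact that the order of the resulting modifier list
carries no meaning; the allocator, not the caller, decides which acceptable
modifier is best.
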

Allocation
==========

Once userspace has determined an appropriate format, and corresponding list of
acceptable modifiers, it must allocate the buffer. As there is no universal
buffer-allocation interface available at either kernel or userspace level, the
client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or
a media API.

Each allocation request must take, at a minimum: the pixel format, a list of
acceptable modifiers, and the buffer's width and height. Each API may extend
this set of properties in different ways, such as allowing allocation in more
than two dimensions, intended usage patterns, etc.

The component which allocates the buffer will make an arbitrary choice of what
it considers the 'best' modifier within the acceptable list for the requested
allocation, any padding required, and further properties of the underlying
memory buffers such as whether they are stored in system or device-specific
memory, whether or not they are physically contiguous, and their cache mode.
These properties of the memory buffer are not visible to userspace; however, the
``dma-heaps`` API is an effort to address this.

After allocation, the client must query the allocator to determine the actual
modifier selected for the buffer, as well as the per-plane offset and stride.
Allocators are not permitted to vary the format in use, to select a modifier not
provided within the acceptable list, nor to vary the pixel dimensions other than
the padding expressed through offset, stride, and size.
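
The constraints on allocators listed above can be expressed as a small
consistency check that a client might run on the values it queries back after
allocation. This is a sketch only: the ``AllocationRequest`` and
``AllocationResult`` types, the ``contract_violations`` function, and the
modifier values are hypothetical and do not correspond to any real allocation
API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AllocationRequest:
    format: int
    width: int
    height: int
    modifiers: FrozenSet[int]          # acceptable modifiers


@dataclass(frozen=True)
class AllocationResult:
    format: int
    width: int
    height: int
    modifier: int                      # modifier actually selected
    planes: Tuple[Tuple[int, int], ...]  # (offset, stride) per plane


def contract_violations(request, result):
    """List the ways in which `result` breaks the allocator rules above."""
    problems = []
    if result.format != request.format:
        problems.append("format changed")
    if (result.width, result.height) != (request.width, request.height):
        problems.append("pixel dimensions changed")
    if result.modifier not in request.modifiers:
        problems.append("modifier not in acceptable list")
    return problems


LINEAR, TILED_A, TILED_B = 0, (0x7F << 56) | 1, (0x7F << 56) | 2  # made up
request = AllocationRequest(0x34325241, 1000, 1000, frozenset({LINEAR, TILED_A}))
good = AllocationResult(0x34325241, 1000, 1000, TILED_A, ((0, 4096),))
bad = AllocationResult(0x34325241, 1000, 1000, TILED_B, ((0, 4096),))

print(contract_violations(request, good), contract_violations(request, bad))
```

Padding is deliberately not checked: a stride of 4096 bytes for a
1000-pixel-wide buffer is within the rules, because padding is expressed only
through offset, stride, and size.
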

Communicating additional constraints, such as alignment of stride or offset,
placement within a particular memory area, etc, is out of scope of dma-buf,
and is not solved by format and modifier tokens.

Import
======

To use a buffer within a different context, device, or subsystem, the user
passes these parameters (format, modifier, width, height, and per-plane offset
and stride) to an importing API.

Each memory buffer is referred to by a buffer handle, which may be unique or
duplicated within an image. For example, a ``DRM_FORMAT_NV12`` buffer may have
the luma and chroma buffers combined into a single memory buffer by use of the
per-plane offset parameters, or they may be completely separate allocations in
memory. For this reason, each import and allocation API must provide a separate
handle for each plane.

Each kernel subsystem has its own types and interfaces for buffer management.
DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types
are not portable between contexts, processes, devices, or subsystems.

To address this, ``dma-buf`` handles are used as the universal interchange for
buffers. Subsystem-specific operations are used to export native buffer handles
to a ``dma-buf`` file descriptor, and to import those file descriptors into a
native buffer handle. dma-buf file descriptors can be transferred between
contexts, processes, devices, and subsystems.

For example, a Wayland media player may use V4L2 to decode a video frame into a
``DRM_FORMAT_NV12`` buffer. This will result in two memory planes (luma and
chroma) being dequeued by the user from V4L2. These planes are then exported to
one dma-buf file descriptor per plane; these descriptors are then sent along
with the metadata (format, modifier, width, height, per-plane offset and stride)
to the Wayland server. The Wayland server will then import these file
descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use
through Vulkan, or a KMS framebuffer object; each of these import operations
will take the same metadata and convert the dma-buf file descriptors into their
native buffer handles.

Having a non-empty intersection of supported modifiers does not guarantee that
import will succeed into all consumers; they may have constraints beyond those
implied by modifiers which must be satisfied.
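
The metadata that accompanies the file descriptors can be pictured as a small
record with one entry per plane. The sketch below is illustrative only: the
``Plane`` and ``ImportParams`` types are hypothetical, and the file descriptor
numbers are placeholders rather than real dma-buf descriptors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Plane:
    fd: int       # dma-buf file descriptor backing this plane
    offset: int   # byte offset of the plane within that buffer
    stride: int   # byte stride of the plane


@dataclass(frozen=True)
class ImportParams:
    format: int
    modifier: int
    width: int
    height: int
    planes: Tuple[Plane, ...]   # always one handle per plane


NV12, LINEAR = 0x3231564E, 0

# Luma and chroma share one memory buffer: the same handle appears twice and
# the chroma plane is located through its offset.
combined = ImportParams(NV12, LINEAR, 1920, 1080, (
    Plane(fd=5, offset=0, stride=1920),
    Plane(fd=5, offset=1920 * 1080, stride=1920),
))

# Luma and chroma are separate allocations: each plane has its own handle
# and starts at offset 0 of its buffer.
separate = ImportParams(NV12, LINEAR, 1920, 1080, (
    Plane(fd=5, offset=0, stride=1920),
    Plane(fd=6, offset=0, stride=1920),
))

print(len(combined.planes), len(separate.planes))
```

Both descriptions have the same shape, so an importer does not need to know in
advance whether the planes share an allocation. Note that two different
descriptor numbers may still refer to the same underlying buffer if one was
duplicated, so descriptor equality is not a reliable test of buffer identity.
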

Implicit modifiers
==================

The concept of modifiers post-dates all of the subsystems mentioned above. As
such, it has been retrofitted into all of these APIs, and in order to ensure
backwards compatibility, support is needed for drivers and userspace which do
not (yet) support modifiers.

As an example, GBM is used to allocate buffers to be shared between EGL for
rendering and KMS for display. It has two entrypoints for allocating buffers:
``gbm_bo_create`` which only takes the format, width, height, and a usage token,
and ``gbm_bo_create_with_modifiers`` which extends this with a list of modifiers.

In the latter case, the allocation is as discussed above, being provided with a
list of acceptable modifiers that the implementation can choose from (or fail if
it is not possible to allocate within those constraints). In the former case
where modifiers are not provided, the GBM implementation must make its own
choice as to what is likely to be the 'best' layout. Such a choice is entirely
implementation-specific: some will internally use tiled layouts which are not
CPU-accessible if the implementation decides that is a good idea through
whatever heuristic. It is the implementation's responsibility to ensure that
this choice is appropriate.

To support this case where the layout is not known because there is no awareness
of modifiers, a special ``DRM_FORMAT_MOD_INVALID`` token has been defined. This
pseudo-modifier declares that the layout is not known, and that the driver
should use its own logic to determine what the underlying layout may be.

.. note::

  ``DRM_FORMAT_MOD_INVALID`` is a non-zero value. The modifier value zero is
  ``DRM_FORMAT_MOD_LINEAR``, which is an explicit guarantee that the image
  has the linear layout. Care and attention should be taken to ensure that
  zero as a default value is not mixed up with either no modifier or the linear
  modifier. Also note that in some APIs the invalid modifier value is specified
  with an out-of-band flag, like in ``DRM_IOCTL_MODE_ADDFB2``.
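
The pitfall described in the note can be shown with a short sketch. The
constants mirror the definitions in ``drm_fourcc.h``; ``normalize_modifier``
is a hypothetical helper that treats ``None`` as "no modifier supplied".

```python
def fourcc_mod_code(vendor, value):
    """Combine an 8-bit vendor ID with a 56-bit vendor-specific value."""
    return (vendor << 56) | (value & 0x00FFFFFFFFFFFFFF)


DRM_FORMAT_MOD_VENDOR_NONE = 0
DRM_FORMAT_RESERVED = (1 << 56) - 1

DRM_FORMAT_MOD_LINEAR = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NONE, 0)
DRM_FORMAT_MOD_INVALID = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NONE,
                                         DRM_FORMAT_RESERVED)


def normalize_modifier(modifier):
    """Hypothetical helper: map "no modifier supplied" (None) to INVALID.

    The test is deliberately `is None`. A truthiness test such as
    `modifier or DRM_FORMAT_MOD_INVALID` would also replace the linear
    modifier, because DRM_FORMAT_MOD_LINEAR is zero.
    """
    return DRM_FORMAT_MOD_INVALID if modifier is None else modifier


print(hex(DRM_FORMAT_MOD_LINEAR), hex(DRM_FORMAT_MOD_INVALID))
```

The same care applies to zero-initialised structures in C: a modifier field
left at zero asserts a linear layout rather than an unknown one.
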

There are four cases where this token may be used:

- during enumeration, an interface may return ``DRM_FORMAT_MOD_INVALID``, either
  as the sole member of a modifier list to declare that explicit modifiers are
  not supported, or as part of a larger list to declare that implicit modifiers
  may be used
- during allocation, a user may supply ``DRM_FORMAT_MOD_INVALID``, either as the
  sole member of a modifier list (equivalent to not supplying a modifier list
  at all) to declare that explicit modifiers are not supported and must not be
  used, or as part of a larger list to declare that an allocation using implicit
  modifiers is acceptable
- in a post-allocation query, an implementation may return
  ``DRM_FORMAT_MOD_INVALID`` as the modifier of the allocated buffer to declare
  that the underlying layout is implementation-defined and that an explicit
  modifier description is not available; per the above rules, this may only be
  returned when the user has included ``DRM_FORMAT_MOD_INVALID`` as part of the
  list of acceptable modifiers, or not provided a list
- when importing a buffer, the user may supply ``DRM_FORMAT_MOD_INVALID`` as the
  buffer modifier (or not supply a modifier) to indicate that the modifier is
  unknown for whatever reason; this is only acceptable when the buffer has
  not been allocated with an explicit modifier

It follows from this that for any single buffer, the complete chain of operations
formed by the producer and all the consumers must be either fully implicit or fully
explicit. For example, if a user wishes to allocate a buffer for use between
GPU, display, and media, but the media API does not support modifiers, then the
user **must not** allocate the buffer with explicit modifiers and attempt to
import the buffer into the media API with no modifier, but either perform the
allocation using implicit modifiers, or allocate the buffer for media use
separately and copy between the two buffers.

As one exception to the above, allocations may be 'upgraded' from implicit
to explicit modifiers. For example, if the buffer is allocated with
``gbm_bo_create`` (taking no modifiers), the user may then query the modifier with
``gbm_bo_get_modifier`` and then use this modifier as an explicit modifier token
if a valid modifier is returned.
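
One possible reading of the import rule and the 'upgrade' exception is
sketched below. The function and its parameters are hypothetical, and the
logic is an interpretation of the text above rather than a normative
statement.

```python
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF


def import_modifier_allowed(allocated_explicitly, queried_modifier,
                            import_modifier):
    """Decide whether `import_modifier` may be passed to an importer.

    allocated_explicitly: True if the buffer was allocated with an explicit
        modifier, False if it was allocated implicitly.
    queried_modifier: the modifier the allocator reports after allocation.
    import_modifier: the modifier the user intends to pass on import.
    """
    if import_modifier == DRM_FORMAT_MOD_INVALID:
        # Implicit import is only acceptable for implicitly allocated buffers.
        return not allocated_explicitly
    # Explicit import, including the implicit-to-explicit upgrade, needs a
    # valid queried modifier, and that modifier must be the one passed on.
    return (queried_modifier != DRM_FORMAT_MOD_INVALID
            and import_modifier == queried_modifier)


TILED_A, TILED_B = (0x7F << 56) | 1, (0x7F << 56) | 2  # made-up modifiers

# Explicit allocation, then import without a modifier: not allowed.
print(import_modifier_allowed(True, TILED_A, DRM_FORMAT_MOD_INVALID))
# Implicit allocation whose modifier query returned a valid token: upgrade.
print(import_modifier_allowed(False, TILED_A, TILED_A))
```
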

When allocating buffers for exchange between different users and modifiers are
not available, implementations are strongly encouraged to use
``DRM_FORMAT_MOD_LINEAR`` for their allocation, as this is the universal baseline
for exchange. However, it is not guaranteed that this will result in the correct
interpretation of buffer content, as implicit modifier operation may still be
subject to driver-specific heuristics.

Any new users - userspace programs and protocols, kernel subsystems, etc -
wishing to exchange buffers must offer interoperability through dma-buf file
descriptors for memory planes, DRM format tokens to describe the format, DRM
format modifiers to describe the layout in memory, at least width and height for
dimensions, and at least offset and stride for each memory plane.

.. _zwp_linux_dmabuf_v1: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml
.. _VK_EXT_image_drm_format_modifier: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_image_drm_format_modifier.html
.. _EGL_EXT_image_dma_buf_import_modifiers: https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import_modifiers.txt