From: Robert Beckett <bob.beckett@collabora.com>
To: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] doc: gpu: Add document describing buffer exchange
Date: Mon, 6 Sep 2021 18:13:13 +0100
Message-ID: <72c53fc3-1714-3b24-3a7c-8ee4e72574c1@collabora.com>
In-Reply-To: <20210905122742.86029-1-daniels@collabora.com>



On 05/09/2021 13:27, Daniel Stone wrote:
> Since there's a lot of confusion around this, document both the rules
> and the best practice around negotiating, allocating, importing, and
> using buffers when crossing context/process/device/subsystem boundaries.
> 
> This ties up all of dmabuf, formats and modifiers, and their usage.
> 
> Signed-off-by: Daniel Stone <daniels@collabora.com>
> ---
> 
> This is just a quick first draft, inspired by:
>    https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3197#note_1048637
> 
> It's not complete or perfect, but I'm off to eat a roast then have a
> nice walk in the sun, so figured it'd be better to dash it off rather
> than let it rot on my hard drive.
> 
> 
>   .../gpu/exchanging-pixel-buffers.rst          | 285 ++++++++++++++++++
>   Documentation/gpu/index.rst                   |   1 +
>   2 files changed, 286 insertions(+)
>   create mode 100644 Documentation/gpu/exchanging-pixel-buffers.rst
> 
> diff --git a/Documentation/gpu/exchanging-pixel-buffers.rst b/Documentation/gpu/exchanging-pixel-buffers.rst
> new file mode 100644
> index 000000000000..75c4de13d5c8
> --- /dev/null
> +++ b/Documentation/gpu/exchanging-pixel-buffers.rst
> @@ -0,0 +1,285 @@
> +.. Copyright 2021 Collabora Ltd.
> +
> +========================
> +Exchanging pixel buffers
> +========================
> +
> +As originally designed, the Linux graphics subsystem had extremely limited
> +support for sharing pixel-buffer allocations between processes, devices, and
> +subsystems. Modern systems require extensive integration between all three
> +classes; this document details how applications and kernel subsystems should
> +approach this sharing for two-dimensional image data.
> +
> +It is written with reference to the DRM subsystem for GPU and display devices,
> +V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace
> +support, however any other subsystems should also follow this design and advice.
> +
> +
> +Formats and modifiers
> +=====================
> +
> +Each buffer must have an underlying format. This format describes the data which
> +can be stored and loaded for each pixel. Although each subsystem has its own
> +format descriptions (e.g. V4L2 and fbdev), the `DRM_FORMAT_*` tokens should be
> +reused wherever possible, as they are the standard descriptions used for
> +interchange.
> +
> +Each `DRM_FORMAT_*` token describes the per-pixel data available, in terms of
> +the translation between one or more pixels in memory, and the color data
> +contained within that memory. The number and type of color channels are
> +described: whether they are RGB or YUV, integer or floating-point, the size
> +of each channel and their locations within the pixel memory, and the
> +relationship between color planes.
> +
> +For example, `DRM_FORMAT_ARGB8888` describes a format in which each pixel has a
> +single 32-bit value in memory. Alpha, red, green, and blue color channels are
> +available at 8-byte precision per channel, ordered respectively from most to

I think you meant 8-bit precision there, not 8-byte.

> +least significant bits in little-endian storage. As a more complex example,
> +`DRM_FORMAT_NV12` describes a format in which luma and chroma YUV samples are
> +stored in separate memory planes, where the chroma plane is stored at half the
> +resolution in both dimensions (i.e. one U/V chroma sample is stored for each 2x2
> +pixel grouping).
> +
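
A tiny worked example might help readers check their understanding of the ARGB8888 description. This is only a sketch in Python; the function names are mine, and the only thing it assumes from the text is "A in the top byte, B in the bottom byte, little-endian storage":

```python
import struct

def pack_argb8888(a, r, g, b):
    """Pack four 8-bit channels into one 32-bit ARGB8888 pixel value."""
    for channel in (a, r, g, b):
        if not 0 <= channel <= 0xFF:
            raise ValueError("each channel must fit in 8 bits")
    # Alpha occupies the most significant byte, blue the least significant.
    return (a << 24) | (r << 16) | (g << 8) | b

def argb8888_memory_bytes(pixel):
    """Return the four bytes of a pixel as stored in little-endian memory."""
    return struct.pack("<I", pixel)
```

So a pixel with A=0xFF, R=0x11, G=0x22, B=0x33 is the value 0xFF112233, and reading memory byte by byte gives B, G, R, A.
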
> +Format modifiers describe a translation mechanism between these per-pixel memory
> +samples, and the actual memory storage for the buffer. The most straightforward
> +modifier is `DRM_FORMAT_MOD_LINEAR`, describing a scheme in which each pixel has
> +contiguous storage beginning at (0,0); each pixel's location in memory will be
> +`base + (y * stride) + (x * bpp)`, where `bpp` is bytes per pixel. This is
> +considered the baseline interchange format, and most convenient for CPU access.
> +
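
For what it's worth, the linear formula could be illustrated like this. A minimal sketch, assuming `bpp` in the text means bytes (not bits) per pixel; the function and parameter names are hypothetical:

```python
def linear_pixel_offset(x, y, stride, bytes_per_pixel, plane_offset=0):
    """Byte offset of pixel (x, y) from the start of the memory buffer."""
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    if (x + 1) * bytes_per_pixel > stride:
        raise ValueError("x lies beyond the end of the scanline")
    # stride is the byte distance between the starts of two scanlines.
    return plane_offset + y * stride + x * bytes_per_pixel
```

With a 4096-byte stride and 4 bytes per pixel, pixel (10, 2) sits at 2 * 4096 + 10 * 4 bytes from the base.
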
> +Modern hardware employs much more sophisticated access mechanisms, typically
> +making use of tiled access and possibly also compression. For example, the
> +`DRM_FORMAT_MOD_VIVANTE_TILED` modifier describes memory storage where pixels
> +are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in
> +memory stores pixels (0,0) to (3,3) inclusive, and the second tile in memory
> +stores pixels (4,0) to (7,3) inclusive.
> +
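
The tiled case is easier to follow with the arithmetic written out. The sketch below only models what the paragraph says (4x4 tiles stored in row-major order). That pixels inside a tile are also row-major is my assumption, as is requiring the width to be a multiple of 4; names are hypothetical:

```python
TILE_W = TILE_H = 4  # 4x4-pixel tiles, per the text above

def tiled_pixel_offset(x, y, width, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a plane made of 4x4 tiles."""
    if width % TILE_W:
        raise ValueError("width must be a multiple of the tile width")
    tiles_per_row = width // TILE_W
    # Which tile the pixel falls in, counting tiles in row-major order.
    tile_index = (y // TILE_H) * tiles_per_row + (x // TILE_W)
    # Position inside that tile (assumed row-major as well).
    index_in_tile = (y % TILE_H) * TILE_W + (x % TILE_W)
    pixel_index = tile_index * TILE_W * TILE_H + index_in_tile
    return pixel_index * bytes_per_pixel
```

For an 8-pixel-wide, 4-byte-per-pixel plane, (3,3) is the last pixel of the first tile and (4,0) is the first pixel of the second tile.
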
> +Some modifiers may modify the number of memory buffers required to store the
> +data; for example, the `I915_FORMAT_MOD_Y_TILED_CCS` modifier adds a second
> +memory buffer to RGB formats in which it stores data about the status of every
> +tile, notably including whether the tile is fully populated with pixel data, or
> +can be expanded from a single solid color.
> +
> +These extended layouts are highly vendor-specific, and even specific to
> +particular generations or configurations of devices per-vendor. For this reason,
> +support of modifiers must be explicitly enumerated and negotiated by all users
> +in order to ensure a compatible and optimal pipeline, as discussed below.
> +
> +
> +Dimensions and size
> +===================
> +
> +Each pixel buffer must be accompanied by logical pixel dimensions. This refers
> +to the number of unique samples which can be extracted from, or stored to, the
> +underlying memory storage. For example, even though a 1920x1080
> +`DRM_FORMAT_NV12` buffer has a luma plane containing 1920x1080 samples for the Y
> +component, and 960x540 samples for the U and V components, the overall buffer is
> +still described as having dimensions of 1920x1080.
> +
> +The in-memory storage of a buffer is not guaranteed to begin immediately at the
> +base address of the underlying memory, nor is it guaranteed that the memory
> +storage is tightly clipped to either dimension.
> +
> +Each plane must therefore be described with an `offset` in bytes, which will be
> +added to the base address of the memory storage before performing any per-pixel
> +calculations. This may be used to combine multiple planes into a single pixel
> +buffer; for example, `DRM_FORMAT_NV12` may be stored in a single memory buffer
> +where the luma plane's storage begins immediately at the start of the buffer
> +with an offset of 0, and the chroma plane's storage begins after the end of
> +the luma plane's storage, as expressed through its own offset.
> +
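
The NV12 single-buffer case could be spelled out with numbers. A rough sketch, assuming a tightly packed linear layout with even dimensions and no padding (real allocators may well pad); the dict keys and function name are mine:

```python
def nv12_single_buffer_layout(width, height):
    """Plane offsets and strides for tightly packed linear NV12."""
    if width % 2 or height % 2:
        raise ValueError("this sketch only handles even dimensions")
    luma_stride = width                # one byte of Y per pixel
    luma_size = luma_stride * height
    chroma_stride = (width // 2) * 2   # one U byte + one V byte per 2x2 block
    chroma_size = chroma_stride * (height // 2)
    return {
        "planes": [
            {"offset": 0, "stride": luma_stride},
            {"offset": luma_size, "stride": chroma_stride},
        ],
        "size": luma_size + chroma_size,
    }
```

For 1920x1080 this puts the chroma plane at byte 2073600 of a 3110400-byte buffer, while the buffer is still described as 1920x1080.
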
> +Each plane must also have a `stride` in bytes, expressing the offset in memory
> +between two contiguous scanlines. For example, a `DRM_FORMAT_MOD_LINEAR` buffer
> +with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in
> +order to allow for aligned access patterns. In this case, the buffer will still
> +be described with a width of 1000, however the stride will be `1024 * bpp`,
> +indicating that there are 24 pixels at the positive extreme of the x axis whose
> +values are not significant.
> +
> +Buffers may also be padded further in the y dimension, simply by allocating a
> +larger area than would ordinarily be required. For example, many media decoders
> +are not able to natively output buffers of height 1080, but instead require an
> +effective height of 1088 pixels. In this case, the buffer continues to be
> +described as having a height of 1080, with the memory allocation for each buffer
> +being increased to account for the extra padding.
> +
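
The two padding examples (1000 -> 1024 wide, 1080 -> 1088 high) could be shown as one calculation. A minimal sketch for a linear single-plane buffer; the alignment values are hypothetical hardware constraints, not taken from any driver:

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment

def padded_allocation(width, height, bytes_per_pixel, width_align, height_align):
    """Stride and allocation size; width/height alignments are in pixels."""
    stride = align_up(width, width_align) * bytes_per_pixel
    rows = align_up(height, height_align)
    # The logical width and height are reported unchanged; only the
    # stride and the total size reflect the padding.
    return {"width": width, "height": height,
            "stride": stride, "size": stride * rows}
```

Note that the returned width and height never change: padding only shows up in `stride` and `size`.
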
> +
> +Enumeration
> +===========
> +
> +Every user of pixel buffers must be able to enumerate a set of supported formats
> +and modifiers, described together. Within KMS, this is achieved with the
> +`IN_FORMATS` property on each DRM plane, listing the supported DRM formats, and
> +the modifiers supported for each format. In userspace, this is supported through
> +the `EGL_EXT_image_dma_buf_import_modifiers` extension entrypoints for EGL, the
> +`VK_EXT_image_drm_format_modifier` extension for Vulkan, and the
> +`zwp_linux_dmabuf_v1` extension for Wayland.
> +
> +Each of these interfaces allows users to query a set of supported
> +format+modifier combinations.
> +
> +
> +Negotiation
> +===========
> +
> +It is the responsibility of userspace to negotiate an acceptable format+modifier
> +combination for its usage. This is performed through a simple intersection of
> +lists. For example, if a user wants to use Vulkan to render an image to be
> +displayed on a KMS plane, it must:
> +  - query KMS for the `IN_FORMATS` property for the given plane
> +  - query Vulkan for the supported formats for its physical device
> +  - intersect these formats to determine the most appropriate one
> +  - for this format, intersect the lists of supported modifiers for both KMS and
> +    Vulkan, to obtain a final list of acceptable modifiers for that format
> +
> +This intersection must be performed for all usages. For example, if the user
> +also wishes to encode the image to a video stream, it must query the media API
> +it intends to use for encoding for the set of modifiers it supports, and
> +additionally intersect against this list.
> +
> +If the intersection of all lists is an empty list, it is not possible to share
> +buffers in this way, and an alternate strategy must be considered (e.g. using
> +CPU access routines to copy data between the different uses, with the
> +corresponding performance cost).
> +
> +The resulting modifier list is unsorted; the order is not significant.
> +
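
Since negotiation is "just" an intersection of lists, a short example might make that concrete. This is a sketch only: each user is modelled as a dict mapping a format name to a set of modifier names, and the names used are placeholders rather than real query results:

```python
def negotiate(*users):
    """Intersect {format: set_of_modifiers} maps from every buffer user.

    Returns the format+modifier combinations acceptable to all users.
    Formats with no common modifier are dropped.
    """
    if not users:
        return {}
    common = {fmt: set(mods) for fmt, mods in users[0].items()}
    for user in users[1:]:
        common = {
            fmt: mods & set(user[fmt])
            for fmt, mods in common.items()
            if fmt in user
        }
    return {fmt: mods for fmt, mods in common.items() if mods}
```

An empty result is the "alternate strategy" case from the text. The result uses sets, which matches the point that the order is not significant.
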
> +
> +Allocation
> +==========
> +
> +Once userspace has determined an appropriate format, and corresponding list of
> +acceptable modifiers, it must allocate the buffer. As there is no universal
> +buffer-allocation interface available at either kernel or userspace level, the
> +client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or
> +a media API.
> +
> +Each allocation request must take, at a minimum: the pixel format, a list of
> +acceptable modifiers, and the buffer's width and height. Each API may extend
> +this set of properties in different ways, such as allowing allocation in more
> +than two dimensions, intended usage patterns, etc.
> +
> +The component which allocates the buffer will make an arbitrary choice of what
> +it considers the 'best' modifier within the acceptable list for the requested
> +allocation, any padding required, and further properties of the underlying
> +memory buffers such as whether they are stored in system or device-specific
> +memory, whether or not they are physically contiguous, and their cache mode.
> +These properties of the memory buffer are not visible to userspace; however, the
> +`dma-heaps` API is an effort to address this.
> +
> +After allocation, the client must query the allocator to determine the actual
> +modifier selected for the buffer, as well as the per-plane offset and stride.
> +Allocators are not permitted to vary the format in use, to select a modifier not
> +provided within the acceptable list, nor to vary the pixel dimensions other than
> +the padding expressed through offset, stride, and size.
> +
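
The "allocators are not permitted to ..." rules read like a checklist a client could verify after querying the allocator. A hedged sketch of such a check; the request and result are plain dicts with keys I made up, not any real API structure:

```python
def check_allocation(request, result):
    """Raise ValueError if the allocator's answer breaks the rules above.

    Returns the modifier the allocator actually selected.
    """
    if result["format"] != request["format"]:
        raise ValueError("allocator changed the format")
    if result["modifier"] not in request["modifiers"]:
        raise ValueError("modifier is outside the acceptable list")
    if (result["width"], result["height"]) != (request["width"], request["height"]):
        raise ValueError("allocator changed the pixel dimensions")
    # Offset and stride are deliberately not checked: padding is allowed.
    return result["modifier"]
```

Padding expressed through offset, stride, and size is left alone here, since the text explicitly permits it.
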
> +
> +Import
> +======
> +
> +To use a buffer within a different context, device, or subsystem, the user
> +passes these parameters (format, modifier, width, height, and per-plane offset
> +and stride) to an importing API.
> +
> +Each memory plane is referred to by a buffer handle, which may be unique or
> +duplicated within a buffer. For example, a `DRM_FORMAT_NV12` buffer may have the
> +luma and chroma buffers combined into a single memory buffer by use of the
> +per-plane offset parameters, or they may be completely separate allocations in
> +memory. For this reason, each import and allocation API must provide a separate
> +handle for each plane.
> +
> +Each kernel subsystem has its own types and interfaces for buffer management.
> +DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types
> +are not portable between contexts, processes, devices, or subsystems.
> +
> +To address this, `dma-buf` handles are used as the universal interchange for
> +buffers. Subsystem-specific operations are used to export native buffer handles
> +to a `dma-buf` file descriptor, and to import those file descriptors into a
> +native buffer handle. dma-buf file descriptors can be transferred between
> +contexts, processes, devices, and subsystems.
> +
> +For example, a Wayland media player may use V4L2 to decode a video frame into
> +a `DRM_FORMAT_NV12` buffer. This will result in two memory planes (luma and
> +chroma) being dequeued by the user from V4L2. These planes are then exported to
> +one dma-buf file descriptor per plane; these descriptors are then sent along
> +with the metadata (format, modifier, width, height, per-plane offset and stride)
> +to the Wayland server. The Wayland server will then import these file
> +descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use
> +through Vulkan, or a `drm_fb` for use through KMS; each of these import
> +operations will take the same metadata and convert the dma-buf file descriptors
> +into their native buffer handles.
> +
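
It might be worth showing the metadata that travels with the file descriptors as a small data structure. The sketch below is my own modelling of the list in the text (format, modifier, width, height, per-plane offset and stride, one handle per plane); the class and field names are hypothetical:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Plane:
    fd: int      # dma-buf file descriptor backing this plane
    offset: int  # bytes from the start of that dma-buf
    stride: int  # bytes between scanlines

@dataclass
class BufferDescription:
    format: str
    modifier: int
    width: int
    height: int
    planes: List[Plane]

    def uses_one_fd(self):
        """True if every plane carries the same descriptor number.

        Only a heuristic: different descriptor numbers can still refer
        to the same underlying dma-buf.
        """
        return len({p.fd for p in self.planes}) == 1
```

Both the "combined" and the "separate allocations" NV12 cases fit the same structure, which is why one handle per plane is always passed.
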
> +
> +Implicit modifiers
> +==================
> +
> +The concept of modifiers post-dates all of the subsystems mentioned above. As
> +such, it has been retrofitted into all of these APIs, and in order to ensure
> +backwards compatibility, support is needed for drivers and userspace which do
> +not (yet) support modifiers.
> +
> +As an example, GBM is used to allocate buffers to be shared between EGL for
> +rendering and KMS for display. It has two entrypoints for allocating buffers:
> +`gbm_bo_create` which only takes the format, width, height, and a usage token,
> +and `gbm_bo_create_with_modifiers` which extends this with a list of modifiers.
> +
> +In the latter case, the allocation is as discussed above, being provided with a
> +list of acceptable modifiers that the implementation can choose from (or fail if
> +it is not possible to allocate within those constraints). In the former case
> +where modifiers are not provided, the GBM implementation must make its own
> +choice as to what is likely to be the 'best' layout. Such a choice is entirely
> +implementation-specific: some will internally use tiled layouts which are not
> +CPU-accessible if the implementation decides that is a good idea through
> +whatever heuristic. It is the implementation's responsibility to ensure that
> +this choice is appropriate.
> +
> +To support this case where the layout is not known because there is no awareness
> +of modifiers, a special `DRM_FORMAT_MOD_INVALID` token has been defined. This
> +pseudo-modifier declares that the layout is not known, and that the driver
> +should use its own logic to determine what the underlying layout may be.
> +
> +There are four cases where this token may be used:
> +  - during enumeration, an interface may return `DRM_FORMAT_MOD_INVALID`, either
> +    as the sole member of a modifier list to declare that explicit modifiers are
> +    not supported, or as part of a larger list to declare that implicit modifiers
> +    may be used
> +  - during allocation, a user may supply `DRM_FORMAT_MOD_INVALID`, either as the
> +    sole member of a modifier list (equivalent to not supplying a modifier list
> +    at all) to declare that explicit modifiers are not supported and must not be
> +    used, or as part of a larger list to declare that an allocation using implicit
> +    modifiers is acceptable
> +  - in a post-allocation query, an implementation may return
> +    `DRM_FORMAT_MOD_INVALID` as the modifier of the allocated buffer to declare
> +    that the underlying layout is implementation-defined and that an explicit
> +    modifier description is not available; per the above rules, this may only be
> +    returned when the user has included `DRM_FORMAT_MOD_INVALID` as part of the
> +    list of acceptable modifiers, or not provided a list
> +  - when importing a buffer, the user may supply `DRM_FORMAT_MOD_INVALID` as the
> +    buffer modifier (or not supply a modifier) to indicate that the modifier is
> +    unknown for whatever reason; this is only acceptable when the buffer has
> +    not been allocated with an explicit modifier
> +
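
The post-allocation case in particular might be clearer as a predicate. A sketch under my reading of the bullets above; the constant values are the ones in drm_fourcc.h, everything else (function name, `None` meaning "no list supplied") is an assumption of this example:

```python
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF  # value from drm_fourcc.h

def allocator_may_return(modifier, acceptable):
    """May the allocator report `modifier` for this allocation request?

    `acceptable` is the caller's modifier list, or None if the caller
    did not supply a list at all.
    """
    if acceptable is None:
        # No list given: the layout is the implementation's choice, and
        # it may or may not be able to name it explicitly.
        return True
    # With a list, the answer must come from that list; this also covers
    # INVALID, which is only allowed if the caller listed it.
    return modifier in acceptable
```

So `allocator_may_return(DRM_FORMAT_MOD_INVALID, [DRM_FORMAT_MOD_LINEAR])` is false, while adding `DRM_FORMAT_MOD_INVALID` to the list makes it true.
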
> +It follows from this that a buffer chain must be either fully implicit or fully
> +explicit. For example, if a user wishes to allocate a buffer for use between
> +GPU, display, and media, but the media API does not support modifiers, then the
> +user **must not** allocate the buffer with explicit modifiers and attempt to
> +import the buffer into the media API with no modifier, but either perform the
> +allocation using implicit modifiers, or allocate the buffer for media use
> +separately and copy between the two buffers.
> +
> +As one exception to the above, allocations may be 'upgraded' from implicit
> +to explicit modifiers. For example, if the buffer is allocated with
> +`gbm_bo_create` (taking no modifiers), the user may then query the modifier with
> +`gbm_bo_get_modifier` and then use this modifier as an explicit modifier token
> +if a valid modifier is returned.
> +
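
The 'upgrade' exception could be illustrated in a couple of lines. This sketch does not call GBM; `queried_modifier` stands in for whatever `gbm_bo_get_modifier` returned, and the helper name is hypothetical:

```python
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF  # value from drm_fourcc.h

def upgrade_to_explicit(queried_modifier):
    """Return an explicit modifier to pass on, or None to stay implicit."""
    if queried_modifier == DRM_FORMAT_MOD_INVALID:
        return None
    return queried_modifier
```

Callers should compare the result against `None` rather than test its truthiness, since `DRM_FORMAT_MOD_LINEAR` is 0 and is a perfectly valid explicit modifier.
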
> +When allocating buffers for exchange between different users and modifiers are
> +not available, implementations are strongly encouraged to use
> +`DRM_FORMAT_MOD_LINEAR` for their allocation, as this is the universal baseline
> +for exchange.
> +
> +Any new users - userspace programs and protocols, kernel subsystems, etc -
> +wishing to exchange buffers must offer interoperability through dma-buf file
> +descriptors for memory planes, DRM format tokens to describe the format, DRM
> +format modifiers to describe the layout in memory, at least width and height for
> +dimensions, and at least offset and stride for each memory plane.
> diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst
> index b9c1214d8f23..cb12f2654ed7 100644
> --- a/Documentation/gpu/index.rst
> +++ b/Documentation/gpu/index.rst
> @@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
>      drm-kms
>      drm-kms-helpers
>      drm-uapi
> +   exchanging-pixel-buffers
>      driver-uapi
>      drm-client
>      drivers
> 
