Direct userspace dma-buf mmap (v6)

* Direct userspace dma-buf mmap (v6)
@ 2015-12-16 22:25 Tiago Vignatti
  2015-12-16 22:25 ` [PATCH v6 1/5] drm: prime: Honour O_RDWR during prime-handle-to-fd Tiago Vignatti
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

Hi all,

The last version of this work was sent a while ago here:

http://lists.freedesktop.org/archives/dri-devel/2015-August/089263.html

So let's recap this series:

    1. it adds a vendor-independent client interface for mapping gem objects
       through prime, IOW it implements userspace mmap() on dma-buf fd.
       This could be used for texturing from CPU rendered buffer, passing
       buffers among processes without performing copies in the userspace.
    2. the series lets the client write on the mmap'ed memory, and
    3. it deals with GPU and CPU caches synchronization.

Based on previous discussions seems that people are fine with 1. and 2. but 
not really with 3., given that caches coherency is a bit more boring to deal 
with.

It's easier to use this new infra on "coherent hardware" (systems with the
memory cache that is shared by the GPU and CPU) because they rarely need to
use that kind of synchronization. But would be much more convenient to have 
the very same interface exposed for clients no matter whether the underlying 
hardware is cache coherent or not.

One idea that came up was to force clients to call the sync ioctls after the
dma-buf was mmaped. But apparently there's no easy, and performant, way to do
so cause seems too costly to go over the page table entry and check the dirty
bits. Also, depending on the instructions order sent for the devices, it
might be needed a sync call after the mapped region gets accessed as well, to
flush all cachelines and make sure for example the GPU domain won't read stale 
data. So that would make the things even more complicated, if we ever decide
to go to this direction of forcing sync ioctls. The alternative therefore is to
simply document it very well, strong wording the clients to use the sync ioctl 
regardless otherwise they will mis-behave. Do we have objections or maybe 
other wiser ways to circumvent this? I've made similar comments in August and
no one has came up with better ideas.

Lastly, the diff of v6 series is that I've basically addressed concerns
pointed in the igt tests, organized those changes better a bit (in smaller
patches), documented the usage of sync ioctls and I have extensively tested 
this in different types of hardware.

https://github.com/tiagovignatti/drm-intel/commits/drm-intel-nightly_dma-buf-mmap-v6
https://github.com/tiagovignatti/intel-gpu-tools/commits/dma-buf-mmap-v6

Tiago

Daniel Thompson (1):
  drm: prime: Honour O_RDWR during prime-handle-to-fd

Daniel Vetter (1):
  dma-buf: Add ioctls to allow userspace to flush

Tiago Vignatti (3):
  dma-buf: Remove range-based flush
  drm/i915: Implement end_cpu_access
  drm/i915: Use CPU mapping for userspace dma-buf mmap()

 Documentation/dma-buf-sharing.txt         | 41 +++++++++++++++-------
 drivers/dma-buf/dma-buf.c                 | 56 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/drm_prime.c               | 10 ++----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c    | 42 +++++++++++++++++++++--
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 +--
 drivers/gpu/drm/udl/udl_fb.c              |  2 --
 drivers/staging/android/ion/ion.c         |  6 ++--
 drivers/staging/android/ion/ion_test.c    |  4 +--
 include/linux/dma-buf.h                   | 12 +++----
 include/uapi/drm/drm.h                    |  1 +
 include/uapi/linux/dma-buf.h              | 38 +++++++++++++++++++++
 11 files changed, 169 insertions(+), 47 deletions(-)
 create mode 100644 include/uapi/linux/dma-buf.h

And the igt changes:
Rob Bradford (1):
  prime_mmap: Add new test for calling mmap() on dma-buf fds

Tiago Vignatti (5):
  lib: Add gem_userptr and __gem_userptr helpers
  prime_mmap: Add basic tests to write in a bo using CPU
  lib: Add prime_sync_start and prime_sync_end helpers
  tests: Add kms_mmap_write_crc for cache coherency tests
  tests: Add prime_mmap_coherency for cache coherency tests

 benchmarks/gem_userptr_benchmark.c |  55 +----
 lib/ioctl_wrappers.c               |  92 +++++++
 lib/ioctl_wrappers.h               |  32 +++
 tests/Makefile.sources             |   3 +
 tests/gem_userptr_blits.c          | 104 ++------
 tests/kms_mmap_write_crc.c         | 281 +++++++++++++++++++++
 tests/prime_mmap.c                 | 494 +++++++++++++++++++++++++++++++++++++
 tests/prime_mmap_coherency.c       | 246 ++++++++++++++++++
 8 files changed, 1180 insertions(+), 127 deletions(-)
 create mode 100644 tests/kms_mmap_write_crc.c
 create mode 100644 tests/prime_mmap.c
 create mode 100644 tests/prime_mmap_coherency.c

-- 
2.1.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread