* Direct userspace dma-buf mmap (v6)
@ 2015-12-16 22:25 Tiago Vignatti
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

Hi all,

The last version of this work was sent a while ago here:

http://lists.freedesktop.org/archives/dri-devel/2015-August/089263.html

So let's recap this series:

    1. it adds a vendor-independent client interface for mapping gem objects
       through prime, IOW it implements userspace mmap() on a dma-buf fd.
       This can be used e.g. for texturing from a CPU-rendered buffer, or for
       passing buffers among processes without performing copies in userspace.
    2. the series lets the client write to the mmap'ed memory, and
    3. it deals with GPU and CPU cache synchronization.

Based on previous discussions, it seems that people are fine with 1. and 2.
but not really with 3., given that cache coherency is a bit more tedious to
deal with.

It's easier to use this new infrastructure on "coherent hardware" (systems
where the memory cache is shared by the GPU and CPU), because such systems
rarely need that kind of synchronization. But it would be much more
convenient to expose the very same interface to clients regardless of
whether the underlying hardware is cache coherent or not.

One idea that came up was to force clients to call the sync ioctls after the
dma-buf is mmap'ed. But apparently there's no easy and performant way to do
so, because walking the page table entries and checking the dirty bits seems
too costly. Also, depending on the order of the instructions sent to the
devices, a sync call might be needed after the mapped region gets accessed as
well, to flush all cachelines and make sure, for example, that the GPU domain
won't read stale data. So that would make things even more complicated, if we
ever decide to go in the direction of forcing the sync ioctls. The
alternative therefore is to simply document it very well, strongly wording
that clients must use the sync ioctls, otherwise they will misbehave. Do we
have objections, or maybe other wiser ways to circumvent this? I made similar
comments in August and no one has come up with better ideas.
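
To make the intended usage concrete, here is a minimal userspace sketch of
the whole flow, assuming this series is applied. fd is an open DRM device
fd, handle an existing GEM handle, size the object size; error handling is
omitted:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <drm/drm.h>         /* DRM_IOCTL_PRIME_HANDLE_TO_FD, DRM_RDWR */
    #include <linux/dma-buf.h>   /* DMA_BUF_IOCTL_SYNC, added by this series */

    static void *map_and_write(int fd, uint32_t handle, size_t size)
    {
        /* export the GEM handle as a writable dma-buf fd (patch 1) */
        struct drm_prime_handle prime = {
            .handle = handle,
            .flags = DRM_CLOEXEC | DRM_RDWR,
        };
        struct dma_buf_sync sync = { 0 };
        void *ptr;

        ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &prime);

        /* plain mmap on the dma-buf fd, no vendor-specific interface */
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                   prime.fd, 0);

        /* bracket each CPU access cycle with the sync ioctl (patch 3) */
        sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
        ioctl(prime.fd, DMA_BUF_IOCTL_SYNC, &sync);

        memset(ptr, 0, size);               /* CPU writes */

        sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
        ioctl(prime.fd, DMA_BUF_IOCTL_SYNC, &sync);

        return ptr;     /* munmap() once the buffer is no longer needed */
    }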

Lastly, what differs in the v6 series is that I've addressed the concerns
pointed out in the igt tests, organized those changes a bit better (in
smaller patches), documented the usage of the sync ioctls, and extensively
tested this on different types of hardware.

https://github.com/tiagovignatti/drm-intel/commits/drm-intel-nightly_dma-buf-mmap-v6
https://github.com/tiagovignatti/intel-gpu-tools/commits/dma-buf-mmap-v6

Tiago


Daniel Thompson (1):
  drm: prime: Honour O_RDWR during prime-handle-to-fd

Daniel Vetter (1):
  dma-buf: Add ioctls to allow userspace to flush

Tiago Vignatti (3):
  dma-buf: Remove range-based flush
  drm/i915: Implement end_cpu_access
  drm/i915: Use CPU mapping for userspace dma-buf mmap()

 Documentation/dma-buf-sharing.txt         | 41 +++++++++++++++-------
 drivers/dma-buf/dma-buf.c                 | 56 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/drm_prime.c               | 10 ++----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c    | 42 +++++++++++++++++++++--
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 +--
 drivers/gpu/drm/udl/udl_fb.c              |  2 --
 drivers/staging/android/ion/ion.c         |  6 ++--
 drivers/staging/android/ion/ion_test.c    |  4 +--
 include/linux/dma-buf.h                   | 12 +++----
 include/uapi/drm/drm.h                    |  1 +
 include/uapi/linux/dma-buf.h              | 38 +++++++++++++++++++++
 11 files changed, 169 insertions(+), 47 deletions(-)
 create mode 100644 include/uapi/linux/dma-buf.h


And the igt changes:
Rob Bradford (1):
  prime_mmap: Add new test for calling mmap() on dma-buf fds

Tiago Vignatti (5):
  lib: Add gem_userptr and __gem_userptr helpers
  prime_mmap: Add basic tests to write in a bo using CPU
  lib: Add prime_sync_start and prime_sync_end helpers
  tests: Add kms_mmap_write_crc for cache coherency tests
  tests: Add prime_mmap_coherency for cache coherency tests

 benchmarks/gem_userptr_benchmark.c |  55 +----
 lib/ioctl_wrappers.c               |  92 +++++++
 lib/ioctl_wrappers.h               |  32 +++
 tests/Makefile.sources             |   3 +
 tests/gem_userptr_blits.c          | 104 ++------
 tests/kms_mmap_write_crc.c         | 281 +++++++++++++++++++++
 tests/prime_mmap.c                 | 494 +++++++++++++++++++++++++++++++++++++
 tests/prime_mmap_coherency.c       | 246 ++++++++++++++++++
 8 files changed, 1180 insertions(+), 127 deletions(-)
 create mode 100644 tests/kms_mmap_write_crc.c
 create mode 100644 tests/prime_mmap.c
 create mode 100644 tests/prime_mmap_coherency.c

-- 
2.1.4


* [PATCH v6 1/5] drm: prime: Honour O_RDWR during prime-handle-to-fd
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

From: Daniel Thompson <daniel.thompson@linaro.org>

Currently DRM_IOCTL_PRIME_HANDLE_TO_FD rejects all flags except
(DRM|O)_CLOEXEC making it difficult (maybe impossible) for userspace
to mmap() the resulting dma-buf even when this is supported by the
DRM driver.

It is trivial to relax the restriction and permit read/write access.
This is safe because the flags are seldom touched by drm; mostly they
are passed verbatim to dma_buf calls.
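
For illustration, this is all a client needs in order to get a writable
dma-buf fd (a sketch; fd is an open DRM fd and handle an existing GEM
handle):

    struct drm_prime_handle args = {
        .handle = handle,
        .flags = DRM_CLOEXEC | DRM_RDWR,    /* -EINVAL before this patch */
    };

    if (ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args) == 0) {
        /* args.fd can now be mmap'ed with PROT_WRITE, provided the
         * driver supports dma-buf mmap */
    }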

v3 (Tiago): removed unused flags variable from drm_prime_handle_to_fd_ioctl.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 drivers/gpu/drm/drm_prime.c | 10 +++-------
 include/uapi/drm/drm.h      |  1 +
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 27aa718..df6cdc7 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -329,7 +329,7 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
  * drm_gem_prime_export - helper library implementation of the export callback
  * @dev: drm_device to export from
  * @obj: GEM object to export
- * @flags: flags like DRM_CLOEXEC
+ * @flags: flags like DRM_CLOEXEC and DRM_RDWR
  *
  * This is the implementation of the gem_prime_export functions for GEM drivers
  * using the PRIME helpers.
@@ -628,7 +628,6 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
 				 struct drm_file *file_priv)
 {
 	struct drm_prime_handle *args = data;
-	uint32_t flags;
 
 	if (!drm_core_check_feature(dev, DRIVER_PRIME))
 		return -EINVAL;
@@ -637,14 +636,11 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
 		return -ENOSYS;
 
 	/* check flags are valid */
-	if (args->flags & ~DRM_CLOEXEC)
+	if (args->flags & ~(DRM_CLOEXEC | DRM_RDWR))
 		return -EINVAL;
 
-	/* we only want to pass DRM_CLOEXEC which is == O_CLOEXEC */
-	flags = args->flags & DRM_CLOEXEC;
-
 	return dev->driver->prime_handle_to_fd(dev, file_priv,
-			args->handle, flags, &args->fd);
+			args->handle, args->flags, &args->fd);
 }
 
 int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index b4e92eb..a0ebfe7 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -669,6 +669,7 @@ struct drm_set_client_cap {
 	__u64 value;
 };
 
+#define DRM_RDWR O_RDWR
 #define DRM_CLOEXEC O_CLOEXEC
 struct drm_prime_handle {
 	__u32 handle;
-- 
2.1.4


* [PATCH v6 2/5] dma-buf: Remove range-based flush
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel
  Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse, Daniel Vetter

This patch removes the range-based information used for optimizations in
begin_cpu_access and end_cpu_access.

We don't have any user or implementation using the range-based flush. There
seems to be a consensus that, if we ever want something like that again (or
something even more expressive, using 2D/3D sub-range regions), we can use
the upcoming dma-buf sync ioctl for it.
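
For reference, kernel-side CPU access with the reduced signature now looks
like this (a hypothetical importer sketch, not part of this patch; coherency
is simply guaranteed for the whole buffer):

    ret = dma_buf_begin_cpu_access(dmabuf, DMA_FROM_DEVICE);
    if (ret)
        return ret;

    /* ... kmap and read the buffer contents from the CPU ... */

    dma_buf_end_cpu_access(dmabuf, DMA_FROM_DEVICE);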

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 Documentation/dma-buf-sharing.txt         | 19 ++++++++-----------
 drivers/dma-buf/dma-buf.c                 | 13 ++++---------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c    |  2 +-
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 ++--
 drivers/gpu/drm/udl/udl_fb.c              |  2 --
 drivers/staging/android/ion/ion.c         |  6 ++----
 drivers/staging/android/ion/ion_test.c    |  4 ++--
 include/linux/dma-buf.h                   | 12 +++++-------
 8 files changed, 24 insertions(+), 38 deletions(-)

diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 480c8de..4f4a84b 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -257,17 +257,15 @@ Access to a dma_buf from the kernel context involves three steps:
 
    Interface:
       int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-				   size_t start, size_t len,
 				   enum dma_data_direction direction)
 
    This allows the exporter to ensure that the memory is actually available for
    cpu access - the exporter might need to allocate or swap-in and pin the
    backing storage. The exporter also needs to ensure that cpu access is
-   coherent for the given range and access direction. The range and access
-   direction can be used by the exporter to optimize the cache flushing, i.e.
-   access outside of the range or with a different direction (read instead of
-   write) might return stale or even bogus data (e.g. when the exporter needs to
-   copy the data to temporary storage).
+   coherent for the access direction. The direction can be used by the exporter
+   to optimize the cache flushing, i.e. access with a different direction (read
+   instead of write) might return stale or even bogus data (e.g. when the
+   exporter needs to copy the data to temporary storage).
 
    This step might fail, e.g. in oom conditions.
 
@@ -322,14 +320,13 @@ Access to a dma_buf from the kernel context involves three steps:
 
 3. Finish access
 
-   When the importer is done accessing the range specified in begin_cpu_access,
-   it needs to announce this to the exporter (to facilitate cache flushing and
-   unpinning of any pinned resources). The result of any dma_buf kmap calls
-   after end_cpu_access is undefined.
+   When the importer is done accessing the CPU, it needs to announce this to
+   the exporter (to facilitate cache flushing and unpinning of any pinned
+   resources). The result of any dma_buf kmap calls after end_cpu_access is
+   undefined.
 
    Interface:
       void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
-				  size_t start, size_t len,
 				  enum dma_data_direction dir);
 
 
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 155c146..b2ac13b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -539,13 +539,11 @@ EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
  * preparations. Coherency is only guaranteed in the specified range for the
  * specified access direction.
  * @dmabuf:	[in]	buffer to prepare cpu access for.
- * @start:	[in]	start of range for cpu access.
- * @len:	[in]	length of range for cpu access.
  * @direction:	[in]	length of range for cpu access.
  *
  * Can return negative error values, returns 0 on success.
  */
-int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 			     enum dma_data_direction direction)
 {
 	int ret = 0;
@@ -554,8 +552,7 @@ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
 		return -EINVAL;
 
 	if (dmabuf->ops->begin_cpu_access)
-		ret = dmabuf->ops->begin_cpu_access(dmabuf, start,
-							len, direction);
+		ret = dmabuf->ops->begin_cpu_access(dmabuf, direction);
 
 	return ret;
 }
@@ -567,19 +564,17 @@ EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
  * actions. Coherency is only guaranteed in the specified range for the
  * specified access direction.
  * @dmabuf:	[in]	buffer to complete cpu access for.
- * @start:	[in]	start of range for cpu access.
- * @len:	[in]	length of range for cpu access.
  * @direction:	[in]	length of range for cpu access.
  *
  * This call must always succeed.
  */
-void dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+void dma_buf_end_cpu_access(struct dma_buf *dmabuf,
 			    enum dma_data_direction direction)
 {
 	WARN_ON(!dmabuf);
 
 	if (dmabuf->ops->end_cpu_access)
-		dmabuf->ops->end_cpu_access(dmabuf, start, len, direction);
+		dmabuf->ops->end_cpu_access(dmabuf, direction);
 }
 EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index e9c2bfd..65ab2bd 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -196,7 +196,7 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *
 	return -EINVAL;
 }
 
-static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t length, enum dma_data_direction direction)
+static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	struct drm_device *dev = obj->base.dev;
diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
index 27c2976..aebae1c 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
@@ -79,7 +79,7 @@ static void omap_gem_dmabuf_release(struct dma_buf *buffer)
 
 
 static int omap_gem_dmabuf_begin_cpu_access(struct dma_buf *buffer,
-		size_t start, size_t len, enum dma_data_direction dir)
+		enum dma_data_direction dir)
 {
 	struct drm_gem_object *obj = buffer->priv;
 	struct page **pages;
@@ -94,7 +94,7 @@ static int omap_gem_dmabuf_begin_cpu_access(struct dma_buf *buffer,
 }
 
 static void omap_gem_dmabuf_end_cpu_access(struct dma_buf *buffer,
-		size_t start, size_t len, enum dma_data_direction dir)
+		enum dma_data_direction dir)
 {
 	struct drm_gem_object *obj = buffer->priv;
 	omap_gem_put_pages(obj);
diff --git a/drivers/gpu/drm/udl/udl_fb.c b/drivers/gpu/drm/udl/udl_fb.c
index 200419d..c427499 100644
--- a/drivers/gpu/drm/udl/udl_fb.c
+++ b/drivers/gpu/drm/udl/udl_fb.c
@@ -409,7 +409,6 @@ static int udl_user_framebuffer_dirty(struct drm_framebuffer *fb,
 
 	if (ufb->obj->base.import_attach) {
 		ret = dma_buf_begin_cpu_access(ufb->obj->base.import_attach->dmabuf,
-					       0, ufb->obj->base.size,
 					       DMA_FROM_DEVICE);
 		if (ret)
 			goto unlock;
@@ -425,7 +424,6 @@ static int udl_user_framebuffer_dirty(struct drm_framebuffer *fb,
 
 	if (ufb->obj->base.import_attach) {
 		dma_buf_end_cpu_access(ufb->obj->base.import_attach->dmabuf,
-				       0, ufb->obj->base.size,
 				       DMA_FROM_DEVICE);
 	}
 
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index e237e9f..0754a37 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -1057,8 +1057,7 @@ static void ion_dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long offset,
 {
 }
 
-static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
-					size_t len,
+static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 					enum dma_data_direction direction)
 {
 	struct ion_buffer *buffer = dmabuf->priv;
@@ -1076,8 +1075,7 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
 	return PTR_ERR_OR_ZERO(vaddr);
 }
 
-static void ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start,
-				       size_t len,
+static void ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
 				       enum dma_data_direction direction)
 {
 	struct ion_buffer *buffer = dmabuf->priv;
diff --git a/drivers/staging/android/ion/ion_test.c b/drivers/staging/android/ion/ion_test.c
index b8dcf5a..da34bc12 100644
--- a/drivers/staging/android/ion/ion_test.c
+++ b/drivers/staging/android/ion/ion_test.c
@@ -109,7 +109,7 @@ static int ion_handle_test_kernel(struct dma_buf *dma_buf, void __user *ptr,
 	if (offset > dma_buf->size || size > dma_buf->size - offset)
 		return -EINVAL;
 
-	ret = dma_buf_begin_cpu_access(dma_buf, offset, size, dir);
+	ret = dma_buf_begin_cpu_access(dma_buf, dir);
 	if (ret)
 		return ret;
 
@@ -139,7 +139,7 @@ static int ion_handle_test_kernel(struct dma_buf *dma_buf, void __user *ptr,
 		copy_offset = 0;
 	}
 err:
-	dma_buf_end_cpu_access(dma_buf, offset, size, dir);
+	dma_buf_end_cpu_access(dma_buf, dir);
 	return ret;
 }
 
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index f98bd70..532108e 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -54,7 +54,7 @@ struct dma_buf_attachment;
  * @release: release this buffer; to be called after the last dma_buf_put.
  * @begin_cpu_access: [optional] called before cpu access to invalidate cpu
  * 		      caches and allocate backing storage (if not yet done)
- * 		      respectively pin the objet into memory.
+ * 		      respectively pin the object into memory.
  * @end_cpu_access: [optional] called after cpu access to flush caches.
  * @kmap_atomic: maps a page from the buffer into kernel address
  * 		 space, users may not block until the subsequent unmap call.
@@ -93,10 +93,8 @@ struct dma_buf_ops {
 	/* after final dma_buf_put() */
 	void (*release)(struct dma_buf *);
 
-	int (*begin_cpu_access)(struct dma_buf *, size_t, size_t,
-				enum dma_data_direction);
-	void (*end_cpu_access)(struct dma_buf *, size_t, size_t,
-			       enum dma_data_direction);
+	int (*begin_cpu_access)(struct dma_buf *, enum dma_data_direction);
+	void (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);
 	void *(*kmap_atomic)(struct dma_buf *, unsigned long);
 	void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *);
 	void *(*kmap)(struct dma_buf *, unsigned long);
@@ -224,9 +222,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 					enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
 				enum dma_data_direction);
-int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
 			     enum dma_data_direction dir);
-void dma_buf_end_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
 			    enum dma_data_direction dir);
 void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
 void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
-- 
2.1.4


* [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel
  Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse, Daniel Vetter

From: Daniel Vetter <daniel.vetter@ffwll.ch>

Userspace might need some sort of cache coherency management, e.g. when the
CPU and GPU domains are being accessed through dma-buf at the same time. To
circumvent this problem there are begin/end coherency markers that forward
directly to the existing dma-buf device drivers' vfunc hooks. Userspace can
make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
would be used like the following:
     - mmap the dma-buf fd
     - for each drawing/upload cycle on the CPU: 1. SYNC_START ioctl,
       2. read/write to the mmap area, 3. SYNC_END ioctl. This can be
       repeated as often as you want (with the new data being consumed by
       the GPU or, say, the scanout device)
     - munmap once you don't need the buffer any more
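
In code, one such drawing/upload cycle would look roughly like this (a
sketch against the uapi header added below; dmabuf_fd comes from the PRIME
export and ptr from the mmap step, while data/data_size are hypothetical):

    struct dma_buf_sync sync = { 0 };

    sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW;
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);    /* 1. SYNC_START */

    memcpy(ptr, data, data_size);                   /* 2. CPU read/write */

    sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);    /* 3. SYNC_END */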

v2 (Tiago): Fix header file type names (u64 -> __u64)
v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags in the begin/end
dma-buf functions. Check for overflows in start/length.
v4 (Tiago): use 2d regions for sync.
v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
remove range information from struct dma_buf_sync.
v6 (Tiago): use __u64 structured padded flags instead of an enum. Adjust
documentation about the recommendation on using the sync ioctls.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
 drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
 3 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/dma-buf.h

diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 4f4a84b..2ddd4b2 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
    handles, too). So it's beneficial to support this in a similar fashion on
    dma-buf to have a good transition path for existing Android userspace.
 
-   No special interfaces, userspace simply calls mmap on the dma-buf fd.
+   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
+   important to note though is that, even if it is not mandatory, the userspace
+   is strongly recommended to always use the cache synchronization ioctl
+   (DMA_BUF_IOCTL_SYNC) discussed next.
+
+   Some systems might need some sort of cache coherency management e.g. when
+   CPU and GPU domains are being accessed through dma-buf at the same time. To
+   circumvent this problem there are begin/end coherency markers, that forward
+   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
+   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
+   would be used like following:
+     - mmap dma-buf fd
+     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
+       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
+       want (with the new data being consumed by the GPU or say scanout device)
+     - munmap once you don't need the buffer any more
+
+    In principle systems with the memory cache shared by the GPU and CPU may
+    not need SYNC_START and SYNC_END but still, userspace is always encouraged
+    to use these ioctls before and after, respectively, when accessing the
+    mapped address.
 
 2. Supporting existing mmap interfaces in importers
 
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b2ac13b..9a298bd 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -34,6 +34,8 @@
 #include <linux/poll.h>
 #include <linux/reservation.h>
 
+#include <uapi/linux/dma-buf.h>
+
 static inline int is_dma_buf_file(struct file *);
 
 struct dma_buf_list {
@@ -251,11 +253,52 @@ out:
 	return events;
 }
 
+static long dma_buf_ioctl(struct file *file,
+			  unsigned int cmd, unsigned long arg)
+{
+	struct dma_buf *dmabuf;
+	struct dma_buf_sync sync;
+	enum dma_data_direction direction;
+
+	dmabuf = file->private_data;
+
+	if (!is_dma_buf_file(file))
+		return -EINVAL;
+
+	switch (cmd) {
+	case DMA_BUF_IOCTL_SYNC:
+		if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
+			return -EFAULT;
+
+		if (sync.flags & DMA_BUF_SYNC_RW)
+			direction = DMA_BIDIRECTIONAL;
+		else if (sync.flags & DMA_BUF_SYNC_READ)
+			direction = DMA_FROM_DEVICE;
+		else if (sync.flags & DMA_BUF_SYNC_WRITE)
+			direction = DMA_TO_DEVICE;
+		else
+			return -EINVAL;
+
+		if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
+			return -EINVAL;
+
+		if (sync.flags & DMA_BUF_SYNC_END)
+			dma_buf_end_cpu_access(dmabuf, direction);
+		else
+			dma_buf_begin_cpu_access(dmabuf, direction);
+
+		return 0;
+	default:
+		return -ENOTTY;
+	}
+}
+
 static const struct file_operations dma_buf_fops = {
 	.release	= dma_buf_release,
 	.mmap		= dma_buf_mmap_internal,
 	.llseek		= dma_buf_llseek,
 	.poll		= dma_buf_poll,
+	.unlocked_ioctl	= dma_buf_ioctl,
 };
 
 /*
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
new file mode 100644
index 0000000..bd195f2
--- /dev/null
+++ b/include/uapi/linux/dma-buf.h
@@ -0,0 +1,38 @@
+/*
+ * Framework for buffer objects that can be shared across devices/subsystems.
+ *
+ * Copyright(C) 2015 Intel Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _DMA_BUF_UAPI_H_
+#define _DMA_BUF_UAPI_H_
+
+/* begin/end dma-buf functions used for userspace mmap. */
+struct dma_buf_sync {
+	__u64 flags;
+};
+
+#define DMA_BUF_SYNC_READ      (1 << 0)
+#define DMA_BUF_SYNC_WRITE     (2 << 0)
+#define DMA_BUF_SYNC_RW        (3 << 0)
+#define DMA_BUF_SYNC_START     (0 << 2)
+#define DMA_BUF_SYNC_END       (1 << 2)
+#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
+	(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)
+
+#define DMA_BUF_BASE		'b'
+#define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
+
+#endif
-- 
2.1.4


* [PATCH v6 4/5] drm/i915: Implement end_cpu_access
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

This function is meant to be used with dma-buf mmap, when finishing the CPU
access of the mapped pointer.

The error case should be rare though, as it requires the buffer to become
active during the sync period and end_cpu_access to be interrupted. So we
use an uninterruptible mutex_lock, to make sure we do report the error when
it ever happens.

v2: disable interruption to make sure errors are reported.
v3: update to the new end_cpu_access API.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 65ab2bd..9dba876 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -212,6 +212,27 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire
 	return ret;
 }
 
+static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
+{
+	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE);
+	int ret;
+
+	mutex_lock(&dev->struct_mutex);
+	was_interruptible = dev_priv->mm.interruptible;
+	dev_priv->mm.interruptible = false;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, write);
+
+	dev_priv->mm.interruptible = was_interruptible;
+	mutex_unlock(&dev->struct_mutex);
+
+	if (unlikely(ret))
+		DRM_ERROR("unable to flush buffer following CPU access; rendering may be corrupt\n");
+}
+
 static const struct dma_buf_ops i915_dmabuf_ops =  {
 	.map_dma_buf = i915_gem_map_dma_buf,
 	.unmap_dma_buf = i915_gem_unmap_dma_buf,
@@ -224,6 +245,7 @@ static const struct dma_buf_ops i915_dmabuf_ops =  {
 	.vmap = i915_gem_dmabuf_vmap,
 	.vunmap = i915_gem_dmabuf_vunmap,
 	.begin_cpu_access = i915_gem_begin_cpu_access,
+	.end_cpu_access = i915_gem_end_cpu_access,
 };
 
 struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
-- 
2.1.4


* [PATCH v6 5/5] drm/i915: Use CPU mapping for userspace dma-buf mmap()
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

Userspace is the one in charge of flushing CPU caches, which it does by
wrapping accesses to the mmap'ed memory in begin{,end}_cpu_access.

v2: Remove the LLC check because we have dma-buf sync providers now. Also, fix
the return before transferring ownership when mmap fails.
v3: Fix return values.
v4: !obj->base.filp is user triggerable, so removed the WARN_ON.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 9dba876..b7e7a90 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -193,7 +193,23 @@ static void i915_gem_dmabuf_kunmap(struct dma_buf *dma_buf, unsigned long page_n
 
 static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
 {
-	return -EINVAL;
+	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	int ret;
+
+	if (obj->base.size < vma->vm_end - vma->vm_start)
+		return -EINVAL;
+
+	if (!obj->base.filp)
+		return -ENODEV;
+
+	ret = obj->base.filp->f_op->mmap(obj->base.filp, vma);
+	if (ret)
+		return ret;
+
+	fput(vma->vm_file);
+	vma->vm_file = get_file(obj->base.filp);
+
+	return 0;
 }
 
 static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
-- 
2.1.4


* [PATCH igt v6 1/6] lib: Add gem_userptr and __gem_userptr helpers
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

This patch moves the userptr definitions and helper implementations that were
local to gem_userptr_benchmark and gem_userptr_blits into the library, so
other tests can make use of them as well. There are no functional changes.

v2: added the __ variant to differentiate when errors want to be handled back
in the caller; brought gem_userptr_sync back to gem_userptr_blits; added
gtkdoc.
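
Usage from a test then becomes (a sketch; BO_SIZE is a hypothetical
page-multiple size, and ptr must be page-aligned per the existing userptr
restrictions):

    void *ptr;
    uint32_t handle;

    igt_assert(posix_memalign(&ptr, 4096, BO_SIZE) == 0);

    /* asserting variant: fails (or skips) internally on error */
    gem_userptr(fd, ptr, BO_SIZE, 0, 0, &handle);
    gem_close(fd, handle);

    /* __ variant returns the errno so the caller can handle it */
    igt_assert_eq(__gem_userptr(fd, ptr, BO_SIZE, 0, 0, &handle), 0);
    gem_close(fd, handle);
    free(ptr);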

Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 benchmarks/gem_userptr_benchmark.c |  55 +++-----------------
 lib/ioctl_wrappers.c               |  41 +++++++++++++++
 lib/ioctl_wrappers.h               |  13 +++++
 tests/gem_userptr_blits.c          | 104 ++++++++++---------------------------
 4 files changed, 86 insertions(+), 127 deletions(-)

diff --git a/benchmarks/gem_userptr_benchmark.c b/benchmarks/gem_userptr_benchmark.c
index 1eae7ff..f7716df 100644
--- a/benchmarks/gem_userptr_benchmark.c
+++ b/benchmarks/gem_userptr_benchmark.c
@@ -58,17 +58,6 @@
   #define PAGE_SIZE 4096
 #endif
 
-#define LOCAL_I915_GEM_USERPTR       0x33
-#define LOCAL_IOCTL_I915_GEM_USERPTR DRM_IOWR (DRM_COMMAND_BASE + LOCAL_I915_GEM_USERPTR, struct local_i915_gem_userptr)
-struct local_i915_gem_userptr {
-	uint64_t user_ptr;
-	uint64_t user_size;
-	uint32_t flags;
-#define LOCAL_I915_USERPTR_READ_ONLY (1<<0)
-#define LOCAL_I915_USERPTR_UNSYNCHRONIZED (1<<31)
-	uint32_t handle;
-};
-
 static uint32_t userptr_flags = LOCAL_I915_USERPTR_UNSYNCHRONIZED;
 
 #define BO_SIZE (65536)
@@ -83,30 +72,6 @@ static void gem_userptr_test_synchronized(void)
 	userptr_flags = 0;
 }
 
-static int gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t *handle)
-{
-	struct local_i915_gem_userptr userptr;
-	int ret;
-
-	userptr.user_ptr = (uintptr_t)ptr;
-	userptr.user_size = size;
-	userptr.flags = userptr_flags;
-	if (read_only)
-		userptr.flags |= LOCAL_I915_USERPTR_READ_ONLY;
-
-	ret = drmIoctl(fd, LOCAL_IOCTL_I915_GEM_USERPTR, &userptr);
-	if (ret)
-		ret = errno;
-	igt_skip_on_f(ret == ENODEV &&
-		      (userptr_flags & LOCAL_I915_USERPTR_UNSYNCHRONIZED) == 0 &&
-		      !read_only,
-		      "Skipping, synchronized mappings with no kernel CONFIG_MMU_NOTIFIER?");
-	if (ret == 0)
-		*handle = userptr.handle;
-
-	return ret;
-}
-
 static void **handle_ptr_map;
 static unsigned int num_handle_ptr_map;
 
@@ -144,8 +109,7 @@ static uint32_t create_userptr_bo(int fd, int size)
 	ret = posix_memalign(&ptr, PAGE_SIZE, size);
 	igt_assert(ret == 0);
 
-	ret = gem_userptr(fd, (uint32_t *)ptr, size, 0, &handle);
-	igt_assert(ret == 0);
+	gem_userptr(fd, (uint32_t *)ptr, size, 0, userptr_flags, &handle);
 	add_handle_ptr(handle, ptr);
 
 	return handle;
@@ -167,7 +131,7 @@ static int has_userptr(int fd)
 	assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
 	oldflags = userptr_flags;
 	gem_userptr_test_unsynchronized();
-	ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle);
 	userptr_flags = oldflags;
 	if (ret != 0) {
 		free(ptr);
@@ -379,9 +343,7 @@ static void test_impact_overlap(int fd, const char *prefix)
 
 			for (i = 0, p = block; i < nr_bos[subtest];
 			     i++, p += PAGE_SIZE)
-				ret = gem_userptr(fd, (uint32_t *)p, BO_SIZE, 0,
-						  &handles[i]);
-				igt_assert(ret == 0);
+				gem_userptr(fd, (uint32_t *)p, BO_SIZE, 0, userptr_flags, &handles[i]);
 		}
 
 		if (nr_bos[subtest] > 0)
@@ -427,7 +389,6 @@ static void test_single(int fd)
 	char *ptr, *bo_ptr;
 	uint32_t handle = 0;
 	unsigned long iter = 0;
-	int ret;
 	unsigned long map_size = BO_SIZE + PAGE_SIZE - 1;
 
 	ptr = mmap(NULL, map_size, PROT_READ | PROT_WRITE,
@@ -439,8 +400,7 @@ static void test_single(int fd)
 	start_test(test_duration_sec);
 
 	while (run_test) {
-		ret = gem_userptr(fd, bo_ptr, BO_SIZE, 0, &handle);
-		assert(ret == 0);
+		gem_userptr(fd, bo_ptr, BO_SIZE, 0, userptr_flags, &handle);
 		gem_close(fd, handle);
 		iter++;
 	}
@@ -456,7 +416,6 @@ static void test_multiple(int fd, unsigned int batch, int random)
 	uint32_t handles[10000];
 	int map[10000];
 	unsigned long iter = 0;
-	int ret;
 	int i;
 	unsigned long map_size = batch * BO_SIZE + PAGE_SIZE - 1;
 
@@ -478,10 +437,8 @@ static void test_multiple(int fd, unsigned int batch, int random)
 		if (random)
 			igt_permute_array(map, batch, igt_exchange_int);
 		for (i = 0; i < batch; i++) {
-			ret = gem_userptr(fd, bo_ptr + map[i] * BO_SIZE,
-						BO_SIZE,
-						0, &handles[i]);
-			assert(ret == 0);
+			gem_userptr(fd, bo_ptr + map[i] * BO_SIZE, BO_SIZE,
+						0, userptr_flags, &handles[i]);
 		}
 		if (random)
 			igt_permute_array(map, batch, igt_exchange_int);
diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index e348f26..6cad8a2 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -871,6 +871,47 @@ void gem_context_require_ban_period(int fd)
 	igt_require(has_ban_period);
 }
 
+int __gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle)
+{
+	struct local_i915_gem_userptr userptr;
+	int ret;
+
+	memset(&userptr, 0, sizeof(userptr));
+	userptr.user_ptr = (uintptr_t)ptr;
+	userptr.user_size = size;
+	userptr.flags = flags;
+	if (read_only)
+		userptr.flags |= LOCAL_I915_USERPTR_READ_ONLY;
+
+	ret = drmIoctl(fd, LOCAL_IOCTL_I915_GEM_USERPTR, &userptr);
+	if (ret)
+		ret = errno;
+	igt_skip_on_f(ret == ENODEV &&
+			(flags & LOCAL_I915_USERPTR_UNSYNCHRONIZED) == 0 &&
+			!read_only,
+			"Skipping, synchronized mappings with no kernel CONFIG_MMU_NOTIFIER?");
+	if (ret == 0)
+		*handle = userptr.handle;
+
+	return ret;
+}
+
+/**
+ * gem_userptr:
+ * @fd: open i915 drm file descriptor
+ * @ptr: userptr pointer to be passed
+ * @size: desired size of the buffer
+ * @read_only: specify whether userptr is opened read only
+ * @flags: other userptr flags
+ * @handle: returned handle for the object
+ *
+ * Returns userptr handle for the GEM object.
+ */
+void gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle)
+{
+	igt_assert_eq(__gem_userptr(fd, ptr, size, read_only, flags, handle), 0);
+}
+
 /**
  * gem_sw_finish:
  * @fd: open i915 drm file descriptor
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index fe2f687..bb8a858 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -112,6 +112,19 @@ void gem_context_require_param(int fd, uint64_t param);
 void gem_context_get_param(int fd, struct local_i915_gem_context_param *p);
 void gem_context_set_param(int fd, struct local_i915_gem_context_param *p);
 
+#define LOCAL_I915_GEM_USERPTR       0x33
+#define LOCAL_IOCTL_I915_GEM_USERPTR DRM_IOWR (DRM_COMMAND_BASE + LOCAL_I915_GEM_USERPTR, struct local_i915_gem_userptr)
+struct local_i915_gem_userptr {
+  uint64_t user_ptr;
+  uint64_t user_size;
+  uint32_t flags;
+#define LOCAL_I915_USERPTR_READ_ONLY (1<<0)
+#define LOCAL_I915_USERPTR_UNSYNCHRONIZED (1<<31)
+  uint32_t handle;
+};
+void gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle);
+int __gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t flags, uint32_t *handle);
+
 void gem_sw_finish(int fd, uint32_t handle);
 
 bool gem_bo_busy(int fd, uint32_t handle);
diff --git a/tests/gem_userptr_blits.c b/tests/gem_userptr_blits.c
index 6d38260..95d7ca2 100644
--- a/tests/gem_userptr_blits.c
+++ b/tests/gem_userptr_blits.c
@@ -61,17 +61,6 @@
 #define PAGE_SIZE 4096
 #endif
 
-#define LOCAL_I915_GEM_USERPTR       0x33
-#define LOCAL_IOCTL_I915_GEM_USERPTR DRM_IOWR (DRM_COMMAND_BASE + LOCAL_I915_GEM_USERPTR, struct local_i915_gem_userptr)
-struct local_i915_gem_userptr {
-	uint64_t user_ptr;
-	uint64_t user_size;
-	uint32_t flags;
-#define LOCAL_I915_USERPTR_READ_ONLY (1<<0)
-#define LOCAL_I915_USERPTR_UNSYNCHRONIZED (1<<31)
-	uint32_t handle;
-};
-
 static uint32_t userptr_flags = LOCAL_I915_USERPTR_UNSYNCHRONIZED;
 
 #define WIDTH 512
@@ -89,32 +78,6 @@ static void gem_userptr_test_synchronized(void)
 	userptr_flags = 0;
 }
 
-static int gem_userptr(int fd, void *ptr, int size, int read_only, uint32_t *handle)
-{
-	struct local_i915_gem_userptr userptr;
-	int ret;
-
-	memset(&userptr, 0, sizeof(userptr));
-	userptr.user_ptr = (uintptr_t)ptr;
-	userptr.user_size = size;
-	userptr.flags = userptr_flags;
-	if (read_only)
-		userptr.flags |= LOCAL_I915_USERPTR_READ_ONLY;
-
-	ret = drmIoctl(fd, LOCAL_IOCTL_I915_GEM_USERPTR, &userptr);
-	if (ret)
-		ret = errno;
-	igt_skip_on_f(ret == ENODEV &&
-		      (userptr_flags & LOCAL_I915_USERPTR_UNSYNCHRONIZED) == 0 &&
-		      !read_only,
-		      "Skipping, synchronized mappings with no kernel CONFIG_MMU_NOTIFIER?");
-	if (ret == 0)
-		*handle = userptr.handle;
-
-	return ret;
-}
-
-
 static void gem_userptr_sync(int fd, uint32_t handle)
 {
 	gem_set_domain(fd, handle, I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
@@ -289,10 +252,9 @@ static uint32_t
 create_userptr(int fd, uint32_t val, uint32_t *ptr)
 {
 	uint32_t handle;
-	int i, ret;
+	int i;
 
-	ret = gem_userptr(fd, ptr, sizeof(linear), 0, &handle);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr, sizeof(linear), 0, userptr_flags, &handle);
 	igt_assert(handle != 0);
 
 	/* Fill the BO with dwords starting at val */
@@ -363,7 +325,6 @@ static uint32_t create_userptr_bo(int fd, uint64_t size)
 {
 	void *ptr;
 	uint32_t handle;
-	int ret;
 
 	ptr = mmap(NULL, size,
 		   PROT_READ | PROT_WRITE,
@@ -371,8 +332,7 @@ static uint32_t create_userptr_bo(int fd, uint64_t size)
 		   -1, 0);
 	igt_assert(ptr != MAP_FAILED);
 
-	ret = gem_userptr(fd, (uint32_t *)ptr, size, 0, &handle);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, (uint32_t *)ptr, size, 0, userptr_flags, &handle);
 	add_handle_ptr(handle, ptr, size);
 
 	return handle;
@@ -450,7 +410,7 @@ static int has_userptr(int fd)
 	igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
 	oldflags = userptr_flags;
 	gem_userptr_test_unsynchronized();
-	ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle);
 	userptr_flags = oldflags;
 	if (ret != 0) {
 		free(ptr);
@@ -509,7 +469,7 @@ static int test_access_control(int fd)
 
 		igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
 
-		ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle);
+		ret = __gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle);
 		if (ret == 0)
 			gem_close(fd, handle);
 		free(ptr);
@@ -524,11 +484,9 @@ static int test_access_control(int fd)
 static int test_invalid_null_pointer(int fd)
 {
 	uint32_t handle;
-	int ret;
 
 	/* NULL pointer. */
-	ret = gem_userptr(fd, NULL, PAGE_SIZE, 0, &handle);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, NULL, PAGE_SIZE, 0, userptr_flags, &handle);
 
 	copy(fd, handle, handle, ~0); /* QQQ Precise errno? */
 	gem_close(fd, handle);
@@ -540,7 +498,6 @@ static int test_invalid_gtt_mapping(int fd)
 {
 	uint32_t handle, handle2;
 	void *ptr;
-	int ret;
 
 	/* GTT mapping */
 	handle = create_bo(fd, 0);
@@ -550,8 +507,7 @@ static int test_invalid_gtt_mapping(int fd)
 	igt_assert(((unsigned long)ptr & (PAGE_SIZE - 1)) == 0);
 	igt_assert((sizeof(linear) & (PAGE_SIZE - 1)) == 0);
 
-	ret = gem_userptr(fd, ptr, sizeof(linear), 0, &handle2);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr, sizeof(linear), 0, userptr_flags, &handle2);
 	copy(fd, handle2, handle2, ~0); /* QQQ Precise errno? */
 	gem_close(fd, handle2);
 
@@ -594,8 +550,7 @@ static void test_forked_access(int fd)
 #ifdef MADV_DONTFORK
 	ret |= madvise(ptr1, sizeof(linear), MADV_DONTFORK);
 #endif
-	ret |= gem_userptr(fd, ptr1, sizeof(linear), 0, &handle1);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr1, sizeof(linear), 0, userptr_flags, &handle1);
 	igt_assert(ptr1);
 	igt_assert(handle1);
 
@@ -603,8 +558,7 @@ static void test_forked_access(int fd)
 #ifdef MADV_DONTFORK
 	ret |= madvise(ptr2, sizeof(linear), MADV_DONTFORK);
 #endif
-	ret |= gem_userptr(fd, ptr2, sizeof(linear), 0, &handle2);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr2, sizeof(linear), 0, userptr_flags, &handle2);
 	igt_assert(ptr2);
 	igt_assert(handle2);
 
@@ -651,8 +605,7 @@ static int test_forbidden_ops(int fd)
 
 	igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
 
-	ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle);
 
 	/* pread/pwrite are not always forbidden, but when they
 	 * are they should fail with EINVAL.
@@ -839,19 +792,19 @@ static int test_usage_restrictions(int fd)
 	igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE * 2) == 0);
 
 	/* Address not aligned. */
-	ret = gem_userptr(fd, (char *)ptr + 1, PAGE_SIZE, 0, &handle);
+	ret = __gem_userptr(fd, (char *)ptr + 1, PAGE_SIZE, 0, userptr_flags, &handle);
 	igt_assert_neq(ret, 0);
 
 	/* Size not rounded to page size. */
-	ret = gem_userptr(fd, ptr, PAGE_SIZE - 1, 0, &handle);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE - 1, 0, userptr_flags, &handle);
 	igt_assert_neq(ret, 0);
 
 	/* Both wrong. */
-	ret = gem_userptr(fd, (char *)ptr + 1, PAGE_SIZE - 1, 0, &handle);
+	ret = __gem_userptr(fd, (char *)ptr + 1, PAGE_SIZE - 1, 0, userptr_flags, &handle);
 	igt_assert_neq(ret, 0);
 
 	/* Read-only not supported. */
-	ret = gem_userptr(fd, (char *)ptr, PAGE_SIZE, 1, &handle);
+	ret = __gem_userptr(fd, (char *)ptr, PAGE_SIZE, 1, userptr_flags, &handle);
 	igt_assert_neq(ret, 0);
 
 	free(ptr);
@@ -873,7 +826,7 @@ static int test_create_destroy(int fd, int time)
 		for (n = 0; n < 1000; n++) {
 			igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
 
-			do_or_die(gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle));
+			do_or_die(__gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle));
 
 			gem_close(fd, handle);
 			free(ptr);
@@ -1065,41 +1018,40 @@ static void test_overlap(int fd, int expected)
 
 	igt_assert(posix_memalign((void *)&ptr, PAGE_SIZE, PAGE_SIZE * 3) == 0);
 
-	ret = gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE, 0, &handle);
-	igt_assert_eq(ret, 0);
+	gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE, 0, userptr_flags, &handle);
 
 	/* before, no overlap */
-	ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle2);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert_eq(ret, 0);
 
 	/* after, no overlap */
-	ret = gem_userptr(fd, ptr + PAGE_SIZE * 2, PAGE_SIZE, 0, &handle2);
+	ret = __gem_userptr(fd, ptr + PAGE_SIZE * 2, PAGE_SIZE, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert_eq(ret, 0);
 
 	/* exactly overlapping */
-	ret = gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE, 0, &handle2);
+	ret = __gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert(ret == 0 || ret == expected);
 
 	/* start overlaps */
-	ret = gem_userptr(fd, ptr, PAGE_SIZE * 2, 0, &handle2);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE * 2, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert(ret == 0 || ret == expected);
 
 	/* end overlaps */
-	ret = gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE * 2, 0, &handle2);
+	ret = __gem_userptr(fd, ptr + PAGE_SIZE, PAGE_SIZE * 2, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert(ret == 0 || ret == expected);
 
 	/* subsumes */
-	ret = gem_userptr(fd, ptr, PAGE_SIZE * 3, 0, &handle2);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE * 3, 0, userptr_flags, &handle2);
 	if (ret == 0)
 		gem_close(fd, handle2);
 	igt_assert(ret == 0 || ret == expected);
@@ -1124,8 +1076,7 @@ static void test_unmap(int fd, int expected)
 	bo_ptr = (char *)ALIGN((unsigned long)ptr, PAGE_SIZE);
 
 	for (i = 0; i < num_obj; i++, bo_ptr += sizeof(linear)) {
-		ret = gem_userptr(fd, bo_ptr, sizeof(linear), 0, &bo[i]);
-		igt_assert_eq(ret, 0);
+		gem_userptr(fd, bo_ptr, sizeof(linear), 0, userptr_flags, &bo[i]);
 	}
 
 	bo[num_obj] = create_bo(fd, 0);
@@ -1159,8 +1110,7 @@ static void test_unmap_after_close(int fd)
 	bo_ptr = (char *)ALIGN((unsigned long)ptr, PAGE_SIZE);
 
 	for (i = 0; i < num_obj; i++, bo_ptr += sizeof(linear)) {
-		ret = gem_userptr(fd, bo_ptr, sizeof(linear), 0, &bo[i]);
-		igt_assert_eq(ret, 0);
+		gem_userptr(fd, bo_ptr, sizeof(linear), 0, userptr_flags, &bo[i]);
 	}
 
 	bo[num_obj] = create_bo(fd, 0);
@@ -1230,8 +1180,7 @@ static void test_stress_mm(int fd)
 	igt_assert_eq(ret, 0);
 
 	while (loops--) {
-		ret = gem_userptr(fd, ptr, PAGE_SIZE, 0, &handle);
-		igt_assert_eq(ret, 0);
+		gem_userptr(fd, ptr, PAGE_SIZE, 0, userptr_flags, &handle);
 
 		gem_close(fd, handle);
 	}
@@ -1265,8 +1214,7 @@ static void *mm_userptr_close_thread(void *data)
 	while (!t->stop) {
 		pthread_mutex_unlock(&t->mutex);
 		for (int i = 0; i < num_handles; i++)
-			igt_assert_eq(gem_userptr(t->fd, t->ptr, PAGE_SIZE, 0, &handle[i]),
-				      0);
+			gem_userptr(t->fd, t->ptr, PAGE_SIZE, 0, userptr_flags, &handle[i]);
 		for (int i = 0; i < num_handles; i++)
 			gem_close(t->fd, handle[i]);
 		pthread_mutex_lock(&t->mutex);
-- 
2.1.4


* [PATCH igt v6 2/6] prime_mmap: Add new test for calling mmap() on dma-buf fds
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel
  Cc: daniel.thompson, daniel.vetter, Rob Bradford, thellstrom, jglisse

From: Rob Bradford <rob@linux.intel.com>

This test has the following subtests:
 - test_correct for correctness of the data
 - test_map_unmap checks for mapping idempotency
 - test_reprime checks for dma-buf creation idempotency
 - test_forked checks for multiprocess access
 - test_refcounting checks for buffer reference counting
 - test_dup checks that dup()ing the fd works
 - test_userptr makes sure that mmap fails, due to the lack of
   obj->base.filp in a userptr.
 - test_errors checks the error return values for failures
 - test_aperture_limit tests multiple buffer creation at the gtt aperture
   limit

v2 (Tiago): Removed pattern_check(), which was walking through a useless
iterator. Removed superfluous PROT_WRITE from gem_mmap, in test_correct().
Added binary file to .gitignore
v3 (Tiago): squash patch "prime_mmap: Test for userptr mmap" into this one.
v4 (Tiago): use synchronized userptr for testing. Add test for buffer
overlapping.

Signed-off-by: Rob Bradford <rob@linux.intel.com>
Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 tests/Makefile.sources |   1 +
 tests/prime_mmap.c     | 417 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 418 insertions(+)
 create mode 100644 tests/prime_mmap.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index d594038..75f3cb0 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -96,6 +96,7 @@ TESTS_progs_M = \
 	pm_rps \
 	pm_rc6_residency \
 	pm_sseu \
+	prime_mmap \
 	prime_self_import \
 	template \
 	$(NULL)
diff --git a/tests/prime_mmap.c b/tests/prime_mmap.c
new file mode 100644
index 0000000..95304a9
--- /dev/null
+++ b/tests/prime_mmap.c
@@ -0,0 +1,417 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rob Bradford <rob at linux.intel.com>
+ *
+ */
+
+/*
+ * Testcase: Check whether mmap()ing dma-buf works
+ */
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <pthread.h>
+
+#include "drm.h"
+#include "i915_drm.h"
+#include "drmtest.h"
+#include "igt_debugfs.h"
+#include "ioctl_wrappers.h"
+
+#define BO_SIZE (16*1024)
+
+static int fd;
+
+char pattern[] = {0xff, 0x00, 0x00, 0x00,
+	0x00, 0xff, 0x00, 0x00,
+	0x00, 0x00, 0xff, 0x00,
+	0x00, 0x00, 0x00, 0xff};
+
+static void
+fill_bo(uint32_t handle, size_t size)
+{
+	off_t i;
+	for (i = 0; i < size; i+=sizeof(pattern))
+	{
+		gem_write(fd, handle, i, pattern, sizeof(pattern));
+	}
+}
+
+static void
+test_correct(void)
+{
+	int dma_buf_fd;
+	char *ptr1, *ptr2;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+
+	/* Check correctness vs GEM_MMAP_GTT */
+	ptr1 = gem_mmap__gtt(fd, handle, BO_SIZE, PROT_READ);
+	ptr2 = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr1 != MAP_FAILED);
+	igt_assert(ptr2 != MAP_FAILED);
+	igt_assert(memcmp(ptr1, ptr2, BO_SIZE) == 0);
+
+	/* Check pattern correctness */
+	igt_assert(memcmp(ptr2, pattern, sizeof(pattern)) == 0);
+
+	munmap(ptr1, BO_SIZE);
+	munmap(ptr2, BO_SIZE);
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+}
+
+static void
+test_map_unmap(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+
+	/* Unmap and remap */
+	munmap(ptr, BO_SIZE);
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+
+	munmap(ptr, BO_SIZE);
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+}
+
+/* prime and then unprime and then prime again the same handle */
+static void
+test_reprime(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+
+	close (dma_buf_fd);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+	munmap(ptr, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+
+	munmap(ptr, BO_SIZE);
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+}
+
+/* map from another process */
+static void
+test_forked(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+
+	igt_fork(childno, 1) {
+		ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+		igt_assert(ptr != MAP_FAILED);
+		igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+		munmap(ptr, BO_SIZE);
+		close(dma_buf_fd);
+	}
+	close(dma_buf_fd);
+	igt_waitchildren();
+	gem_close(fd, handle);
+}
+
+static void
+test_refcounting(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+	/* Close gem object before mapping */
+	gem_close(fd, handle);
+
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+	munmap(ptr, BO_SIZE);
+	close (dma_buf_fd);
+}
+
+/* dup before mmap */
+static void
+test_dup(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+
+	dma_buf_fd = dup(prime_handle_to_fd(fd, handle));
+	igt_assert(errno == 0);
+
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+	munmap(ptr, BO_SIZE);
+	gem_close(fd, handle);
+	close (dma_buf_fd);
+}
+
+
+/* Used for error case testing to avoid wrapper */
+static int prime_handle_to_fd_no_assert(uint32_t handle, int *fd_out)
+{
+	struct drm_prime_handle args;
+	int ret;
+
+	args.handle = handle;
+	args.flags = DRM_CLOEXEC;
+	args.fd = -1;
+
+	ret = drmIoctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
+	if (ret)
+		ret = errno;
+	*fd_out = args.fd;
+
+	return ret;
+}
+
+/* test for mmap(dma_buf_export(userptr)) */
+static void
+test_userptr(void)
+{
+	int ret, dma_buf_fd;
+	void *ptr;
+	uint32_t handle;
+
+	/* create userptr bo */
+	ret = posix_memalign(&ptr, 4096, BO_SIZE);
+	igt_assert_eq(ret, 0);
+
+	/* we are not allowed to export unsynchronized userptr. Just create a normal
+	 * one */
+	gem_userptr(fd, (uint32_t *)ptr, BO_SIZE, 0, 0, &handle);
+
+	/* export userptr */
+	ret = prime_handle_to_fd_no_assert(handle, &dma_buf_fd);
+	if (ret) {
+		igt_assert(ret == EINVAL || ret == ENODEV);
+		goto free_userptr;
+	} else {
+		igt_assert_eq(ret, 0);
+		igt_assert_lte(0, dma_buf_fd);
+	}
+
+	/* a userptr doesn't have the obj->base.filp, but can be exported via
+	 * dma-buf, so make sure it fails here */
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr == MAP_FAILED && errno == ENODEV);
+free_userptr:
+	gem_close(fd, handle);
+	close(dma_buf_fd);
+}
+
+static void
+test_errors(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	/* Close gem object before priming */
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+	gem_close(fd, handle);
+	prime_handle_to_fd_no_assert(handle, &dma_buf_fd);
+	igt_assert(dma_buf_fd == -1 && errno == ENOENT);
+	errno = 0;
+
+	/* close fd before mapping */
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+	close(dma_buf_fd);
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr == MAP_FAILED && errno == EBADF);
+	errno = 0;
+	gem_close(fd, handle);
+
+	/* Map too big */
+	handle = gem_create(fd, BO_SIZE);
+	fill_bo(handle, BO_SIZE);
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+	ptr = mmap(NULL, BO_SIZE * 2, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr == MAP_FAILED && errno == EINVAL);
+	errno = 0;
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+
+	/* Overlapping the end of the buffer */
+	handle = gem_create(fd, BO_SIZE);
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	igt_assert(errno == 0);
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, BO_SIZE / 2);
+	igt_assert(ptr == MAP_FAILED && errno == EINVAL);
+	errno = 0;
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+}
+
+static void
+test_aperture_limit(void)
+{
+	int dma_buf_fd1, dma_buf_fd2;
+	char *ptr1, *ptr2;
+	uint32_t handle1, handle2;
+	/* Two buffers the sum of which > mappable aperture */
+	uint64_t size1 = (gem_mappable_aperture_size() * 7) / 8;
+	uint64_t size2 = (gem_mappable_aperture_size() * 3) / 8;
+
+	handle1 = gem_create(fd, size1);
+	fill_bo(handle1, BO_SIZE);
+
+	dma_buf_fd1 = prime_handle_to_fd(fd, handle1);
+	igt_assert(errno == 0);
+	ptr1 = mmap(NULL, size1, PROT_READ, MAP_SHARED, dma_buf_fd1, 0);
+	igt_assert(ptr1 != MAP_FAILED);
+	igt_assert(memcmp(ptr1, pattern, sizeof(pattern)) == 0);
+
+	handle2 = gem_create(fd, size2);
+	fill_bo(handle2, BO_SIZE);
+	dma_buf_fd2 = prime_handle_to_fd(fd, handle2);
+	igt_assert(errno == 0);
+	ptr2 = mmap(NULL, size2, PROT_READ, MAP_SHARED, dma_buf_fd2, 0);
+	igt_assert(ptr2 != MAP_FAILED);
+	igt_assert(memcmp(ptr2, pattern, sizeof(pattern)) == 0);
+
+	igt_assert(memcmp(ptr1, ptr2, BO_SIZE) == 0);
+
+	munmap(ptr1, size1);
+	munmap(ptr2, size2);
+	close(dma_buf_fd1);
+	close(dma_buf_fd2);
+	gem_close(fd, handle1);
+	gem_close(fd, handle2);
+}
+
+static int
+check_for_dma_buf_mmap(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+	int ret = 1;
+
+	handle = gem_create(fd, BO_SIZE);
+	dma_buf_fd = prime_handle_to_fd(fd, handle);
+	ptr = mmap(NULL, BO_SIZE, PROT_READ, MAP_SHARED, dma_buf_fd, 0);
+	if (ptr != MAP_FAILED)
+		ret = 0;
+	munmap(ptr, BO_SIZE);
+	gem_close(fd, handle);
+	close(dma_buf_fd);
+	return ret;
+}
+
+igt_main
+{
+	struct {
+		const char *name;
+		void (*fn)(void);
+	} tests[] = {
+		{ "test_correct", test_correct },
+		{ "test_map_unmap", test_map_unmap },
+		{ "test_reprime", test_reprime },
+		{ "test_forked", test_forked },
+		{ "test_refcounting", test_refcounting },
+		{ "test_dup", test_dup },
+		{ "test_userptr", test_userptr },
+		{ "test_errors", test_errors },
+		{ "test_aperture_limit", test_aperture_limit },
+	};
+	int i;
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_INTEL);
+		errno = 0;
+	}
+
+	igt_skip_on(check_for_dma_buf_mmap() != 0);
+
+	for (i = 0; i < ARRAY_SIZE(tests); i++) {
+		igt_subtest(tests[i].name)
+			tests[i].fn();
+	}
+
+	igt_fixture
+		close(fd);
+}
-- 
2.1.4


* [PATCH igt v6 3/6] prime_mmap: Add basic tests to write in a bo using CPU
  2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
                   ` (6 preceding siblings ...)
  2015-12-16 22:25 ` [PATCH igt v6 2/6] prime_mmap: Add new test for calling mmap() on dma-buf fds Tiago Vignatti
@ 2015-12-16 22:25 ` Tiago Vignatti
  2015-12-16 22:25 ` [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers Tiago Vignatti
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

This patch adds test_correct_cpu_write, which maps the texture buffer through a
prime fd and then writes directly to it using the CPU. It stresses the driver
to guarantee cache synchronization among the different domains.

This patch also adds test_forked_cpu_write, which creates the GEM bo in one
process and passes the prime handle of it to another process, which in turn
uses the handle only to map and write. Roughly speaking this test simulates the
Chrome OS architecture, where the Web content ("unprivileged process") maps and
CPU-draws a buffer that was previously allocated in the GPU process
("privileged process").

This requires kernel modifications (Daniel Thompson's "drm: prime: Honour
O_RDWR during prime-handle-to-fd"), and therefore prime_handle_to_fd_for_mmap
is added and made to fail when those are missing. Also, upcoming tests (e.g.
the next patch) are going to use it as well, so make it public and available
in the lib.

v2: adds prime_handle_to_fd_for_mmap for skipping the test on older kernels
and a test for invalid flags.
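
To illustrate the userspace side this enables (not part of the patch, and
error handling is trimmed): export with DRM_RDWR, then mmap the returned fd
writable, roughly like

    #include <fcntl.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <xf86drm.h>
    #include "drm.h"

    #ifndef DRM_RDWR
    #define DRM_RDWR O_RDWR
    #endif

    static int export_writable(int drm_fd, uint32_t handle)
    {
    	struct drm_prime_handle args;

    	memset(&args, 0, sizeof(args));
    	args.handle = handle;
    	args.flags = DRM_CLOEXEC | DRM_RDWR;
    	args.fd = -1;

    	/* kernels without the O_RDWR patch reject this with EINVAL */
    	if (drmIoctl(drm_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args))
    		return -1;

    	return args.fd;
    }

    /* ...and then: */
    char *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
    		 export_writable(drm_fd, handle), 0);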

Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 lib/ioctl_wrappers.c | 25 +++++++++++++++
 lib/ioctl_wrappers.h |  4 +++
 tests/prime_mmap.c   | 89 ++++++++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 112 insertions(+), 6 deletions(-)

diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index 6cad8a2..86a61ba 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -1329,6 +1329,31 @@ int prime_handle_to_fd(int fd, uint32_t handle)
 }
 
 /**
+ * prime_handle_to_fd_for_mmap:
+ * @fd: open i915 drm file descriptor
+ * @handle: file-private gem buffer object handle
+ *
+ * Same as prime_handle_to_fd above but with DRM_RDWR capabilities, which can
+ * be useful for writing into the mmap'ed dma-buf file-descriptor.
+ *
+ * Returns: The created dma-buf fd handle or -1 if the ioctl fails.
+ */
+int prime_handle_to_fd_for_mmap(int fd, uint32_t handle)
+{
+	struct drm_prime_handle args;
+
+	memset(&args, 0, sizeof(args));
+	args.handle = handle;
+	args.flags = DRM_CLOEXEC | DRM_RDWR;
+	args.fd = -1;
+
+	if (drmIoctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args) != 0)
+		return -1;
+
+	return args.fd;
+}
+
+/**
  * prime_fd_to_handle:
  * @fd: open i915 drm file descriptor
  * @dma_buf_fd: dma-buf fd handle
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index bb8a858..d3ffba2 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -149,6 +149,10 @@ void gem_require_ring(int fd, int ring_id);
 
 /* prime */
 int prime_handle_to_fd(int fd, uint32_t handle);
+#ifndef DRM_RDWR
+#define DRM_RDWR O_RDWR
+#endif
+int prime_handle_to_fd_for_mmap(int fd, uint32_t handle);
 uint32_t prime_fd_to_handle(int fd, int dma_buf_fd);
 off_t prime_get_size(int dma_buf_fd);
 
diff --git a/tests/prime_mmap.c b/tests/prime_mmap.c
index 95304a9..269ada6 100644
--- a/tests/prime_mmap.c
+++ b/tests/prime_mmap.c
@@ -22,6 +22,7 @@
  *
  * Authors:
  *    Rob Bradford <rob at linux.intel.com>
+ *    Tiago Vignatti <tiago.vignatti at intel.com>
  *
  */
 
@@ -66,6 +67,12 @@ fill_bo(uint32_t handle, size_t size)
 }
 
 static void
+fill_bo_cpu(char *ptr)
+{
+	memcpy(ptr, pattern, sizeof(pattern));
+}
+
+static void
 test_correct(void)
 {
 	int dma_buf_fd;
@@ -180,6 +187,65 @@ test_forked(void)
 	gem_close(fd, handle);
 }
 
+/* test simple CPU write */
+static void
+test_correct_cpu_write(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd_for_mmap(fd, handle);
+
+	/* Skip if DRM_RDWR is not supported */
+	igt_skip_on(errno == EINVAL);
+
+	/* Check correctness of map using write protection (PROT_WRITE) */
+	ptr = mmap(NULL, BO_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+
+	/* Fill bo using CPU */
+	fill_bo_cpu(ptr);
+
+	/* Check pattern correctness */
+	igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+
+	munmap(ptr, BO_SIZE);
+	close(dma_buf_fd);
+	gem_close(fd, handle);
+}
+
+/* map from another process and then write using CPU */
+static void
+test_forked_cpu_write(void)
+{
+	int dma_buf_fd;
+	char *ptr;
+	uint32_t handle;
+
+	handle = gem_create(fd, BO_SIZE);
+
+	dma_buf_fd = prime_handle_to_fd_for_mmap(fd, handle);
+
+	/* Skip if DRM_RDWR is not supported */
+	igt_skip_on(errno == EINVAL);
+
+	igt_fork(childno, 1) {
+		ptr = mmap(NULL, BO_SIZE, PROT_READ | PROT_WRITE , MAP_SHARED, dma_buf_fd, 0);
+		igt_assert(ptr != MAP_FAILED);
+		fill_bo_cpu(ptr);
+
+		igt_assert(memcmp(ptr, pattern, sizeof(pattern)) == 0);
+		munmap(ptr, BO_SIZE);
+		close(dma_buf_fd);
+	}
+	close(dma_buf_fd);
+	igt_waitchildren();
+	gem_close(fd, handle);
+}
+
 static void
 test_refcounting(void)
 {
@@ -224,15 +290,14 @@ test_dup(void)
 	close (dma_buf_fd);
 }
 
-
 /* Used for error case testing to avoid wrapper */
-static int prime_handle_to_fd_no_assert(uint32_t handle, int *fd_out)
+static int prime_handle_to_fd_no_assert(uint32_t handle, int flags, int *fd_out)
 {
 	struct drm_prime_handle args;
 	int ret;
 
 	args.handle = handle;
-	args.flags = DRM_CLOEXEC;
+	args.flags = flags;
 	args.fd = -1;
 
 	ret = drmIoctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
@@ -260,7 +325,7 @@ test_userptr(void)
 	gem_userptr(fd, (uint32_t *)ptr, BO_SIZE, 0, 0, &handle);
 
 	/* export userptr */
-	ret = prime_handle_to_fd_no_assert(handle, &dma_buf_fd);
+	ret = prime_handle_to_fd_no_assert(handle, DRM_CLOEXEC, &dma_buf_fd);
 	if (ret) {
 		igt_assert(ret == EINVAL || ret == ENODEV);
 		goto free_userptr;
@@ -281,15 +346,25 @@ free_userptr:
 static void
 test_errors(void)
 {
-	int dma_buf_fd;
+	int i, dma_buf_fd;
 	char *ptr;
 	uint32_t handle;
+	int invalid_flags[] = {DRM_CLOEXEC - 1, DRM_CLOEXEC + 1,
+	                       DRM_RDWR - 1, DRM_RDWR + 1};
+
+	/* Test for invalid flags */
+	handle = gem_create(fd, BO_SIZE);
+	for (i = 0; i < sizeof(invalid_flags) / sizeof(invalid_flags[0]); i++) {
+		prime_handle_to_fd_no_assert(handle, invalid_flags[i], &dma_buf_fd);
+		igt_assert_eq(errno, EINVAL);
+		errno = 0;
+	}
 
 	/* Close gem object before priming */
 	handle = gem_create(fd, BO_SIZE);
 	fill_bo(handle, BO_SIZE);
 	gem_close(fd, handle);
-	prime_handle_to_fd_no_assert(handle, &dma_buf_fd);
+	prime_handle_to_fd_no_assert(handle, DRM_CLOEXEC, &dma_buf_fd);
 	igt_assert(dma_buf_fd == -1 && errno == ENOENT);
 	errno = 0;
 
@@ -392,6 +467,8 @@ igt_main
 		{ "test_map_unmap", test_map_unmap },
 		{ "test_reprime", test_reprime },
 		{ "test_forked", test_forked },
+		{ "test_correct_cpu_write", test_correct_cpu_write },
+		{ "test_forked_cpu_write", test_forked_cpu_write },
 		{ "test_refcounting", test_refcounting },
 		{ "test_dup", test_dup },
 		{ "test_userptr", test_userptr },
-- 
2.1.4


* [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers
  2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
                   ` (7 preceding siblings ...)
  2015-12-16 22:25 ` [PATCH igt v6 3/6] prime_mmap: Add basic tests to write in a bo using CPU Tiago Vignatti
@ 2015-12-16 22:25 ` Tiago Vignatti
  2015-12-17 10:18   ` Daniel Vetter
  2015-12-16 22:25 ` [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests Tiago Vignatti
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

This patch adds dma-buf mmap synchronization ioctls that can be used by tests
for cache coherency management e.g. when CPU and GPU domains are being accessed
through dma-buf at the same time.
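
For reference, the intended calling sequence with these helpers looks roughly
like this (just a sketch, assuming a writable mapping obtained through
prime_handle_to_fd_for_mmap):

    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_buf_fd, 0);
    igt_assert(ptr != MAP_FAILED);

    prime_sync_start(dma_buf_fd);	/* SYNC_START | SYNC_RW -> begin_cpu_access */
    memcpy(ptr, data, size);		/* CPU access through the mapping */
    prime_sync_end(dma_buf_fd);		/* SYNC_END | SYNC_RW -> end_cpu_access */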

Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 lib/ioctl_wrappers.c | 26 ++++++++++++++++++++++++++
 lib/ioctl_wrappers.h | 15 +++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index 86a61ba..0d84d00 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -1400,6 +1400,32 @@ off_t prime_get_size(int dma_buf_fd)
 }
 
 /**
+ * prime_sync_start
+ * @dma_buf_fd: dma-buf fd handle
+ */
+void prime_sync_start(int dma_buf_fd)
+{
+	struct local_dma_buf_sync sync_start;
+
+	memset(&sync_start, 0, sizeof(sync_start));
+	sync_start.flags = LOCAL_DMA_BUF_SYNC_START | LOCAL_DMA_BUF_SYNC_RW;
+	do_ioctl(dma_buf_fd, LOCAL_DMA_BUF_IOCTL_SYNC, &sync_start);
+}
+
+/**
+ * prime_sync_end
+ * @dma_buf_fd: dma-buf fd handle
+ */
+void prime_sync_end(int dma_buf_fd)
+{
+	struct local_dma_buf_sync sync_end;
+
+	memset(&sync_end, 0, sizeof(sync_end));
+	sync_end.flags = LOCAL_DMA_BUF_SYNC_END | LOCAL_DMA_BUF_SYNC_RW;
+	do_ioctl(dma_buf_fd, LOCAL_DMA_BUF_IOCTL_SYNC, &sync_end);
+}
+
+/**
  * igt_require_fb_modifiers:
  * @fd: Open DRM file descriptor.
  *
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index d3ffba2..cbd7a73 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -148,6 +148,19 @@ void gem_require_caching(int fd);
 void gem_require_ring(int fd, int ring_id);
 
 /* prime */
+struct local_dma_buf_sync {
+	uint64_t flags;
+};
+
+#define LOCAL_DMA_BUF_SYNC_RW        (3 << 0)
+#define LOCAL_DMA_BUF_SYNC_START     (0 << 2)
+#define LOCAL_DMA_BUF_SYNC_END       (1 << 2)
+#define LOCAL_DMA_BUF_SYNC_VALID_FLAGS_MASK \
+		(LOCAL_DMA_BUF_SYNC_RW | LOCAL_DMA_BUF_SYNC_END)
+
+#define LOCAL_DMA_BUF_BASE 'b'
+#define LOCAL_DMA_BUF_IOCTL_SYNC _IOW(LOCAL_DMA_BUF_BASE, 0, struct local_dma_buf_sync)
+
 int prime_handle_to_fd(int fd, uint32_t handle);
 #ifndef DRM_RDWR
 #define DRM_RDWR O_RDWR
@@ -155,6 +168,8 @@ int prime_handle_to_fd(int fd, uint32_t handle);
 int prime_handle_to_fd_for_mmap(int fd, uint32_t handle);
 uint32_t prime_fd_to_handle(int fd, int dma_buf_fd);
 off_t prime_get_size(int dma_buf_fd);
+void prime_sync_start(int dma_buf_fd);
+void prime_sync_end(int dma_buf_fd);
 
 /* addfb2 fb modifiers */
 struct local_drm_mode_fb_cmd2 {
-- 
2.1.4


* [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests
  2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
                   ` (8 preceding siblings ...)
  2015-12-16 22:25 ` [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers Tiago Vignatti
@ 2015-12-16 22:25 ` Tiago Vignatti
  2015-12-17  7:53   ` Chris Wilson
  2015-12-16 22:25 ` [PATCH igt v6 6/6] tests: Add prime_mmap_coherency " Tiago Vignatti
  2015-12-17 10:15 ` Direct userspace dma-buf mmap (v6) Daniel Vetter
  11 siblings, 1 reply; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

This program can be used to detect when CPU writes to the dma-buf mapped object
don't land in scanout due to cache incoherency.

Although this seems to be a problem inherent to non-LLC machines ("Atom"), this
particular test catches cache dirt on scanout on LLC machines as well. It's
inspired by Ville's kms_pwrite_crc.c and can also be used to test the
correctness of the driver's begin_cpu_access and end_cpu_access (which requires
the i915 implementation).

To see the need for the flush, one has to run this same binary a few times
because it's not 100% reproducible -- what I usually do is the following, using
the '-n' option to not call the sync ioctls:

    $ while ((1)) ; do ./kms_mmap_write_crc -n; done  # in terminal A
    $ find /                                          # in terminal B

That will most likely thrash the memory while the test catches the coherency
issue. If you now drop '-n', things should just work as expected.

I tested this on !llc and llc platforms, BYT and IVB respectively.

v2: use prime_handle_to_fd_for_mmap instead.
v3: merge end_cpu_access() patch with this and provide options to disable sync.
v4: use library's prime_sync_{start,end} instead.
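
The core of the check boils down to this sequence (sketch, names as in the
test below):

    ptr = dmabuf_mmap_framebuffer(data->drm_fd, fb);

    if (ioctl_sync)
    	prime_sync_start(dma_buf_fd);	/* begin_cpu_access */
    memset(ptr, 0xff, fb->size);	/* CPU-draw the fb all white */
    if (ioctl_sync)
    	prime_sync_end(dma_buf_fd);	/* end_cpu_access */

    /* flip, collect the pipe CRC and compare against the white reference;
     * with '-n' (no sync ioctls) the CRCs intermittently mismatch */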

Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 tests/Makefile.sources     |   1 +
 tests/kms_mmap_write_crc.c | 281 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 282 insertions(+)
 create mode 100644 tests/kms_mmap_write_crc.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 75f3cb0..ad2dd6a 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -168,6 +168,7 @@ TESTS_progs = \
 	kms_3d \
 	kms_fence_pin_leak \
 	kms_force_connector_basic \
+	kms_mmap_write_crc \
 	kms_pwrite_crc \
 	kms_sink_crc_basic \
 	prime_udl \
diff --git a/tests/kms_mmap_write_crc.c b/tests/kms_mmap_write_crc.c
new file mode 100644
index 0000000..6a12539
--- /dev/null
+++ b/tests/kms_mmap_write_crc.c
@@ -0,0 +1,281 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Tiago Vignatti <tiago.vignatti at intel.com>
+ */
+
+#include <errno.h>
+#include <limits.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <string.h>
+
+#include "drmtest.h"
+#include "igt_debugfs.h"
+#include "igt_kms.h"
+#include "intel_chipset.h"
+#include "ioctl_wrappers.h"
+#include "igt_aux.h"
+
+IGT_TEST_DESCRIPTION(
+   "Use the display CRC support to validate mmap write to an already uncached future scanout buffer.");
+
+typedef struct {
+	int drm_fd;
+	igt_display_t display;
+	struct igt_fb fb[2];
+	igt_output_t *output;
+	igt_plane_t *primary;
+	enum pipe pipe;
+	igt_crc_t ref_crc;
+	igt_pipe_crc_t *pipe_crc;
+	uint32_t devid;
+} data_t;
+
+static bool ioctl_sync = true;
+int dma_buf_fd;
+
+static char *dmabuf_mmap_framebuffer(int drm_fd, struct igt_fb *fb)
+{
+	char *ptr = NULL;
+
+	dma_buf_fd = prime_handle_to_fd_for_mmap(drm_fd, fb->gem_handle);
+	igt_skip_on(dma_buf_fd == -1 && errno == EINVAL);
+
+	ptr = mmap(NULL, fb->size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr != MAP_FAILED);
+
+	return ptr;
+}
+
+static void test(data_t *data)
+{
+	igt_display_t *display = &data->display;
+	igt_output_t *output = data->output;
+	struct igt_fb *fb = &data->fb[1];
+	drmModeModeInfo *mode;
+	cairo_t *cr;
+	char *ptr;
+	uint32_t caching;
+	void *buf;
+	igt_crc_t crc;
+
+	mode = igt_output_get_mode(output);
+
+	/* create a non-white fb where we can write later */
+	igt_create_fb(data->drm_fd, mode->hdisplay, mode->vdisplay,
+		      DRM_FORMAT_XRGB8888, LOCAL_DRM_FORMAT_MOD_NONE, fb);
+
+	ptr = dmabuf_mmap_framebuffer(data->drm_fd, fb);
+
+	cr = igt_get_cairo_ctx(data->drm_fd, fb);
+	igt_paint_test_pattern(cr, fb->width, fb->height);
+	cairo_destroy(cr);
+
+	/* flip to it to make it UC/WC and fully flushed */
+	igt_plane_set_fb(data->primary, fb);
+	igt_display_commit(display);
+
+	/* flip back the original white buffer */
+	igt_plane_set_fb(data->primary, &data->fb[0]);
+	igt_display_commit(display);
+
+	/* make sure caching mode has become UC/WT */
+	caching = gem_get_caching(data->drm_fd, fb->gem_handle);
+	igt_assert(caching == I915_CACHING_NONE || caching == I915_CACHING_DISPLAY);
+
+	/*
+	 * firstly demonstrate the need for DMA_BUF_SYNC_START ("begin_cpu_access")
+	 */
+	if (ioctl_sync)
+		prime_sync_start(dma_buf_fd);
+
+	/* use dmabuf pointer to make the other fb all white too */
+	buf = malloc(fb->size);
+	igt_assert(buf != NULL);
+	memset(buf, 0xff, fb->size);
+	memcpy(ptr, buf, fb->size);
+	free(buf);
+
+	/* and flip to it */
+	igt_plane_set_fb(data->primary, fb);
+	igt_display_commit(display);
+
+	/* check that the crc is as expected, which requires that caches got flushed */
+	igt_pipe_crc_collect_crc(data->pipe_crc, &crc);
+	igt_assert_crc_equal(&crc, &data->ref_crc);
+
+	/*
+	 * now demonstrate the need for DMA_BUF_SYNC_END ("end_cpu_access")
+	 */
+
+	/* start over, writing non-white to the fb again and flip to it to make it
+	 * fully flushed */
+	cr = igt_get_cairo_ctx(data->drm_fd, fb);
+	igt_paint_test_pattern(cr, fb->width, fb->height);
+	cairo_destroy(cr);
+
+	igt_plane_set_fb(data->primary, fb);
+	igt_display_commit(display);
+
+	/* sync start, to move to CPU domain */
+	if (ioctl_sync)
+		prime_sync_start(dma_buf_fd);
+
+	/* use dmabuf pointer in the same fb to make it all white */
+	buf = malloc(fb->size);
+	igt_assert(buf != NULL);
+	memset(buf, 0xff, fb->size);
+	memcpy(ptr, buf, fb->size);
+	free(buf);
+
+	/* if we don't change to the GTT domain again, the whites won't get flushed
+	 * and therefore we demonstrate the need for sync end here */
+	if (ioctl_sync)
+		prime_sync_end(dma_buf_fd);
+
+	/* check that the crc is as expected, which requires that caches got flushed */
+	igt_pipe_crc_collect_crc(data->pipe_crc, &crc);
+	igt_assert_crc_equal(&crc, &data->ref_crc);
+}
+
+static bool prepare_crtc(data_t *data)
+{
+	igt_display_t *display = &data->display;
+	igt_output_t *output = data->output;
+	drmModeModeInfo *mode;
+
+	/* select the pipe we want to use */
+	igt_output_set_pipe(output, data->pipe);
+	igt_display_commit(display);
+
+	if (!output->valid) {
+		igt_output_set_pipe(output, PIPE_ANY);
+		igt_display_commit(display);
+		return false;
+	}
+
+	mode = igt_output_get_mode(output);
+
+	/* create a white reference fb and flip to it */
+	igt_create_color_fb(data->drm_fd, mode->hdisplay, mode->vdisplay,
+			    DRM_FORMAT_XRGB8888, LOCAL_DRM_FORMAT_MOD_NONE,
+			    1.0, 1.0, 1.0, &data->fb[0]);
+
+	data->primary = igt_output_get_plane(output, IGT_PLANE_PRIMARY);
+
+	igt_plane_set_fb(data->primary, &data->fb[0]);
+	igt_display_commit(display);
+
+	if (data->pipe_crc)
+		igt_pipe_crc_free(data->pipe_crc);
+
+	data->pipe_crc = igt_pipe_crc_new(data->pipe,
+					  INTEL_PIPE_CRC_SOURCE_AUTO);
+
+	/* get reference crc for the white fb */
+	igt_pipe_crc_collect_crc(data->pipe_crc, &data->ref_crc);
+
+	return true;
+}
+
+static void cleanup_crtc(data_t *data)
+{
+	igt_display_t *display = &data->display;
+	igt_output_t *output = data->output;
+
+	igt_pipe_crc_free(data->pipe_crc);
+	data->pipe_crc = NULL;
+
+	igt_plane_set_fb(data->primary, NULL);
+
+	igt_output_set_pipe(output, PIPE_ANY);
+	igt_display_commit(display);
+
+	igt_remove_fb(data->drm_fd, &data->fb[0]);
+	igt_remove_fb(data->drm_fd, &data->fb[1]);
+}
+
+static void run_test(data_t *data)
+{
+	igt_display_t *display = &data->display;
+	igt_output_t *output;
+	enum pipe pipe;
+
+	for_each_connected_output(display, output) {
+		data->output = output;
+		for_each_pipe(display, pipe) {
+			data->pipe = pipe;
+
+			if (!prepare_crtc(data))
+				continue;
+
+			test(data);
+			cleanup_crtc(data);
+
+			/* once is enough */
+			return;
+		}
+	}
+
+	igt_skip("no valid crtc/connector combinations found\n");
+}
+
+static int opt_handler(int opt, int opt_index, void *data)
+{
+	if (opt == 'n') {
+		ioctl_sync = false;
+		igt_info("set via cmd line to not use sync ioctls\n");
+	}
+
+	return 0;
+}
+
+static data_t data;
+
+int main(int argc, char **argv)
+{
+	igt_simple_init_parse_opts(&argc, argv, "n", NULL, NULL, opt_handler, NULL);
+
+	igt_skip_on_simulation();
+
+	igt_fixture {
+		data.drm_fd = drm_open_driver_master(DRIVER_INTEL);
+
+		data.devid = intel_get_drm_devid(data.drm_fd);
+
+		kmstest_set_vt_graphics_mode();
+
+		igt_require_pipe_crc();
+
+		igt_display_init(&data.display, data.drm_fd);
+	}
+
+	run_test(&data);
+
+	igt_fixture {
+		igt_display_fini(&data.display);
+	}
+
+	igt_exit();
+}
-- 
2.1.4


* [PATCH igt v6 6/6] tests: Add prime_mmap_coherency for cache coherency tests
  2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
                   ` (9 preceding siblings ...)
  2015-12-16 22:25 ` [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests Tiago Vignatti
@ 2015-12-16 22:25 ` Tiago Vignatti
  2015-12-17 10:15 ` Direct userspace dma-buf mmap (v6) Daniel Vetter
  11 siblings, 0 replies; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-16 22:25 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.thompson, daniel.vetter, thellstrom, jglisse

Different from kms_mmap_write_crc, which captures coherency issues within the
scanout mapped buffer, this one is meant to test dma-buf mmap mostly on !llc
platforms and to provoke coherency bugs, so we know for sure where we need the
sync ioctls.

I tested this on !llc and llc platforms, BYT and IVB respectively.
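
In short, the pattern being exercised is (sketch, names as in the test below):

    /* read flush: the GPU (blitter) has just written the bo, now read it
     * with the CPU; the sync invalidates stale CPU cachelines on !llc */
    prime_sync_start(dma_buf_fd);
    igt_assert_eq(ptr_cpu[i], expected);

    /* write flush: CPU writes that the GPU will consume next; without the
     * sync some cachelines never get flushed out on !llc */
    prime_sync_start(dma_buf_fd);
    memset(ptr_cpu, 0x11, size);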

Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
---
 tests/Makefile.sources       |   1 +
 tests/prime_mmap_coherency.c | 246 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 247 insertions(+)
 create mode 100644 tests/prime_mmap_coherency.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index ad2dd6a..78605c6 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -97,6 +97,7 @@ TESTS_progs_M = \
 	pm_rc6_residency \
 	pm_sseu \
 	prime_mmap \
+	prime_mmap_coherency \
 	prime_self_import \
 	template \
 	$(NULL)
diff --git a/tests/prime_mmap_coherency.c b/tests/prime_mmap_coherency.c
new file mode 100644
index 0000000..a9a2664
--- /dev/null
+++ b/tests/prime_mmap_coherency.c
@@ -0,0 +1,246 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Tiago Vignatti
+ */
+
+/** @file prime_mmap_coherency.c
+ *
+ * TODO: need to show the need for prime_sync_end().
+ */
+
+#include "igt.h"
+
+IGT_TEST_DESCRIPTION("Test dma-buf mmap on !llc platforms mostly and provoke"
+		" coherency bugs so we know for sure where we need the sync ioctls.");
+
+#define ROUNDS 20
+
+int fd;
+int stale = 0;
+static drm_intel_bufmgr *bufmgr;
+struct intel_batchbuffer *batch;
+static int width = 1024, height = 1024;
+
+/*
+ * Exercises the need for read flush:
+ *   1. create a BO and write '0's, in GTT domain.
+ *   2. read BO using the dma-buf CPU mmap.
+ *   3. write '1's, in GTT domain.
+ *   4. read again through the mapped dma-buf.
+ */
+static void test_read_flush(bool expect_stale_cache)
+{
+	drm_intel_bo *bo_1;
+	drm_intel_bo *bo_2;
+	uint32_t *ptr_cpu;
+	uint32_t *ptr_gtt;
+	int dma_buf_fd, i;
+
+	if (expect_stale_cache)
+		igt_require(!gem_has_llc(fd));
+
+	bo_1 = drm_intel_bo_alloc(bufmgr, "BO 1", width * height * 4, 4096);
+
+	/* STEP #1: put the BO 1 in GTT domain. We use the blitter to copy and fill
+	 * zeros to BO 1, so commands will be submitted and likely to place BO 1 in
+	 * the GTT domain. */
+	bo_2 = drm_intel_bo_alloc(bufmgr, "BO 2", width * height * 4, 4096);
+	intel_copy_bo(batch, bo_1, bo_2, width * height);
+	gem_sync(fd, bo_1->handle);
+	drm_intel_bo_unreference(bo_2);
+
+	/* STEP #2: read BO 1 using the dma-buf CPU mmap. This dirties the CPU caches. */
+	dma_buf_fd = prime_handle_to_fd_for_mmap(fd, bo_1->handle);
+	igt_skip_on(errno == EINVAL);
+
+	ptr_cpu = mmap(NULL, width * height, PROT_READ | PROT_WRITE,
+		       MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr_cpu != MAP_FAILED);
+
+	for (i = 0; i < (width * height) / 4; i++)
+		igt_assert_eq(ptr_cpu[i], 0);
+
+	/* STEP #3: write 0x11 into BO 1. */
+	bo_2 = drm_intel_bo_alloc(bufmgr, "BO 2", width * height * 4, 4096);
+	ptr_gtt = gem_mmap__gtt(fd, bo_2->handle, width * height, PROT_READ | PROT_WRITE);
+	memset(ptr_gtt, 0x11, width * height);
+	munmap(ptr_gtt, width * height);
+
+	intel_copy_bo(batch, bo_1, bo_2, width * height);
+	gem_sync(fd, bo_1->handle);
+	drm_intel_bo_unreference(bo_2);
+
+	/* STEP #4: read again using the CPU mmap. Doing #1 before #3 makes sure we
+	 * don't do a full CPU cache flush in step #3 again. That makes sure all the
+	 * stale cachelines from step #2 survive (mostly, a few will be evicted)
+	 * until we try to read them again in step #4. This behavior could be fixed
+	 * by flushing the CPU read cache right before accessing the CPU pointer */
+	if (!expect_stale_cache)
+		prime_sync_start(dma_buf_fd);
+
+	for (i = 0; i < (width * height) / 4; i++)
+		if (ptr_cpu[i] != 0x11111111) {
+			igt_warn_on_f(!expect_stale_cache,
+				    "Found 0x%08x at offset 0x%08x\n", ptr_cpu[i], i);
+			stale++;
+		}
+
+	drm_intel_bo_unreference(bo_1);
+	munmap(ptr_cpu, width * height);
+}
+
+/*
+ * Exercises the need for write flush:
+ *   1. create BO 1 and write '0's, in GTT domain.
+ *   2. write '1's into BO 1 using the dma-buf CPU mmap.
+ *   3. copy BO 1 to new BO 2, in GTT domain.
+ *   4. read via dma-buf mmap BO 2.
+ */
+static void test_write_flush(bool expect_stale_cache)
+{
+	drm_intel_bo *bo_1;
+	drm_intel_bo *bo_2;
+	uint32_t *ptr_cpu;
+	uint32_t *ptr2_cpu;
+	int dma_buf_fd, dma_buf2_fd, i;
+
+	if (expect_stale_cache)
+		igt_require(!gem_has_llc(fd));
+
+	bo_1 = drm_intel_bo_alloc(bufmgr, "BO 1", width * height * 4, 4096);
+
+	/* STEP #1: Put the BO 1 in GTT domain. We use the blitter to copy and fill
+	 * zeros to BO 1, so commands will be submitted and likely to place BO 1 in
+	 * the GTT domain. */
+	bo_2 = drm_intel_bo_alloc(bufmgr, "BO 2", width * height * 4, 4096);
+	intel_copy_bo(batch, bo_1, bo_2, width * height);
+	gem_sync(fd, bo_1->handle);
+	drm_intel_bo_unreference(bo_2);
+
+	/* STEP #2: Write '1's into BO 1 using the dma-buf CPU mmap. */
+	dma_buf_fd = prime_handle_to_fd_for_mmap(fd, bo_1->handle);
+	igt_skip_on(errno == EINVAL);
+
+	ptr_cpu = mmap(NULL, width * height, PROT_READ | PROT_WRITE,
+		       MAP_SHARED, dma_buf_fd, 0);
+	igt_assert(ptr_cpu != MAP_FAILED);
+
+	/* This is the main point of this test: !llc hw requires a cache write
+	 * flush right here (explained in step #4). */
+	if (!expect_stale_cache)
+		prime_sync_start(dma_buf_fd);
+
+	memset(ptr_cpu, 0x11, width * height);
+
+	/* STEP #3: Copy BO 1 into BO 2, using blitter. */
+	bo_2 = drm_intel_bo_alloc(bufmgr, "BO 2", width * height * 4, 4096);
+	intel_copy_bo(batch, bo_2, bo_1, width * height);
+	gem_sync(fd, bo_2->handle);
+
+	/* STEP #4: compare BO 2 against written BO 1. In !llc hardware, there
+	 * should be some cache lines that didn't get flushed out and are still 0,
+	 * requiring cache flush before the write in step 2. */
+	dma_buf2_fd = prime_handle_to_fd_for_mmap(fd, bo_2->handle);
+	igt_skip_on(errno == EINVAL);
+
+	ptr2_cpu = mmap(NULL, width * height, PROT_READ | PROT_WRITE,
+		        MAP_SHARED, dma_buf2_fd, 0);
+	igt_assert(ptr2_cpu != MAP_FAILED);
+
+	for (i = 0; i < (width * height) / 4; i++)
+		if (ptr2_cpu[i] != 0x11111111) {
+			igt_warn_on_f(!expect_stale_cache,
+				      "Found 0x%08x at offset 0x%08x\n", ptr2_cpu[i], i);
+			stale++;
+		}
+
+	drm_intel_bo_unreference(bo_1);
+	drm_intel_bo_unreference(bo_2);
+	munmap(ptr_cpu, width * height);
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	bool expect_stale_cache;
+	igt_subtest_init(argc, argv);
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_INTEL);
+
+		bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
+		batch = intel_batchbuffer_alloc(bufmgr, intel_get_drm_devid(fd));
+	}
+
+	/* Cache coherency and the eviction are pretty much unpredictable, so
+	 * reproducing boils down to trial and error to hit different scenarios.
+	 * TODO: We may want to improve tests a bit by picking random subranges. */
+	igt_info("%d rounds for each test\n", ROUNDS);
+	igt_subtest("read") {
+		stale = 0;
+		expect_stale_cache = false;
+		igt_info("exercising read flush\n");
+		for (i = 0; i < ROUNDS; i++)
+			test_read_flush(expect_stale_cache);
+		igt_fail_on_f(stale, "num of stale cache lines %d\n", stale);
+	}
+
+	/* Only for !llc platforms */
+	igt_subtest("read-and-fail") {
+		stale = 0;
+		expect_stale_cache = true;
+		igt_info("exercising read flush and expect to fail on !llc\n");
+		for (i = 0; i < ROUNDS; i++)
+			test_read_flush(expect_stale_cache);
+		igt_fail_on_f(!stale, "couldn't find any stale cache lines\n");
+	}
+
+	igt_subtest("write") {
+		stale = 0;
+		expect_stale_cache = false;
+		igt_info("exercising write flush\n");
+		for (i = 0; i < ROUNDS; i++)
+			test_write_flush(expect_stale_cache);
+		igt_fail_on_f(stale, "num of stale cache lines %d\n", stale);
+	}
+
+	/* Only for !llc platforms */
+	igt_subtest("write-and-fail") {
+		stale = 0;
+		expect_stale_cache = true;
+		igt_info("exercising write flush and expect to fail on !llc\n");
+		for (i = 0; i < ROUNDS; i++)
+			test_write_flush(expect_stale_cache);
+		igt_fail_on_f(!stale, "couldn't find any stale cache lines\n");
+	}
+
+	igt_fixture {
+		intel_batchbuffer_free(batch);
+		drm_intel_bufmgr_destroy(bufmgr);
+
+		close(fd);
+	}
+
+	igt_exit();
+}
-- 
2.1.4


* Re: [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests
  2015-12-16 22:25 ` [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests Tiago Vignatti
@ 2015-12-17  7:53   ` Chris Wilson
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2015-12-17  7:53 UTC (permalink / raw)
  To: Tiago Vignatti
  Cc: jglisse, daniel.vetter, daniel.thompson, thellstrom, dri-devel

On Wed, Dec 16, 2015 at 08:25:42PM -0200, Tiago Vignatti wrote:
> This program can be used to detect when CPU writes to the dma-buf mapped object
> don't land in scanout due to cache incoherency.
> 
> Although this seems to be a problem inherent to non-LLC machines ("Atom"), this
> particular test catches cache dirt on scanout on LLC machines as well. It's
> inspired by Ville's kms_pwrite_crc.c and can also be used to test the
> correctness of the driver's begin_cpu_access and end_cpu_access (which requires
> the i915 implementation).
> 
> To see the need for the flush, one has to run this same binary a few times
> because it's not 100% reproducible -- what I usually do is the following, using
> the '-n' option to not call the sync ioctls:
> 
>     $ while ((1)) ; do ./kms_mmap_write_crc -n; done  # in terminal A
>     $ find /                                          # in terminal B

Sounds like we need an igt_fork_memhog_helper() and to repeat the test for
20s or until failure?
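
Something along these lines, perhaps (hypothetical sketch, no such helper
exists in the lib yet; built on the existing igt_fork_helper()):

    static struct igt_helper_process memhog;

    static void igt_fork_memhog_helper(void)
    {
    	igt_fork_helper(&memhog) {
    		for (;;) {
    			void *p = malloc(128 << 20);	/* 128 MiB */

    			if (!p)
    				continue;
    			/* dirty the pages to push cachelines around */
    			memset(p, 0x5a, 128 << 20);
    			free(p);
    		}
    	}
    }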
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH v6 4/5] drm/i915: Implement end_cpu_access
  2015-12-16 22:25 ` [PATCH v6 4/5] drm/i915: Implement end_cpu_access Tiago Vignatti
@ 2015-12-17  8:01   ` Chris Wilson
  2015-12-18 19:02     ` Tiago Vignatti
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2015-12-17  8:01 UTC (permalink / raw)
  To: Tiago Vignatti
  Cc: jglisse, daniel.vetter, daniel.thompson, thellstrom, dri-devel

On Wed, Dec 16, 2015 at 08:25:36PM -0200, Tiago Vignatti wrote:
> This function is meant to be used with dma-buf mmap, when finishing the CPU
> access of the mapped pointer.
> 
> +static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
> +{
> +	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
> +	struct drm_device *dev = obj->base.dev;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE);
> +	int ret;
> +
> +	mutex_lock(&dev->struct_mutex);
> +	was_interruptible = dev_priv->mm.interruptible;
> +	dev_priv->mm.interruptible = false;
> +
> +	ret = i915_gem_object_set_to_gtt_domain(obj, write);

This only needs to pass .write=false. The dma-buf direction is
only for the period of the user access, and we are now flushing the
caches. This is equivalent to the sw-finish ioctl and ideally we just
want the i915_gem_object_flush_cpu_write_domain().
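
I.e. something like this (sketch only;
i915_gem_object_flush_cpu_write_domain() is currently static in i915_gem.c,
so it would have to be exposed first):

    static void i915_gem_end_cpu_access(struct dma_buf *dma_buf,
    				    enum dma_data_direction direction)
    {
    	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
    	struct drm_device *dev = obj->base.dev;

    	mutex_lock(&dev->struct_mutex);
    	i915_gem_object_flush_cpu_write_domain(obj);
    	mutex_unlock(&dev->struct_mutex);
    }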
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: Direct userspace dma-buf mmap (v6)
  2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
                   ` (10 preceding siblings ...)
  2015-12-16 22:25 ` [PATCH igt v6 6/6] tests: Add prime_mmap_coherency " Tiago Vignatti
@ 2015-12-17 10:15 ` Daniel Vetter
  11 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2015-12-17 10:15 UTC (permalink / raw)
  To: Tiago Vignatti
  Cc: daniel.thompson, daniel.vetter, thellstrom, dri-devel, jglisse

On Wed, Dec 16, 2015 at 08:25:32PM -0200, Tiago Vignatti wrote:
> Hi all,
> 
> The last version of this work was sent a while ago here:
> 
> http://lists.freedesktop.org/archives/dri-devel/2015-August/089263.html
> 
> So let's recap this series:
> 
>     1. it adds a vendor-independent client interface for mapping gem objects
>        through prime, IOW it implements userspace mmap() on dma-buf fd.
>        This could be used for texturing from CPU rendered buffer, passing
>        buffers among processes without performing copies in the userspace.
>     2. the series lets the client write on the mmap'ed memory, and
>     3. it deals with GPU and CPU caches synchronization.
> 
> Based on previous discussions seems that people are fine with 1. and 2. but 
> not really with 3., given that caches coherency is a bit more boring to deal 
> with.
> 
> It's easier to use this new infra on "coherent hardware" (systems with the
> memory cache that is shared by the GPU and CPU) because they rarely need to
> use that kind of synchronization. But would be much more convenient to have 
> the very same interface exposed for clients no matter whether the underlying 
> hardware is cache coherent or not.
> 
> One idea that came up was to force clients to call the sync ioctls after the
> dma-buf was mmaped. But apparently there's no easy, and performant, way to do
> so cause seems too costly to go over the page table entry and check the dirty
> bits. Also, depending on the instructions order sent for the devices, it
> might be needed a sync call after the mapped region gets accessed as well, to
> flush all cachelines and make sure for example the GPU domain won't read stale 
> data. So that would make the things even more complicated, if we ever decide
> to go to this direction of forcing sync ioctls. The alternative therefore is to
> simply document it very well, strong wording the clients to use the sync ioctl 
> regardless otherwise they will mis-behave. Do we have objections or maybe 
> other wiser ways to circumvent this? I've made similar comments in August and
> no one has came up with better ideas.

I still think this is as good as it'll get. We can't force userspace to
behave without a serious perf hit, and without enforcing it all the time
there's not much use in it. Also there's the problem that mmap interfaces
in the kernel don't really allow you (at least easily) to intercept mmap
access.

It might make sense later on as a debug feature in case you do have a
coherency bug and want to know who screwed up. Similar to what exists for
the dma api.

Quickly looked through the patches and looks really nice. Especially the
test coverage (including frontbuffer coherency checks for i915) is
awesome. Imo as soon as we have an ack from chromium upstream that they're
ok with this approach and your chromium patches, and after detailed code
review is done this can go in. An ack from Thomas Hellstrom on the
simplified coherency management interface would be good too.

Thanks, Daniel

> Lastly, the diff of v6 series is that I've basically addressed concerns
> pointed in the igt tests, organized those changes better a bit (in smaller
> patches), documented the usage of sync ioctls and I have extensively tested 
> this in different types of hardware.
> 
> https://github.com/tiagovignatti/drm-intel/commits/drm-intel-nightly_dma-buf-mmap-v6
> https://github.com/tiagovignatti/intel-gpu-tools/commits/dma-buf-mmap-v6
> 
> Tiago
> 
> 
> Daniel Thompson (1):
>   drm: prime: Honour O_RDWR during prime-handle-to-fd
> 
> Daniel Vetter (1):
>   dma-buf: Add ioctls to allow userspace to flush
> 
> Tiago Vignatti (3):
>   dma-buf: Remove range-based flush
>   drm/i915: Implement end_cpu_access
>   drm/i915: Use CPU mapping for userspace dma-buf mmap()
> 
>  Documentation/dma-buf-sharing.txt         | 41 +++++++++++++++-------
>  drivers/dma-buf/dma-buf.c                 | 56 ++++++++++++++++++++++++++-----
>  drivers/gpu/drm/drm_prime.c               | 10 ++----
>  drivers/gpu/drm/i915/i915_gem_dmabuf.c    | 42 +++++++++++++++++++++--
>  drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 +--
>  drivers/gpu/drm/udl/udl_fb.c              |  2 --
>  drivers/staging/android/ion/ion.c         |  6 ++--
>  drivers/staging/android/ion/ion_test.c    |  4 +--
>  include/linux/dma-buf.h                   | 12 +++----
>  include/uapi/drm/drm.h                    |  1 +
>  include/uapi/linux/dma-buf.h              | 38 +++++++++++++++++++++
>  11 files changed, 169 insertions(+), 47 deletions(-)
>  create mode 100644 include/uapi/linux/dma-buf.h
> 
> 
> And the igt changes:
> Rob Bradford (1):
>   prime_mmap: Add new test for calling mmap() on dma-buf fds
> 
> Tiago Vignatti (5):
>   lib: Add gem_userptr and __gem_userptr helpers
>   prime_mmap: Add basic tests to write in a bo using CPU
>   lib: Add prime_sync_start and prime_sync_end helpers
>   tests: Add kms_mmap_write_crc for cache coherency tests
>   tests: Add prime_mmap_coherency for cache coherency tests
> 
>  benchmarks/gem_userptr_benchmark.c |  55 +----
>  lib/ioctl_wrappers.c               |  92 +++++++
>  lib/ioctl_wrappers.h               |  32 +++
>  tests/Makefile.sources             |   3 +
>  tests/gem_userptr_blits.c          | 104 ++------
>  tests/kms_mmap_write_crc.c         | 281 +++++++++++++++++++++
>  tests/prime_mmap.c                 | 494 +++++++++++++++++++++++++++++++++++++
>  tests/prime_mmap_coherency.c       | 246 ++++++++++++++++++
>  8 files changed, 1180 insertions(+), 127 deletions(-)
>  create mode 100644 tests/kms_mmap_write_crc.c
>  create mode 100644 tests/prime_mmap.c
>  create mode 100644 tests/prime_mmap_coherency.c
> 
> -- 
> 2.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers
  2015-12-16 22:25 ` [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers Tiago Vignatti
@ 2015-12-17 10:18   ` Daniel Vetter
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2015-12-17 10:18 UTC (permalink / raw)
  To: Tiago Vignatti
  Cc: daniel.thompson, daniel.vetter, thellstrom, dri-devel, jglisse

On Wed, Dec 16, 2015 at 08:25:41PM -0200, Tiago Vignatti wrote:
> This patch adds dma-buf mmap synchronization ioctls that can be used by tests
> for cache coherency management e.g. when CPU and GPU domains are being accessed
> through dma-buf at the same time.
> 
> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
> ---
>  lib/ioctl_wrappers.c | 26 ++++++++++++++++++++++++++
>  lib/ioctl_wrappers.h | 15 +++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
> index 86a61ba..0d84d00 100644
> --- a/lib/ioctl_wrappers.c
> +++ b/lib/ioctl_wrappers.c
> @@ -1400,6 +1400,32 @@ off_t prime_get_size(int dma_buf_fd)
>  }
>  
>  /**
> + * prime_sync_start
> + * @dma_buf_fd: dma-buf fd handle
> + */
> +void prime_sync_start(int dma_buf_fd)
> +{
> +	struct local_dma_buf_sync sync_start;
> +
> +	memset(&sync_start, 0, sizeof(sync_start));
> +	sync_start.flags = LOCAL_DMA_BUF_SYNC_START | LOCAL_DMA_BUF_SYNC_RW;
> +	do_ioctl(dma_buf_fd, LOCAL_DMA_BUF_IOCTL_SYNC, &sync_start);
> +}
> +
> +/**
> + * prime_sync_end
> + * @dma_buf_fd: dma-buf fd handle
> + */
> +void prime_sync_end(int dma_buf_fd)
> +{
> +	struct local_dma_buf_sync sync_end;
> +
> +	memset(&sync_end, 0, sizeof(sync_end));
> +	sync_end.flags = LOCAL_DMA_BUF_SYNC_END | LOCAL_DMA_BUF_SYNC_RW;
> +	do_ioctl(dma_buf_fd, LOCAL_DMA_BUF_IOCTL_SYNC, &sync_end);
> +}
> +
> +/**
>   * igt_require_fb_modifiers:
>   * @fd: Open DRM file descriptor.
>   *
> diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
> index d3ffba2..cbd7a73 100644
> --- a/lib/ioctl_wrappers.h
> +++ b/lib/ioctl_wrappers.h
> @@ -148,6 +148,19 @@ void gem_require_caching(int fd);
>  void gem_require_ring(int fd, int ring_id);
>  
>  /* prime */
> +struct local_dma_buf_sync {
> +	uint64_t flags;
> +};
> +
> +#define LOCAL_DMA_BUF_SYNC_RW        (3 << 0)
> +#define LOCAL_DMA_BUF_SYNC_START     (0 << 2)
> +#define LOCAL_DMA_BUF_SYNC_END       (1 << 2)
> +#define LOCAL_DMA_BUF_SYNC_VALID_FLAGS_MASK \
> +		(LOCAL_DMA_BUF_SYNC_RW | LOCAL_DMA_BUF_SYNC_END)

What seems to be missing is negative ioctl tests, i.e. making sure invalid
flags and stuff get properly rejected.
-Daniel

> +
> +#define LOCAL_DMA_BUF_BASE 'b'
> +#define LOCAL_DMA_BUF_IOCTL_SYNC _IOW(LOCAL_DMA_BUF_BASE, 0, struct local_dma_buf_sync)
> +
>  int prime_handle_to_fd(int fd, uint32_t handle);
>  #ifndef DRM_RDWR
>  #define DRM_RDWR O_RDWR
> @@ -155,6 +168,8 @@ int prime_handle_to_fd(int fd, uint32_t handle);
>  int prime_handle_to_fd_for_mmap(int fd, uint32_t handle);
>  uint32_t prime_fd_to_handle(int fd, int dma_buf_fd);
>  off_t prime_get_size(int dma_buf_fd);
> +void prime_sync_start(int dma_buf_fd);
> +void prime_sync_end(int dma_buf_fd);
>  
>  /* addfb2 fb modifiers */
>  struct local_drm_mode_fb_cmd2 {
> -- 
> 2.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-16 22:25 ` [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush Tiago Vignatti
@ 2015-12-17 18:19   ` Alex Deucher
  2015-12-18 18:46     ` Tiago Vignatti
  2015-12-17 21:58   ` Thomas Hellstrom
  1 sibling, 1 reply; 25+ messages in thread
From: Alex Deucher @ 2015-12-17 18:19 UTC (permalink / raw)
  To: Tiago Vignatti
  Cc: Daniel Thompson, Daniel Vetter, Thomas Hellstrom,
	Maling list - DRI developers, Jerome Glisse, Daniel Vetter

On Wed, Dec 16, 2015 at 5:25 PM, Tiago Vignatti
<tiago.vignatti@intel.com> wrote:
> From: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> The userspace might need some sort of cache coherency management e.g. when CPU
> and GPU domains are being accessed through dma-buf at the same time. To
> circumvent this problem there are begin/end coherency markers, that forward
> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
> used like following:
>      - mmap dma-buf fd
>      - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>        to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>        want (with the new data being consumed by the GPU or say scanout device)
>      - munmap once you don't need the buffer any more
>
> v2 (Tiago): Fix header file type names (u64 -> __u64)
> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
> dma-buf functions. Check for overflows in start/length.
> v4 (Tiago): use 2d regions for sync.
> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
> remove range information from struct dma_buf_sync.
> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
> documentation about the recommendation on using sync ioctls.
>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
> ---
>  Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>  drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>  3 files changed, 102 insertions(+), 1 deletion(-)
>  create mode 100644 include/uapi/linux/dma-buf.h
>
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> index 4f4a84b..2ddd4b2 100644
> --- a/Documentation/dma-buf-sharing.txt
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>     handles, too). So it's beneficial to support this in a similar fashion on
>     dma-buf to have a good transition path for existing Android userspace.
>
> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
> +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
> +   important to note though is that, even if it is not mandatory, the userspace
> +   is strongly recommended to always use the cache synchronization ioctl
> +   (DMA_BUF_IOCTL_SYNC) discussed next.
> +
> +   Some systems might need some sort of cache coherency management e.g. when
> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
> +   circumvent this problem there are begin/end coherency markers, that forward
> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
> +   would be used like following:
> +     - mmap dma-buf fd
> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> +       want (with the new data being consumed by the GPU or say scanout device)
> +     - munmap once you don't need the buffer any more
> +
> +    In principle systems with the memory cache shared by the GPU and CPU may
> +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
> +    to use these ioctls before and after, respectively, when accessing the
> +    mapped address.
>
>  2. Supporting existing mmap interfaces in importers
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index b2ac13b..9a298bd 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -34,6 +34,8 @@
>  #include <linux/poll.h>
>  #include <linux/reservation.h>
>
> +#include <uapi/linux/dma-buf.h>
> +
>  static inline int is_dma_buf_file(struct file *);
>
>  struct dma_buf_list {
> @@ -251,11 +253,52 @@ out:
>         return events;
>  }
>
> +static long dma_buf_ioctl(struct file *file,
> +                         unsigned int cmd, unsigned long arg)
> +{
> +       struct dma_buf *dmabuf;
> +       struct dma_buf_sync sync;
> +       enum dma_data_direction direction;
> +
> +       dmabuf = file->private_data;
> +
> +       if (!is_dma_buf_file(file))
> +               return -EINVAL;
> +
> +       switch (cmd) {
> +       case DMA_BUF_IOCTL_SYNC:
> +               if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
> +                       return -EFAULT;
> +
> +               if (sync.flags & DMA_BUF_SYNC_RW)
> +                       direction = DMA_BIDIRECTIONAL;
> +               else if (sync.flags & DMA_BUF_SYNC_READ)
> +                       direction = DMA_FROM_DEVICE;
> +               else if (sync.flags & DMA_BUF_SYNC_WRITE)
> +                       direction = DMA_TO_DEVICE;
> +               else
> +                       return -EINVAL;
> +
> +               if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
> +                       return -EINVAL;
> +
> +               if (sync.flags & DMA_BUF_SYNC_END)
> +                       dma_buf_end_cpu_access(dmabuf, direction);
> +               else
> +                       dma_buf_begin_cpu_access(dmabuf, direction);
> +
> +               return 0;
> +       default:
> +               return -ENOTTY;
> +       }
> +}
> +
>  static const struct file_operations dma_buf_fops = {
>         .release        = dma_buf_release,
>         .mmap           = dma_buf_mmap_internal,
>         .llseek         = dma_buf_llseek,
>         .poll           = dma_buf_poll,
> +       .unlocked_ioctl = dma_buf_ioctl,
>  };
>
>  /*
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> new file mode 100644
> index 0000000..bd195f2
> --- /dev/null
> +++ b/include/uapi/linux/dma-buf.h
> @@ -0,0 +1,38 @@
> +/*
> + * Framework for buffer objects that can be shared across devices/subsystems.
> + *
> + * Copyright(C) 2015 Intel Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _DMA_BUF_UAPI_H_
> +#define _DMA_BUF_UAPI_H_
> +
> +/* begin/end dma-buf functions used for userspace mmap. */
> +struct dma_buf_sync {
> +       __u64 flags;
> +};
> +
> +#define DMA_BUF_SYNC_READ      (1 << 0)
> +#define DMA_BUF_SYNC_WRITE     (2 << 0)
> +#define DMA_BUF_SYNC_RW        (3 << 0)

Maybe make this:

#define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)

or maybe:

#define DMA_BUF_SYNC_RW_MASK        (3 << 0)

> +#define DMA_BUF_SYNC_START     (0 << 2)
> +#define DMA_BUF_SYNC_END       (1 << 2)

if you go with the mask above, maybe:
#define DMA_BUF_SYNC_MASK     (1 << 2)

> +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
> +       (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)
> +
> +#define DMA_BUF_BASE           'b'
> +#define DMA_BUF_IOCTL_SYNC     _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
> +
> +#endif
> --
> 2.1.4
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
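
To make the sequence described in this patch concrete, here is a minimal
userspace sketch of one CPU write cycle against the uapi introduced above.
The helper name cpu_upload is hypothetical, the fd is assumed to have been
obtained elsewhere (e.g. via PRIME handle-to-fd with O_RDWR, patch 1/5),
and error handling is trimmed:

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/* One CPU upload cycle on an mmap'ed dma-buf, bracketed by the sync
 * ioctls as the commit message describes. */
static int cpu_upload(int dmabuf_fd, const void *src, size_t len)
{
	struct dma_buf_sync sync = { 0 };
	void *ptr;

	ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		   dmabuf_fd, 0);
	if (ptr == MAP_FAILED)
		return -1;

	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		goto err;

	memcpy(ptr, src, len);		/* CPU access happens here */

	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		goto err;

	munmap(ptr, len);
	return 0;
err:
	munmap(ptr, len);
	return -1;
}

(Note that DMA_BUF_SYNC_START is 0, so OR-ing it in is for readability
only; the END bit is what selects end_cpu_access over begin_cpu_access
in the ioctl dispatch above.)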

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-16 22:25 ` [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush Tiago Vignatti
  2015-12-17 18:19   ` Alex Deucher
@ 2015-12-17 21:58   ` Thomas Hellstrom
  2015-12-18 15:29     ` Daniel Vetter
  2015-12-18 19:50     ` Tiago Vignatti
  1 sibling, 2 replies; 25+ messages in thread
From: Thomas Hellstrom @ 2015-12-17 21:58 UTC (permalink / raw)
  To: Tiago Vignatti, dri-devel
  Cc: jglisse, daniel.vetter, daniel.thompson, Daniel Vetter

On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
> From: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> The userspace might need some sort of cache coherency management e.g. when CPU
> and GPU domains are being accessed through dma-buf at the same time. To
> circumvent this problem there are begin/end coherency markers, that forward
> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
> used like following:
>      - mmap dma-buf fd
>      - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>        to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>        want (with the new data being consumed by the GPU or say scanout device)
>      - munmap once you don't need the buffer any more
>
> v2 (Tiago): Fix header file type names (u64 -> __u64)
> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
> dma-buf functions. Check for overflows in start/length.
> v4 (Tiago): use 2d regions for sync.
> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
> remove range information from struct dma_buf_sync.
> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
> documentation about the recommendation on using sync ioctls.
>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
> ---
>  Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>  drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>  3 files changed, 102 insertions(+), 1 deletion(-)
>  create mode 100644 include/uapi/linux/dma-buf.h
>
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> index 4f4a84b..2ddd4b2 100644
> --- a/Documentation/dma-buf-sharing.txt
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>     handles, too). So it's beneficial to support this in a similar fashion on
>     dma-buf to have a good transition path for existing Android userspace.
>  
> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
> +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
> +   important to note though is that, even if it is not mandatory, the userspace
> +   is strongly recommended to always use the cache synchronization ioctl
> +   (DMA_BUF_IOCTL_SYNC) discussed next.
> +
> +   Some systems might need some sort of cache coherency management e.g. when
> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
> +   circumvent this problem there are begin/end coherency markers, that forward
> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
> +   would be used like following:
> +     - mmap dma-buf fd
> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> +       want (with the new data being consumed by the GPU or say scanout device)
> +     - munmap once you don't need the buffer any more
> +
> +    In principle systems with the memory cache shared by the GPU and CPU may
> +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
> +    to use these ioctls before and after, respectively, when accessing the
> +    mapped address.
>  

I think the wording here is far too weak. If this is a generic
user-space interface and syncing is required for
a) Correctness: then syncing must be mandatory.
b) Optimal performance: then an implementation must generate expected
results also in the absence of SYNC ioctls, but is allowed to rely on
correct pairing of SYNC_START and SYNC_END to render correctly.


Also recalling the discussion of multidimensional sync, which we said we
would add when it was needed, my worst nightmare is when (not if) there
are a number of important applications starting to abuse this interface
for small writes or reads to large DMA buffers. It will work flawlessly
on coherent architectures, and probably some incoherent architectures as
well, but will probably be mostly useless on VMware. What is the plan
for adding 2D sync to make it work? How do we persuade app developers to
think of damage regions, and can we count on support from the DRM
community when this happens?

/Thomas

>  2. Supporting existing mmap interfaces in importers
>  
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index b2ac13b..9a298bd 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -34,6 +34,8 @@
>  #include <linux/poll.h>
>  #include <linux/reservation.h>
>  
> +#include <uapi/linux/dma-buf.h>
> +
>  static inline int is_dma_buf_file(struct file *);
>  
>  struct dma_buf_list {
> @@ -251,11 +253,52 @@ out:
>  	return events;
>  }
>  
> +static long dma_buf_ioctl(struct file *file,
> +			  unsigned int cmd, unsigned long arg)
> +{
> +	struct dma_buf *dmabuf;
> +	struct dma_buf_sync sync;
> +	enum dma_data_direction direction;
> +
> +	dmabuf = file->private_data;
> +
> +	if (!is_dma_buf_file(file))
> +		return -EINVAL;
> +
> +	switch (cmd) {
> +	case DMA_BUF_IOCTL_SYNC:
> +		if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
> +			return -EFAULT;
> +
> +		if (sync.flags & DMA_BUF_SYNC_RW)
> +			direction = DMA_BIDIRECTIONAL;
> +		else if (sync.flags & DMA_BUF_SYNC_READ)
> +			direction = DMA_FROM_DEVICE;
> +		else if (sync.flags & DMA_BUF_SYNC_WRITE)
> +			direction = DMA_TO_DEVICE;
> +		else
> +			return -EINVAL;
> +
> +		if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
> +			return -EINVAL;
> +
> +		if (sync.flags & DMA_BUF_SYNC_END)
> +			dma_buf_end_cpu_access(dmabuf, direction);
> +		else
> +			dma_buf_begin_cpu_access(dmabuf, direction);
> +
> +		return 0;
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
>  static const struct file_operations dma_buf_fops = {
>  	.release	= dma_buf_release,
>  	.mmap		= dma_buf_mmap_internal,
>  	.llseek		= dma_buf_llseek,
>  	.poll		= dma_buf_poll,
> +	.unlocked_ioctl	= dma_buf_ioctl,
>  };
>  
>  /*
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> new file mode 100644
> index 0000000..bd195f2
> --- /dev/null
> +++ b/include/uapi/linux/dma-buf.h
> @@ -0,0 +1,38 @@
> +/*
> + * Framework for buffer objects that can be shared across devices/subsystems.
> + *
> + * Copyright(C) 2015 Intel Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _DMA_BUF_UAPI_H_
> +#define _DMA_BUF_UAPI_H_
> +
> +/* begin/end dma-buf functions used for userspace mmap. */
> +struct dma_buf_sync {
> +	__u64 flags;
> +};
> +
> +#define DMA_BUF_SYNC_READ      (1 << 0)
> +#define DMA_BUF_SYNC_WRITE     (2 << 0)
> +#define DMA_BUF_SYNC_RW        (3 << 0)
> +#define DMA_BUF_SYNC_START     (0 << 2)
> +#define DMA_BUF_SYNC_END       (1 << 2)
> +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
> +	(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)
> +
> +#define DMA_BUF_BASE		'b'
> +#define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
> +
> +#endif


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-17 21:58   ` Thomas Hellstrom
@ 2015-12-18 15:29     ` Daniel Vetter
  2015-12-18 19:50     ` Tiago Vignatti
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2015-12-18 15:29 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: daniel.thompson, dri-devel, jglisse, daniel.vetter, Daniel Vetter

On Thu, Dec 17, 2015 at 10:58:20PM +0100, Thomas Hellstrom wrote:
> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
> > From: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > The userspace might need some sort of cache coherency management e.g. when CPU
> > and GPU domains are being accessed through dma-buf at the same time. To
> > circumvent this problem there are begin/end coherency markers, that forward
> > directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
> > of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
> > used like following:
> >      - mmap dma-buf fd
> >      - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> >        to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> >        want (with the new data being consumed by the GPU or say scanout device)
> >      - munmap once you don't need the buffer any more
> >
> > v2 (Tiago): Fix header file type names (u64 -> __u64)
> > v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
> > dma-buf functions. Check for overflows in start/length.
> > v4 (Tiago): use 2d regions for sync.
> > v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
> > remove range information from struct dma_buf_sync.
> > v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
> > documentation about the recommendation on using sync ioctls.
> >
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
> > ---
> >  Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
> >  drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
> >  3 files changed, 102 insertions(+), 1 deletion(-)
> >  create mode 100644 include/uapi/linux/dma-buf.h
> >
> > diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> > index 4f4a84b..2ddd4b2 100644
> > --- a/Documentation/dma-buf-sharing.txt
> > +++ b/Documentation/dma-buf-sharing.txt
> > @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
> >     handles, too). So it's beneficial to support this in a similar fashion on
> >     dma-buf to have a good transition path for existing Android userspace.
> >  
> > -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
> > +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
> > +   important to note though is that, even if it is not mandatory, the userspace
> > +   is strongly recommended to always use the cache synchronization ioctl
> > +   (DMA_BUF_IOCTL_SYNC) discussed next.
> > +
> > +   Some systems might need some sort of cache coherency management e.g. when
> > +   CPU and GPU domains are being accessed through dma-buf at the same time. To
> > +   circumvent this problem there are begin/end coherency markers, that forward
> > +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
> > +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
> > +   would be used like following:
> > +     - mmap dma-buf fd
> > +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> > +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> > +       want (with the new data being consumed by the GPU or say scanout device)
> > +     - munmap once you don't need the buffer any more
> > +
> > +    In principle systems with the memory cache shared by the GPU and CPU may
> > +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
> > +    to use these ioctls before and after, respectively, when accessing the
> > +    mapped address.
> >  
> 
> I think the wording here is far too weak. If this is a generic
> user-space interface and syncing is required for
> a) Correctness: then syncing must be mandatory.
> b) Optimal performance: then an implementation must generate expected
> results also in the absence of SYNC ioctls, but is allowed to rely on
> correct pairing of SYNC_START and SYNC_END to render correctly.

Yeah, I guess the wording should be stronger to make it clear you'd better
call these, always.

> Also recalling the discussion of multidimensional sync, which we said we
> would add when it was needed, my worst nightmare is when (not if) there
> are a number of important applications starting to abuse this interface
> for small writes or reads to large DMA buffers. It will work flawlessly
> on coherent architectures, and probably some incoherent architectures as
> well, but will probably be mostly useless on VMware. What is the plan
> for adding 2D sync to make it work? How do we persuade app developers to
> think of damage regions, and can we count on support from the DRM
> community when this happens?

The current use-case in cros seems to be to do full-blown buffer uploads,
nothing partial. I don't think there's anything we can do really (flushing
for small uploads will make i915 crawl too, albeit you can still see a
slideshow and it's not a complete stop), except maybe improve the docs
with a statement that trying to use dma-buf mmap for small uploads will
result in horrible performance. I think the inevitable reality is that some
truly horrible software will always ship. I am pretty hopeful though that
this won't be too common, since the main users for anything dma-buf are on
mobile, and those systems (at least on i915) are all incoherent. So as
long as it'll run acceptably on i915 I think it should run at least ok-ish
on vmwgfx too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-17 18:19   ` Alex Deucher
@ 2015-12-18 18:46     ` Tiago Vignatti
  0 siblings, 0 replies; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-18 18:46 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Daniel Thompson, Daniel Vetter, Thomas Hellstrom,
	Mailing list - DRI developers, Jerome Glisse, Daniel Vetter

On 12/17/2015 04:19 PM, Alex Deucher wrote:
> On Wed, Dec 16, 2015 at 5:25 PM, Tiago Vignatti
> <tiago.vignatti@intel.com> wrote:
>> From: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> The userspace might need some sort of cache coherency management e.g. when CPU
>> and GPU domains are being accessed through dma-buf at the same time. To
>> circumvent this problem there are begin/end coherency markers, that forward
>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
>> used like following:
>>       - mmap dma-buf fd
>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>         to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>         want (with the new data being consumed by the GPU or say scanout device)
>>       - munmap once you don't need the buffer any more
>>
>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>> dma-buf functions. Check for overflows in start/length.
>> v4 (Tiago): use 2d regions for sync.
>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>> remove range information from struct dma_buf_sync.
>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>> documentation about the recommendation on using sync ioctls.
>>
>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
>> ---
>>   Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>   drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>>   3 files changed, 102 insertions(+), 1 deletion(-)
>>   create mode 100644 include/uapi/linux/dma-buf.h
>>
>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>> index 4f4a84b..2ddd4b2 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>      handles, too). So it's beneficial to support this in a similar fashion on
>>      dma-buf to have a good transition path for existing Android userspace.
>>
>> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
>> +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
>> +   important to note though is that, even if it is not mandatory, the userspace
>> +   is strongly recommended to always use the cache synchronization ioctl
>> +   (DMA_BUF_IOCTL_SYNC) discussed next.
>> +
>> +   Some systems might need some sort of cache coherency management e.g. when
>> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
>> +   circumvent this problem there are begin/end coherency markers, that forward
>> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>> +   would be used like following:
>> +     - mmap dma-buf fd
>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>> +       want (with the new data being consumed by the GPU or say scanout device)
>> +     - munmap once you don't need the buffer any more
>> +
>> +    In principle systems with the memory cache shared by the GPU and CPU may
>> +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
>> +    to use these ioctls before and after, respectively, when accessing the
>> +    mapped address.
>>
>>   2. Supporting existing mmap interfaces in importers
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index b2ac13b..9a298bd 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -34,6 +34,8 @@
>>   #include <linux/poll.h>
>>   #include <linux/reservation.h>
>>
>> +#include <uapi/linux/dma-buf.h>
>> +
>>   static inline int is_dma_buf_file(struct file *);
>>
>>   struct dma_buf_list {
>> @@ -251,11 +253,52 @@ out:
>>          return events;
>>   }
>>
>> +static long dma_buf_ioctl(struct file *file,
>> +                         unsigned int cmd, unsigned long arg)
>> +{
>> +       struct dma_buf *dmabuf;
>> +       struct dma_buf_sync sync;
>> +       enum dma_data_direction direction;
>> +
>> +       dmabuf = file->private_data;
>> +
>> +       if (!is_dma_buf_file(file))
>> +               return -EINVAL;
>> +
>> +       switch (cmd) {
>> +       case DMA_BUF_IOCTL_SYNC:
>> +               if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
>> +                       return -EFAULT;
>> +
>> +               if (sync.flags & DMA_BUF_SYNC_RW)
>> +                       direction = DMA_BIDIRECTIONAL;
>> +               else if (sync.flags & DMA_BUF_SYNC_READ)
>> +                       direction = DMA_FROM_DEVICE;
>> +               else if (sync.flags & DMA_BUF_SYNC_WRITE)
>> +                       direction = DMA_TO_DEVICE;
>> +               else
>> +                       return -EINVAL;
>> +
>> +               if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
>> +                       return -EINVAL;
>> +
>> +               if (sync.flags & DMA_BUF_SYNC_END)
>> +                       dma_buf_end_cpu_access(dmabuf, direction);
>> +               else
>> +                       dma_buf_begin_cpu_access(dmabuf, direction);
>> +
>> +               return 0;
>> +       default:
>> +               return -ENOTTY;
>> +       }
>> +}
>> +
>>   static const struct file_operations dma_buf_fops = {
>>          .release        = dma_buf_release,
>>          .mmap           = dma_buf_mmap_internal,
>>          .llseek         = dma_buf_llseek,
>>          .poll           = dma_buf_poll,
>> +       .unlocked_ioctl = dma_buf_ioctl,
>>   };
>>
>>   /*
>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
>> new file mode 100644
>> index 0000000..bd195f2
>> --- /dev/null
>> +++ b/include/uapi/linux/dma-buf.h
>> @@ -0,0 +1,38 @@
>> +/*
>> + * Framework for buffer objects that can be shared across devices/subsystems.
>> + *
>> + * Copyright(C) 2015 Intel Ltd
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published by
>> + * the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef _DMA_BUF_UAPI_H_
>> +#define _DMA_BUF_UAPI_H_
>> +
>> +/* begin/end dma-buf functions used for userspace mmap. */
>> +struct dma_buf_sync {
>> +       __u64 flags;
>> +};
>> +
>> +#define DMA_BUF_SYNC_READ      (1 << 0)
>> +#define DMA_BUF_SYNC_WRITE     (2 << 0)
>> +#define DMA_BUF_SYNC_RW        (3 << 0)
>
> Maybe make this:
>
> #define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
>
> or maybe:
>
> #define DMA_BUF_SYNC_RW_MASK        (3 << 0)

Thanks Alex. I think I like your first option, since it's a bit more
self-explanatory than the other -- I'm changing to it now.

Tiago
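
For reference, a sketch of how the flag block would presumably read with
that first option folded in (the START/END bits and the valid-flags mask
unchanged):

#define DMA_BUF_SYNC_READ      (1 << 0)
#define DMA_BUF_SYNC_WRITE     (2 << 0)
#define DMA_BUF_SYNC_RW        (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
#define DMA_BUF_SYNC_START     (0 << 2)
#define DMA_BUF_SYNC_END       (1 << 2)
#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
	(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)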

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 4/5] drm/i915: Implement end_cpu_access
  2015-12-17  8:01   ` Chris Wilson
@ 2015-12-18 19:02     ` Tiago Vignatti
  2015-12-18 19:19       ` Tiago Vignatti
  0 siblings, 1 reply; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-18 19:02 UTC (permalink / raw)
  To: Chris Wilson, dri-devel, daniel.thompson, daniel.vetter,
	thellstrom, jglisse

On 12/17/2015 06:01 AM, Chris Wilson wrote:
> On Wed, Dec 16, 2015 at 08:25:36PM -0200, Tiago Vignatti wrote:
>> This function is meant to be used with dma-buf mmap, when finishing the CPU
>> access of the mapped pointer.
>>
>> +static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
>> +{
>> +	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>> +	struct drm_device *dev = obj->base.dev;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE);
>> +	int ret;
>> +
>> +	mutex_lock(&dev->struct_mutex);
>> +	was_interruptible = dev_priv->mm.interruptible;
>> +	dev_priv->mm.interruptible = false;
>> +
>> +	ret = i915_gem_object_set_to_gtt_domain(obj, write);
>
> This only needs to pass .write=false. The dma-buf direction is
> only for the period of the user access, and we are now flushing the
> caches. This is equivalent to the sw-finish ioctl and ideally we just
> want the i915_gem_object_flush_cpu_write_domain().

in fact the only usage I've found so far for end_cpu_access is when the
pinned buffer is scanned out. Should I pretty much copy sw-finish in
end_cpu_access then?

Thanks,

Tiago

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 4/5] drm/i915: Implement end_cpu_access
  2015-12-18 19:02     ` Tiago Vignatti
@ 2015-12-18 19:19       ` Tiago Vignatti
  2015-12-22 20:51         ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-18 19:19 UTC (permalink / raw)
  To: dri-devel

On 12/18/2015 05:02 PM, Tiago Vignatti wrote:
> On 12/17/2015 06:01 AM, Chris Wilson wrote:
>> On Wed, Dec 16, 2015 at 08:25:36PM -0200, Tiago Vignatti wrote:
>>> This function is meant to be used with dma-buf mmap, when finishing
>>> the CPU
>>> access of the mapped pointer.
>>>
>>> +static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum
>>> dma_data_direction direction)
>>> +{
>>> +    struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>>> +    struct drm_device *dev = obj->base.dev;
>>> +    struct drm_i915_private *dev_priv = to_i915(dev);
>>> +    bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL
>>> || direction == DMA_TO_DEVICE);
>>> +    int ret;
>>> +
>>> +    mutex_lock(&dev->struct_mutex);
>>> +    was_interruptible = dev_priv->mm.interruptible;
>>> +    dev_priv->mm.interruptible = false;
>>> +
>>> +    ret = i915_gem_object_set_to_gtt_domain(obj, write);
>>
>> This only needs to pass .write=false. The dma-buf direction is
>> only for the period of the user access, and we are now flushing the
>> caches. This is equivalent to the sw-finish ioctl and ideally we just
>> want the i915_gem_object_flush_cpu_write_domain().
>
> in fact the only usage I've found so far for end_cpu_access is when the
> pinned buffer is scanned out. Should I pretty much copy sw-finish in
> end_cpu_access then?

And do you think it's okay to declare
i915_gem_object_flush_cpu_write_domain outside its current file-only scope?

Tiago
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-17 21:58   ` Thomas Hellstrom
  2015-12-18 15:29     ` Daniel Vetter
@ 2015-12-18 19:50     ` Tiago Vignatti
  2015-12-21  9:38       ` Thomas Hellstrom
  1 sibling, 1 reply; 25+ messages in thread
From: Tiago Vignatti @ 2015-12-18 19:50 UTC (permalink / raw)
  To: Thomas Hellstrom, dri-devel
  Cc: jglisse, daniel.vetter, daniel.thompson, Daniel Vetter

On 12/17/2015 07:58 PM, Thomas Hellstrom wrote:
> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
>> From: Daniel Vetter <daniel.vetter@ffwll.ch>
>>
>> The userspace might need some sort of cache coherency management e.g. when CPU
>> and GPU domains are being accessed through dma-buf at the same time. To
>> circumvent this problem there are begin/end coherency markers, that forward
>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
>> used like following:
>>       - mmap dma-buf fd
>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>>         to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>>         want (with the new data being consumed by the GPU or say scanout device)
>>       - munmap once you don't need the buffer any more
>>
>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
>> dma-buf functions. Check for overflows in start/length.
>> v4 (Tiago): use 2d regions for sync.
>> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
>> remove range information from struct dma_buf_sync.
>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>> documentation about the recommendation on using sync ioctls.
>>
>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
>> ---
>>   Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>   drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
>>   include/uapi/linux/dma-buf.h      | 38 ++++++++++++++++++++++++++++++++++
>>   3 files changed, 102 insertions(+), 1 deletion(-)
>>   create mode 100644 include/uapi/linux/dma-buf.h
>>
>> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
>> index 4f4a84b..2ddd4b2 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
>>      handles, too). So it's beneficial to support this in a similar fashion on
>>      dma-buf to have a good transition path for existing Android userspace.
>>
>> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
>> +   No special interfaces, userspace simply calls mmap on the dma-buf fd. Very
>> +   important to note though is that, even if it is not mandatory, the userspace
>> +   is strongly recommended to always use the cache synchronization ioctl
>> +   (DMA_BUF_IOCTL_SYNC) discussed next.
>> +
>> +   Some systems might need some sort of cache coherency management e.g. when
>> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
>> +   circumvent this problem there are begin/end coherency markers, that forward
>> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>> +   would be used like following:
>> +     - mmap dma-buf fd
>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
>> +       want (with the new data being consumed by the GPU or say scanout device)
>> +     - munmap once you don't need the buffer any more
>> +
>> +    In principle systems with the memory cache shared by the GPU and CPU may
>> +    not need SYNC_START and SYNC_END but still, userspace is always encouraged
>> +    to use these ioctls before and after, respectively, when accessing the
>> +    mapped address.
>>
>
> I think the wording here is far too weak. If this is a generic
> user-space interface and syncing is required for
> a) Correctness: then syncing must be mandatory.
> b) Optimal performance: then an implementation must generate expected
> results also in the absence of SYNC ioctls, but is allowed to rely on
> correct pairing of SYNC_START and SYNC_END to render correctly.

Thomas, do you think the following write-up captures this?


-   No special interfaces, userspace simply calls mmap on the dma-buf fd.
+   No special interfaces, userspace simply calls mmap on the dma-buf fd,
+   making sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is
+   *always* used when the access happens. This is discussed in the next
+   paragraphs.
+
+   Some systems might need some sort of cache coherency management e.g. when
+   CPU and GPU domains are being accessed through dma-buf at the same time. To
+   circumvent this problem there are begin/end coherency markers, that forward
+   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
+   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
+   would be used like following:
+     - mmap dma-buf fd
+     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
+       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
+       want (with the new data being consumed by the GPU or say scanout device)
+     - munmap once you don't need the buffer any more
+
+    Therefore, for correctness and optimal performance, systems with the memory
+    cache shared by the GPU and CPU i.e. the "coherent" and also "incoherent"
+    systems are always required to use SYNC_START and SYNC_END before and
+    after, respectively, when accessing the mapped address.


Thank you,

Tiago
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush
  2015-12-18 19:50     ` Tiago Vignatti
@ 2015-12-21  9:38       ` Thomas Hellstrom
  0 siblings, 0 replies; 25+ messages in thread
From: Thomas Hellstrom @ 2015-12-21  9:38 UTC (permalink / raw)
  To: Tiago Vignatti, dri-devel
  Cc: jglisse, daniel.vetter, daniel.thompson, Daniel Vetter

On 12/18/2015 08:50 PM, Tiago Vignatti wrote:
> On 12/17/2015 07:58 PM, Thomas Hellstrom wrote:
>> On 12/16/2015 11:25 PM, Tiago Vignatti wrote:
>>> From: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>
>>> The userspace might need some sort of cache coherency management
>>> e.g. when CPU
>>> and GPU domains are being accessed through dma-buf at the same time. To
>>> circumvent this problem there are begin/end coherency markers, that
>>> forward
>>> directly to existing dma-buf device drivers vfunc hooks. Userspace
>>> can make use
>>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
>>> would be
>>> used like following:
>>>       - mmap dma-buf fd
>>>       - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2.
>>> read/write
>>>         to mmap area 3. SYNC_END ioctl. This can be repeated as
>>> often as you
>>>         want (with the new data being consumed by the GPU or say
>>> scanout device)
>>>       - munmap once you don't need the buffer any more
>>>
>>> v2 (Tiago): Fix header file type names (u64 -> __u64)
>>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the
>>> begin/end
>>> dma-buf functions. Check for overflows in start/length.
>>> v4 (Tiago): use 2d regions for sync.
>>> v5 (Tiago): forget about 2d regions (v4); use _IOW in
>>> DMA_BUF_IOCTL_SYNC and
>>> remove range information from struct dma_buf_sync.
>>> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust
>>> documentation about the recommendation on using sync ioctls.
>>>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Signed-off-by: Tiago Vignatti <tiago.vignatti@intel.com>
>>> ---
>>>   Documentation/dma-buf-sharing.txt | 22 +++++++++++++++++++-
>>>   drivers/dma-buf/dma-buf.c         | 43
>>> +++++++++++++++++++++++++++++++++++++++
>>>   include/uapi/linux/dma-buf.h      | 38
>>> ++++++++++++++++++++++++++++++++++
>>>   3 files changed, 102 insertions(+), 1 deletion(-)
>>>   create mode 100644 include/uapi/linux/dma-buf.h
>>>
>>> diff --git a/Documentation/dma-buf-sharing.txt
>>> b/Documentation/dma-buf-sharing.txt
>>> index 4f4a84b..2ddd4b2 100644
>>> --- a/Documentation/dma-buf-sharing.txt
>>> +++ b/Documentation/dma-buf-sharing.txt
>>> @@ -350,7 +350,27 @@ Being able to mmap an export dma-buf buffer
>>> object has 2 main use-cases:
>>>      handles, too). So it's beneficial to support this in a similar
>>> fashion on
>>>      dma-buf to have a good transition path for existing Android
>>> userspace.
>>>
>>> -   No special interfaces, userspace simply calls mmap on the
>>> dma-buf fd.
>>> +   No special interfaces, userspace simply calls mmap on the
>>> dma-buf fd. Very
>>> +   important to note though is that, even if it is not mandatory,
>>> the userspace
>>> +   is strongly recommended to always use the cache synchronization
>>> ioctl
>>> +   (DMA_BUF_IOCTL_SYNC) discussed next.
>>> +
>>> +   Some systems might need some sort of cache coherency management
>>> e.g. when
>>> +   CPU and GPU domains are being accessed through dma-buf at the
>>> same time. To
>>> +   circumvent this problem there are begin/end coherency markers,
>>> that forward
>>> +   directly to existing dma-buf device drivers vfunc hooks.
>>> Userspace can make
>>> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The
>>> sequence
>>> +   would be used like following:
>>> +     - mmap dma-buf fd
>>> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2.
>>> read/write
>>> +       to mmap area 3. SYNC_END ioctl. This can be repeated as
>>> often as you
>>> +       want (with the new data being consumed by the GPU or say
>>> scanout device)
>>> +     - munmap once you don't need the buffer any more
>>> +
>>> +    In principle systems with the memory cache shared by the GPU
>>> and CPU may
>>> +    not need SYNC_START and SYNC_END but still, userspace is always
>>> encouraged
>>> +    to use these ioctls before and after, respectively, when
>>> accessing the
>>> +    mapped address.
>>>
>>
>> I think the wording here is far too weak. If this is a generic
>> user-space interface and syncing is required for
>> a) Correctness: then syncing must be mandatory.
>> b) Optimal performance: then an implementation must generate expected
>> results also in the absence of SYNC ioctls, but is allowed to rely on
>> correct pairing of SYNC_START and SYNC_END to render correctly.
>
> Thomas, do you think the following write-up captures this?
>
>
> -   No special interfaces, userspace simply calls mmap on the dma-buf fd.
> +   No special interfaces, userspace simply calls mmap on the dma-buf fd,
> +   making sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is
> +   *always* used when the access happens. This is discussed in the next
> +   paragraphs.
> +
> +   Some systems might need some sort of cache coherency management e.g. when
> +   CPU and GPU domains are being accessed through dma-buf at the same time. To
> +   circumvent this problem there are begin/end coherency markers, that forward
> +   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
> +   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
> +   would be used like following:
> +     - mmap dma-buf fd
> +     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
> +       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
> +       want (with the new data being consumed by the GPU or say scanout device)
> +     - munmap once you don't need the buffer any more
> +
> +    Therefore, for correctness and optimal performance, systems with the memory
> +    cache shared by the GPU and CPU i.e. the "coherent" and also "incoherent"
> +    systems are always required to use SYNC_START and SYNC_END before and
> +    after, respectively, when accessing the mapped address.
>
>
> Thank you,
>
> Tiago
Yes, that sounds better,

Thanks,
Thomas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 4/5] drm/i915: Implement end_cpu_access
  2015-12-18 19:19       ` Tiago Vignatti
@ 2015-12-22 20:51         ` Chris Wilson
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2015-12-22 20:51 UTC (permalink / raw)
  To: Tiago Vignatti; +Cc: dri-devel

On Fri, Dec 18, 2015 at 05:19:24PM -0200, Tiago Vignatti wrote:
> On 12/18/2015 05:02 PM, Tiago Vignatti wrote:
> >On 12/17/2015 06:01 AM, Chris Wilson wrote:
> >>On Wed, Dec 16, 2015 at 08:25:36PM -0200, Tiago Vignatti wrote:
> >>>This function is meant to be used with dma-buf mmap, when finishing
> >>>the CPU
> >>>access of the mapped pointer.
> >>>
> >>>+static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum
> >>>dma_data_direction direction)
> >>>+{
> >>>+    struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
> >>>+    struct drm_device *dev = obj->base.dev;
> >>>+    struct drm_i915_private *dev_priv = to_i915(dev);
> >>>+    bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL
> >>>|| direction == DMA_TO_DEVICE);
> >>>+    int ret;
> >>>+
> >>>+    mutex_lock(&dev->struct_mutex);
> >>>+    was_interruptible = dev_priv->mm.interruptible;
> >>>+    dev_priv->mm.interruptible = false;
> >>>+
> >>>+    ret = i915_gem_object_set_to_gtt_domain(obj, write);
> >>
> >>This only needs to pass .write=false. The dma-buf direction is
> >>only for the period of the user access, and we are now flushing the
> >>caches. This is equivalent to the sw-finish ioctl and ideally we just
> >>want the i915_gem_object_flush_cpu_write_domain().
> >
> >in fact the only usage I've found so far for end_cpu_access is when the
> >pinned buffer is scanned out. Should I pretty much copy sw-finish in
> >end_cpu_access then?
> 
> And do you think it's okay to declare
> i915_gem_object_flush_cpu_write_domain outside its current file-only
> scope?

Whilst the simplicity of just doing the flush appeals, calling
set_gtt_domain(write=false) isn't that much heavier (the difference will
be lost in the noise of any clflushing) and is always going to be correct,
whereas just flushing the CPU domain may be a hassle for us in future.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
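
Putting this review together with the v6 hunk quoted above, a rough sketch
of where end_cpu_access seems headed (an illustration under the assumptions
discussed in this thread, not the committed code):

static void i915_gem_end_cpu_access(struct dma_buf *dma_buf,
				    enum dma_data_direction direction)
{
	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
	struct drm_device *dev = obj->base.dev;
	struct drm_i915_private *dev_priv = to_i915(dev);
	bool was_interruptible;
	int ret;

	mutex_lock(&dev->struct_mutex);
	was_interruptible = dev_priv->mm.interruptible;
	dev_priv->mm.interruptible = false;

	/* User access is over; flush caches back with .write=false as
	 * suggested, since the direction only covered the access period. */
	ret = i915_gem_object_set_to_gtt_domain(obj, false);

	dev_priv->mm.interruptible = was_interruptible;
	mutex_unlock(&dev->struct_mutex);

	if (ret)
		DRM_ERROR("unable to flush buffer following CPU access; err=%d\n",
			  ret);
}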
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-12-22 20:51 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-16 22:25 Direct userspace dma-buf mmap (v6) Tiago Vignatti
2015-12-16 22:25 ` [PATCH v6 1/5] drm: prime: Honour O_RDWR during prime-handle-to-fd Tiago Vignatti
2015-12-16 22:25 ` [PATCH v6 2/5] dma-buf: Remove range-based flush Tiago Vignatti
2015-12-16 22:25 ` [PATCH v6 3/5] dma-buf: Add ioctls to allow userspace to flush Tiago Vignatti
2015-12-17 18:19   ` Alex Deucher
2015-12-18 18:46     ` Tiago Vignatti
2015-12-17 21:58   ` Thomas Hellstrom
2015-12-18 15:29     ` Daniel Vetter
2015-12-18 19:50     ` Tiago Vignatti
2015-12-21  9:38       ` Thomas Hellstrom
2015-12-16 22:25 ` [PATCH v6 4/5] drm/i915: Implement end_cpu_access Tiago Vignatti
2015-12-17  8:01   ` Chris Wilson
2015-12-18 19:02     ` Tiago Vignatti
2015-12-18 19:19       ` Tiago Vignatti
2015-12-22 20:51         ` Chris Wilson
2015-12-16 22:25 ` [PATCH v6 5/5] drm/i915: Use CPU mapping for userspace dma-buf mmap() Tiago Vignatti
2015-12-16 22:25 ` [PATCH igt v6 1/6] lib: Add gem_userptr and __gem_userptr helpers Tiago Vignatti
2015-12-16 22:25 ` [PATCH igt v6 2/6] prime_mmap: Add new test for calling mmap() on dma-buf fds Tiago Vignatti
2015-12-16 22:25 ` [PATCH igt v6 3/6] prime_mmap: Add basic tests to write in a bo using CPU Tiago Vignatti
2015-12-16 22:25 ` [PATCH igt v6 4/6] lib: Add prime_sync_start and prime_sync_end helpers Tiago Vignatti
2015-12-17 10:18   ` Daniel Vetter
2015-12-16 22:25 ` [PATCH igt v6 5/6] tests: Add kms_mmap_write_crc for cache coherency tests Tiago Vignatti
2015-12-17  7:53   ` Chris Wilson
2015-12-16 22:25 ` [PATCH igt v6 6/6] tests: Add prime_mmap_coherency " Tiago Vignatti
2015-12-17 10:15 ` Direct userspace dma-buf mmap (v6) Daniel Vetter
