All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency
@ 2017-05-08 15:03 Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 01/18] vb2: Rename confusingly named internal buffer preparation functions Sakari Ailus
                   ` (18 more replies)
  0 siblings, 19 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Hello,

This is a rebased and partially reworked version of the vb2 cache hints
support patch series posted by first myself, then Laurent and then myself
again.

I'm still posting this as RFC primarily because more testing and driver
changes will be needed. In particular, a lot of platform drivers assume
non-coherent memory but are not properly labelled as such.

Please see the end of the message for detailed changes.

This set unifies the cache coherency hint flags and corrects cache
management in videobuf2 dma-contig and dma-sg memtype implementation. The
support for non-coherent memory is completed: support for MMAP buffers is
added and begin_cpu_access and end_cpu_access functions are added to DMA
ops.

Comments are welcome.

changes since RFC v3:

- Document V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour for QBUF and DQBUF
  for CAPTURE and OUTPUT buffers.

- Set queue dma_attrs DMA_ATTR_NON_CONSISTENT flag for omap3isp.

- Put dma_sgt in vb2_dc_buf in order to avoid allocating and releasing it
  separately. It's generally needed anyway.

- Buffer preparation DMA direction is generally DMA_TO_DEVICE for both
  CAPTURE and OUTPUT buffers: V4L2 does not guarantee that the user space
  could not write to capture buffers as well.

  Documentation/DMA-API.txt:

	- Before reading values that have been written by DMA from the device
	  (use the DMA_FROM_DEVICE direction)
	- After writing values that will be written to the device using DMA
	  (use the DMA_TO_DEVICE) direction
	- before *and* after handing memory to the device if the memory is
	  DMA_BIDIRECTIONAL

  Cache maintenance can also be skipped for OUTPUT buffers in buffer
  finish as the hardware did not write to the buffer.

changes since RFC v2:

- Nicer looking tests for the need for syncing.

- Also set DMA attributes for USERPTR buffers.

- Unconditionally assign buf->attrs for MMAP buffers.

- Don't call vb2_dc_get_base_sgt() until buf->dev is set.

- Provide {begin,end}_cpu_access() dmabuf ops for cache management.

- Make similar changes to dma-sg memops to support DMA attributes.


Sakari Ailus (13):
  vb2: Rename confusingly named internal buffer preparation functions
  vb2: Move buffer cache synchronisation to prepare from queue
  vb2: Move cache synchronisation from buffer done to dqbuf handler
  v4l: Unify cache management hint buffer flags
  vb2: Anticipate queue specific DMA attributes for USERPTR buffers
  vb2: dma-contig: Assign DMA attrs for a buffer unconditionally
  vb2: dma-contig: Remove redundant sgt_base field
  vb2: dma-contig: Don't warn on failure in obtaining scatterlist
  vb2: dma-contig: Move vb2_dc_get_base_sgt() up
  vb2: dma-contig: Fix DMA attribute and cache management
  vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs
  vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs
  vb2: Improve struct vb2_mem_ops documentation; alloc and put are for
    MMAP

Samu Onkalo (1):
  vb2: Don't sync cache for a buffer if so requested

 Documentation/media/uapi/v4l/buffer.rst            |  24 ++--
 .../media/uapi/v4l/vidioc-prepare-buf.rst          |   5 +-
 drivers/media/v4l2-core/videobuf2-core.c           | 129 ++++++++++++++-------
 drivers/media/v4l2-core/videobuf2-dma-contig.c     | 120 ++++++++++++-------
 drivers/media/v4l2-core/videobuf2-dma-sg.c         |  47 ++++++--
 drivers/media/v4l2-core/videobuf2-v4l2.c           |  14 ++-
 drivers/media/v4l2-core/videobuf2-vmalloc.c        |   3 +-
 include/media/videobuf2-core.h                     |  46 +++++---
 include/trace/events/v4l2.h                        |   3 +-
 include/uapi/linux/videodev2.h                     |   7 +-
 10 files changed, 263 insertions(+), 135 deletions(-)

-- 
Regards,
Sakari



Sakari Ailus (17):
  vb2: Rename confusingly named internal buffer preparation functions
  vb2: Move buffer cache synchronisation to prepare from queue
  vb2: Move cache synchronisation from buffer done to dqbuf handler
  v4l: Unify cache management hint buffer flags
  vb2: Anticipate queue specific DMA attributes for USERPTR buffers
  vb2: dma-contig: Assign DMA attrs for a buffer unconditionally
  vb2: dma-contig: Remove redundant sgt_base field
  vb2: dma-contig: Don't warn on failure in obtaining scatterlist
  vb2: dma-contig: Allocate sgt as part of struct vb2_dc_buf
  vb2: dma-contig: Fix DMA attribute and cache management
  vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs
  vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs
  vb2: Improve struct vb2_mem_ops documentation; alloc and put are for
    MMAP
  vb2: Dma direction is always DMA_TO_DEVICE in buffer preparation
  vb2: Do sync plane cache only for CAPTURE buffers in finish memop
  docs-rst: Document precise V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour
  v4l: Use non-consistent DMA mappings for hardware that deserves it

Samu Onkalo (1):
  vb2: Don't sync cache for a buffer if so requested

 Documentation/media/uapi/v4l/buffer.rst            |  45 +++--
 .../media/uapi/v4l/vidioc-prepare-buf.rst          |   5 +-
 drivers/media/platform/omap3isp/ispvideo.c         |   1 +
 drivers/media/v4l2-core/videobuf2-core.c           | 122 ++++++++-----
 drivers/media/v4l2-core/videobuf2-dma-contig.c     | 191 ++++++++++++---------
 drivers/media/v4l2-core/videobuf2-dma-sg.c         |  85 ++++++---
 drivers/media/v4l2-core/videobuf2-v4l2.c           |  14 +-
 drivers/media/v4l2-core/videobuf2-vmalloc.c        |   3 +-
 include/media/videobuf2-core.h                     |  46 +++--
 include/trace/events/v4l2.h                        |   3 +-
 include/uapi/linux/videodev2.h                     |   7 +-
 11 files changed, 338 insertions(+), 184 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC v4 01/18] vb2: Rename confusingly named internal buffer preparation functions
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Rename __qbuf_*() functions which are specific to a buffer type as
__prepare_*() which matches with what they do. The naming was there for
historical reasons; the purpose of the functions was changed without
renaming them.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 94afbbf9..8df680d 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -956,9 +956,9 @@ void vb2_discard_done(struct vb2_queue *q)
 EXPORT_SYMBOL_GPL(vb2_discard_done);
 
 /**
- * __qbuf_mmap() - handle qbuf of an MMAP buffer
+ * __prepare_mmap() - prepare an MMAP buffer
  */
-static int __qbuf_mmap(struct vb2_buffer *vb, const void *pb)
+static int __prepare_mmap(struct vb2_buffer *vb, const void *pb)
 {
 	int ret = 0;
 
@@ -969,9 +969,9 @@ static int __qbuf_mmap(struct vb2_buffer *vb, const void *pb)
 }
 
 /**
- * __qbuf_userptr() - handle qbuf of a USERPTR buffer
+ * __prepare_userptr() - prepare a USERPTR buffer
  */
-static int __qbuf_userptr(struct vb2_buffer *vb, const void *pb)
+static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
 {
 	struct vb2_plane planes[VB2_MAX_PLANES];
 	struct vb2_queue *q = vb->vb2_queue;
@@ -1087,9 +1087,9 @@ static int __qbuf_userptr(struct vb2_buffer *vb, const void *pb)
 }
 
 /**
- * __qbuf_dmabuf() - handle qbuf of a DMABUF buffer
+ * __prepare_dmabuf() - prepare a DMABUF buffer
  */
-static int __qbuf_dmabuf(struct vb2_buffer *vb, const void *pb)
+static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
 {
 	struct vb2_plane planes[VB2_MAX_PLANES];
 	struct vb2_queue *q = vb->vb2_queue;
@@ -1255,13 +1255,13 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 
 	switch (q->memory) {
 	case VB2_MEMORY_MMAP:
-		ret = __qbuf_mmap(vb, pb);
+		ret = __prepare_mmap(vb, pb);
 		break;
 	case VB2_MEMORY_USERPTR:
-		ret = __qbuf_userptr(vb, pb);
+		ret = __prepare_userptr(vb, pb);
 		break;
 	case VB2_MEMORY_DMABUF:
-		ret = __qbuf_dmabuf(vb, pb);
+		ret = __prepare_dmabuf(vb, pb);
 		break;
 	default:
 		WARN(1, "Invalid queue type\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 02/18] vb2: Move buffer cache synchronisation to prepare from queue
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
@ 2017-05-08 15:03   ` Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
                     ` (17 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The buffer cache should be synchronised in buffer preparation, not when
the buffer is queued to the device. Fix this.

Mmap buffers do not need cache synchronisation since they are always
coherent.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 8df680d..8bf3369 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -1227,23 +1227,19 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
 static void __enqueue_in_driver(struct vb2_buffer *vb)
 {
 	struct vb2_queue *q = vb->vb2_queue;
-	unsigned int plane;
 
 	vb->state = VB2_BUF_STATE_ACTIVE;
 	atomic_inc(&q->owned_by_drv_count);
 
 	trace_vb2_buf_queue(q, vb);
 
-	/* sync buffers */
-	for (plane = 0; plane < vb->num_planes; ++plane)
-		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
-
 	call_void_vb_qop(vb, buf_queue, vb);
 }
 
 static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 {
 	struct vb2_queue *q = vb->vb2_queue;
+	unsigned int plane;
 	int ret;
 
 	if (q->error) {
@@ -1268,11 +1264,19 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 		ret = -EINVAL;
 	}
 
-	if (ret)
+	if (ret) {
 		dprintk(1, "buffer preparation failed: %d\n", ret);
-	vb->state = ret ? VB2_BUF_STATE_DEQUEUED : VB2_BUF_STATE_PREPARED;
+		vb->state = VB2_BUF_STATE_DEQUEUED;
+		return ret;
+	}
 
-	return ret;
+	/* sync buffers */
+	for (plane = 0; plane < vb->num_planes; ++plane)
+		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
+
+	vb->state = VB2_BUF_STATE_PREPARED;
+
+	return 0;
 }
 
 int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 02/18] vb2: Move buffer cache synchronisation to prepare from queue
@ 2017-05-08 15:03   ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: daniel.vetter, dri-devel, hverkuil, kyungmin.park,
	laurent.pinchart, posciak, m.szyprowski

The buffer cache should be synchronised in buffer preparation, not when
the buffer is queued to the device. Fix this.

Mmap buffers do not need cache synchronisation since they are always
coherent.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 8df680d..8bf3369 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -1227,23 +1227,19 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
 static void __enqueue_in_driver(struct vb2_buffer *vb)
 {
 	struct vb2_queue *q = vb->vb2_queue;
-	unsigned int plane;
 
 	vb->state = VB2_BUF_STATE_ACTIVE;
 	atomic_inc(&q->owned_by_drv_count);
 
 	trace_vb2_buf_queue(q, vb);
 
-	/* sync buffers */
-	for (plane = 0; plane < vb->num_planes; ++plane)
-		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
-
 	call_void_vb_qop(vb, buf_queue, vb);
 }
 
 static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 {
 	struct vb2_queue *q = vb->vb2_queue;
+	unsigned int plane;
 	int ret;
 
 	if (q->error) {
@@ -1268,11 +1264,19 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 		ret = -EINVAL;
 	}
 
-	if (ret)
+	if (ret) {
 		dprintk(1, "buffer preparation failed: %d\n", ret);
-	vb->state = ret ? VB2_BUF_STATE_DEQUEUED : VB2_BUF_STATE_PREPARED;
+		vb->state = VB2_BUF_STATE_DEQUEUED;
+		return ret;
+	}
 
-	return ret;
+	/* sync buffers */
+	for (plane = 0; plane < vb->num_planes; ++plane)
+		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
+
+	vb->state = VB2_BUF_STATE_PREPARED;
+
+	return 0;
 }
 
 int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 01/18] vb2: Rename confusingly named internal buffer preparation functions Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2018-10-05  4:34     ` Tomasz Figa
  2017-05-08 15:03 ` [RFC v4 04/18] v4l: Unify cache management hint buffer flags Sakari Ailus
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The cache synchronisation may be a time consuming operation and thus not
best performed in an interrupt which is a typical context for
vb2_buffer_done() calls. This may consume up to tens of ms on some
machines, depending on the buffer size.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 8bf3369..e866115 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -889,7 +889,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
 {
 	struct vb2_queue *q = vb->vb2_queue;
 	unsigned long flags;
-	unsigned int plane;
 
 	if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
 		return;
@@ -910,10 +909,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
 	dprintk(4, "done processing on buffer %d, state: %d\n",
 			vb->index, state);
 
-	/* sync buffers */
-	for (plane = 0; plane < vb->num_planes; ++plane)
-		call_void_memop(vb, finish, vb->planes[plane].mem_priv);
-
 	spin_lock_irqsave(&q->done_lock, flags);
 	if (state == VB2_BUF_STATE_QUEUED ||
 	    state == VB2_BUF_STATE_REQUEUEING) {
@@ -1573,6 +1568,10 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)
 
 	vb->state = VB2_BUF_STATE_DEQUEUED;
 
+	/* sync buffers */
+	for (i = 0; i < vb->num_planes; ++i)
+		call_void_memop(vb, finish, vb->planes[i].mem_priv);
+
 	/* unmap DMABUF buffer */
 	if (q->memory == VB2_MEMORY_DMABUF)
 		for (i = 0; i < vb->num_planes; ++i) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 04/18] v4l: Unify cache management hint buffer flags
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (2 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 05/18] vb2: Anticipate queue specific DMA attributes for USERPTR buffers Sakari Ailus
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The V4L2_BUF_FLAG_NO_CACHE_INVALIDATE and V4L2_BUF_FLAG_NO_CACHE_CLEAN
buffer flags are currently not used by the kernel. Replace the definitions
by a single V4L2_BUF_FLAG_NO_CACHE_SYNC flag to be used by further
patches.

Different cache architectures should not be visible to the user space
which can make no meaningful use of the differences anyway. In case a
device can make use of non-coherent memory accesses, the necessary cache
operations depend on the CPU architecture and the buffer type, not the
requests of the user. The cache operation itself may be skipped on the
user's request which was the purpose of the two flags.

On ARM the invalidate and clean are separate operations whereas on
x86(-64) the two are a single operation (flush). Whether the hardware uses
the buffer for reading (V4L2_BUF_TYPE_*_OUTPUT*) or writing
(V4L2_BUF_TYPE_*CAPTURE*) already defines the required cache operation
(clean and invalidate, respectively). No user input is required.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 Documentation/media/uapi/v4l/buffer.rst            | 24 ++++++++--------------
 .../media/uapi/v4l/vidioc-prepare-buf.rst          |  5 ++---
 include/trace/events/v4l2.h                        |  3 +--
 include/uapi/linux/videodev2.h                     |  7 +++++--
 4 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
index ae6ee73..9eb42bd 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -559,23 +559,17 @@ Buffer Flags
 	:ref:`VIDIOC_PREPARE_BUF <VIDIOC_QBUF>`,
 	:ref:`VIDIOC_QBUF` or
 	:ref:`VIDIOC_DQBUF <VIDIOC_QBUF>` ioctl is called.
-    * .. _`V4L2-BUF-FLAG-NO-CACHE-INVALIDATE`:
+    * .. _`V4L2-BUF-FLAG-NO-CACHE-SYNC`:
 
-      - ``V4L2_BUF_FLAG_NO_CACHE_INVALIDATE``
+      - ``V4L2_BUF_FLAG_NO_CACHE_SYNC``
       - 0x00000800
-      - Caches do not have to be invalidated for this buffer. Typically
-	applications shall use this flag if the data captured in the
-	buffer is not going to be touched by the CPU, instead the buffer
-	will, probably, be passed on to a DMA-capable hardware unit for
-	further processing or output.
-    * .. _`V4L2-BUF-FLAG-NO-CACHE-CLEAN`:
-
-      - ``V4L2_BUF_FLAG_NO_CACHE_CLEAN``
-      - 0x00001000
-      - Caches do not have to be cleaned for this buffer. Typically
-	applications shall use this flag for output buffers if the data in
-	this buffer has not been created by the CPU but by some
-	DMA-capable unit, in which case caches have not been used.
+      - Do not perform CPU cache synchronisation operations when the buffer is
+	queued or dequeued. The user is responsible for the correct use of
+	this flag. It should be only used when the buffer is not accessed
+	using the CPU, e.g. the buffer is written to by a hardware block and
+	then read by another one, in which case the flag should be set in both
+	:ref:`VIDIOC_QBUF` and :ref:`VIDIOC_DQBUF` ioctls. This flag has no
+	effect on some devices / architectures.
     * .. _`V4L2-BUF-FLAG-LAST`:
 
       - ``V4L2_BUF_FLAG_LAST``
diff --git a/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst b/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
index bdcfd9f..80aeb7e 100644
--- a/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
@@ -36,9 +36,8 @@ pass ownership of the buffer to the driver before actually enqueuing it,
 using the :ref:`VIDIOC_QBUF` ioctl, and to prepare it for future I/O. Such
 preparations may include cache invalidation or cleaning. Performing them
 in advance saves time during the actual I/O. In case such cache
-operations are not required, the application can use one of
-``V4L2_BUF_FLAG_NO_CACHE_INVALIDATE`` and
-``V4L2_BUF_FLAG_NO_CACHE_CLEAN`` flags to skip the respective step.
+operations are not required, the application can use the
+``V4L2_BUF_FLAG_NO_CACHE_SYNC`` flag to skip the cache synchronization step.
 
 The struct :c:type:`v4l2_buffer` structure is specified in
 :ref:`buffer`.
diff --git a/include/trace/events/v4l2.h b/include/trace/events/v4l2.h
index b3a85b3..3d5897d 100644
--- a/include/trace/events/v4l2.h
+++ b/include/trace/events/v4l2.h
@@ -81,8 +81,7 @@ SHOW_FIELD
 		{ V4L2_BUF_FLAG_ERROR,		     "ERROR" },		      \
 		{ V4L2_BUF_FLAG_TIMECODE,	     "TIMECODE" },	      \
 		{ V4L2_BUF_FLAG_PREPARED,	     "PREPARED" },	      \
-		{ V4L2_BUF_FLAG_NO_CACHE_INVALIDATE, "NO_CACHE_INVALIDATE" }, \
-		{ V4L2_BUF_FLAG_NO_CACHE_CLEAN,	     "NO_CACHE_CLEAN" },      \
+		{ V4L2_BUF_FLAG_NO_CACHE_SYNC,	     "NO_CACHE_SYNC" },	      \
 		{ V4L2_BUF_FLAG_TIMESTAMP_MASK,	     "TIMESTAMP_MASK" },      \
 		{ V4L2_BUF_FLAG_TIMESTAMP_UNKNOWN,   "TIMESTAMP_UNKNOWN" },   \
 		{ V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC, "TIMESTAMP_MONOTONIC" }, \
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 2b8feb8..2e046bb 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -939,8 +939,11 @@ struct v4l2_buffer {
 #define V4L2_BUF_FLAG_TIMECODE			0x00000100
 /* Buffer is prepared for queuing */
 #define V4L2_BUF_FLAG_PREPARED			0x00000400
-/* Cache handling flags */
-#define V4L2_BUF_FLAG_NO_CACHE_INVALIDATE	0x00000800
+/* Cache sync hint */
+#define V4L2_BUF_FLAG_NO_CACHE_SYNC		0x00000800
+/* DEPRECATED. THIS WILL BE REMOVED IN THE FUTURE! */
+#define V4L2_BUF_FLAG_NO_CACHE_INVALIDATE	V4L2_BUF_FLAG_NO_CACHE_SYNC
+/* DEPRECATED. THIS WILL BE REMOVED IN THE FUTURE! */
 #define V4L2_BUF_FLAG_NO_CACHE_CLEAN		0x00001000
 /* Timestamp type */
 #define V4L2_BUF_FLAG_TIMESTAMP_MASK		0x0000e000
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 05/18] vb2: Anticipate queue specific DMA attributes for USERPTR buffers
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (3 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 04/18] v4l: Unify cache management hint buffer flags Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 06/18] vb2: dma-contig: Assign DMA attrs for a buffer unconditionally Sakari Ailus
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The DMA attributes were available for the memop implementation for MMAP
buffers but not for USERPTR buffers. Do the same for USERPTR. This patch
makes no functional changes.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-core.c       | 2 +-
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 3 ++-
 drivers/media/v4l2-core/videobuf2-dma-sg.c     | 3 ++-
 drivers/media/v4l2-core/videobuf2-vmalloc.c    | 3 ++-
 include/media/videobuf2-core.h                 | 3 ++-
 5 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index e866115..c659b64 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -1025,7 +1025,7 @@ static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
 		mem_priv = call_ptr_memop(vb, get_userptr,
 				q->alloc_devs[plane] ? : q->dev,
 				planes[plane].m.userptr,
-				planes[plane].length, dma_dir);
+				planes[plane].length, dma_dir, q->dma_attrs);
 		if (IS_ERR(mem_priv)) {
 			dprintk(1, "failed acquiring userspace memory for plane %d\n",
 				plane);
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index d29a07f..30082a4 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -475,7 +475,8 @@ static inline dma_addr_t vb2_dc_pfn_to_dma(struct device *dev, unsigned long pfn
 #endif
 
 static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
-	unsigned long size, enum dma_data_direction dma_dir)
+	unsigned long size, enum dma_data_direction dma_dir,
+	unsigned long attrs)
 {
 	struct vb2_dc_buf *buf;
 	struct frame_vector *vec;
diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c b/drivers/media/v4l2-core/videobuf2-dma-sg.c
index 29fde1a..102ddb2 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
@@ -220,7 +220,8 @@ static void vb2_dma_sg_finish(void *buf_priv)
 
 static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,
 				    unsigned long size,
-				    enum dma_data_direction dma_dir)
+				    enum dma_data_direction dma_dir,
+				    unsigned long dma_attrs)
 {
 	struct vb2_dma_sg_buf *buf;
 	struct sg_table *sgt;
diff --git a/drivers/media/v4l2-core/videobuf2-vmalloc.c b/drivers/media/v4l2-core/videobuf2-vmalloc.c
index f83253a..a4914fc 100644
--- a/drivers/media/v4l2-core/videobuf2-vmalloc.c
+++ b/drivers/media/v4l2-core/videobuf2-vmalloc.c
@@ -73,7 +73,8 @@ static void vb2_vmalloc_put(void *buf_priv)
 
 static void *vb2_vmalloc_get_userptr(struct device *dev, unsigned long vaddr,
 				     unsigned long size,
-				     enum dma_data_direction dma_dir)
+				     enum dma_data_direction dma_dir,
+				     unsigned long dma_attrs)
 {
 	struct vb2_vmalloc_buf *buf;
 	struct frame_vector *vec;
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index cb97c22..4172f6e 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -122,7 +122,8 @@ struct vb2_mem_ops {
 
 	void		*(*get_userptr)(struct device *dev, unsigned long vaddr,
 					unsigned long size,
-					enum dma_data_direction dma_dir);
+					enum dma_data_direction dma_dir,
+					unsigned long dma_attrs);
 	void		(*put_userptr)(void *buf_priv);
 
 	void		(*prepare)(void *buf_priv);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 06/18] vb2: dma-contig: Assign DMA attrs for a buffer unconditionally
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (4 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 05/18] vb2: Anticipate queue specific DMA attributes for USERPTR buffers Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field Sakari Ailus
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

attrs used to be a pointer and the caller of vb2_dc_alloc() could
optionally provide it, or NULL. This was when struct dma_attrs was used
to describe DMA attributes rather than an unsigned long value. There is no
longer a need to maintain the condition, assign the value unconditionally.
There is no functional difference because the memory was initialised to
zero anyway.

Fixes: 00085f1e ("dma-mapping: use unsigned long for dma_attrs")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 30082a4..a8a46a8 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -149,8 +149,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 	if (!buf)
 		return ERR_PTR(-ENOMEM);
 
-	if (attrs)
-		buf->attrs = attrs;
+	buf->attrs = attrs;
 	buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
 					GFP_KERNEL | gfp_flags, buf->attrs);
 	if (!buf->cookie) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (5 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 06/18] vb2: dma-contig: Assign DMA attrs for a buffer unconditionally Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-10 10:13     ` Tomasz Figa
  2017-05-08 15:03   ` Sakari Ailus
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The struct vb2_dc_buf contains two struct sg_table fields: sgt_base and
dma_sgt. The former is used by DMA-BUF buffers whereas the latter is used
by USERPTR.

Unify the two, leaving dma_sgt.

MMAP buffers do not need cache flushing since they have been allocated
using dma_alloc_coherent().

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index a8a46a8..ddbbcf0 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -31,12 +31,13 @@ struct vb2_dc_buf {
 	unsigned long			attrs;
 	enum dma_data_direction		dma_dir;
 	struct sg_table			*dma_sgt;
-	struct frame_vector		*vec;
 
 	/* MMAP related */
 	struct vb2_vmarea_handler	handler;
 	refcount_t			refcount;
-	struct sg_table			*sgt_base;
+
+	/* USERPTR related */
+	struct frame_vector		*vec;
 
 	/* DMABUF related */
 	struct dma_buf_attachment	*db_attach;
@@ -96,7 +97,7 @@ static void vb2_dc_prepare(void *buf_priv)
 	struct sg_table *sgt = buf->dma_sgt;
 
 	/* DMABUF exporter will flush the cache for us */
-	if (!sgt || buf->db_attach)
+	if (!buf->vec)
 		return;
 
 	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
@@ -109,7 +110,7 @@ static void vb2_dc_finish(void *buf_priv)
 	struct sg_table *sgt = buf->dma_sgt;
 
 	/* DMABUF exporter will flush the cache for us */
-	if (!sgt || buf->db_attach)
+	if (!buf->vec)
 		return;
 
 	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
@@ -126,9 +127,9 @@ static void vb2_dc_put(void *buf_priv)
 	if (!refcount_dec_and_test(&buf->refcount))
 		return;
 
-	if (buf->sgt_base) {
-		sg_free_table(buf->sgt_base);
-		kfree(buf->sgt_base);
+	if (buf->dma_sgt) {
+		sg_free_table(buf->dma_sgt);
+		kfree(buf->dma_sgt);
 	}
 	dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
 		       buf->attrs);
@@ -239,13 +240,13 @@ static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf, struct device *dev,
 	/* Copy the buf->base_sgt scatter list to the attachment, as we can't
 	 * map the same scatter list to multiple attachments at the same time.
 	 */
-	ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL);
+	ret = sg_alloc_table(sgt, buf->dma_sgt->orig_nents, GFP_KERNEL);
 	if (ret) {
 		kfree(attach);
 		return -ENOMEM;
 	}
 
-	rd = buf->sgt_base->sgl;
+	rd = buf->dma_sgt->sgl;
 	wr = sgt->sgl;
 	for (i = 0; i < sgt->orig_nents; ++i) {
 		sg_set_page(wr, sg_page(rd), rd->length, rd->offset);
@@ -396,10 +397,10 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 	exp_info.flags = flags;
 	exp_info.priv = buf;
 
-	if (!buf->sgt_base)
-		buf->sgt_base = vb2_dc_get_base_sgt(buf);
+	if (!buf->dma_sgt)
+		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
 
-	if (WARN_ON(!buf->sgt_base))
+	if (WARN_ON(!buf->dma_sgt))
 		return NULL;
 
 	dbuf = dma_buf_export(&exp_info);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 08/18] vb2: dma-contig: Don't warn on failure in obtaining scatterlist
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
@ 2017-05-08 15:03   ` Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
                     ` (17 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

vb2_dc_get_base_sgt() which obtains the scatterlist already prints
information on why the scatterlist could not be obtained.

Also, remove the useless warning of a failed kmalloc().

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index ddbbcf0..22636cd 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -370,10 +370,8 @@ static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
 	struct sg_table *sgt;
 
 	sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-	if (!sgt) {
-		dev_err(buf->dev, "failed to alloc sg table\n");
+	if (!sgt)
 		return NULL;
-	}
 
 	ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
 		buf->size, buf->attrs);
@@ -400,7 +398,7 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 	if (!buf->dma_sgt)
 		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
 
-	if (WARN_ON(!buf->dma_sgt))
+	if (!buf->dma_sgt)
 		return NULL;
 
 	dbuf = dma_buf_export(&exp_info);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 08/18] vb2: dma-contig: Don't warn on failure in obtaining scatterlist
@ 2017-05-08 15:03   ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: daniel.vetter, dri-devel, hverkuil, kyungmin.park,
	laurent.pinchart, posciak, m.szyprowski

vb2_dc_get_base_sgt() which obtains the scatterlist already prints
information on why the scatterlist could not be obtained.

Also, remove the useless warning of a failed kmalloc().

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index ddbbcf0..22636cd 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -370,10 +370,8 @@ static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
 	struct sg_table *sgt;
 
 	sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-	if (!sgt) {
-		dev_err(buf->dev, "failed to alloc sg table\n");
+	if (!sgt)
 		return NULL;
-	}
 
 	ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
 		buf->size, buf->attrs);
@@ -400,7 +398,7 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 	if (!buf->dma_sgt)
 		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
 
-	if (WARN_ON(!buf->dma_sgt))
+	if (!buf->dma_sgt)
 		return NULL;
 
 	dbuf = dma_buf_export(&exp_info);
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 09/18] vb2: dma-contig: Allocate sgt as part of struct vb2_dc_buf
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (7 preceding siblings ...)
  2017-05-08 15:03   ` Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The scatterlist table is needed in the vast majority of the cases.
Allocate struct sg_table as part of the struct. This has the benefit of
making managing the buffer data structure allocation, setup and release
more simple.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 83 ++++++++++----------------
 1 file changed, 32 insertions(+), 51 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 22636cd..0afc3da 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -30,6 +30,7 @@ struct vb2_dc_buf {
 	dma_addr_t			dma_addr;
 	unsigned long			attrs;
 	enum dma_data_direction		dma_dir;
+	struct sg_table			__dma_sgt;
 	struct sg_table			*dma_sgt;
 
 	/* MMAP related */
@@ -127,10 +128,9 @@ static void vb2_dc_put(void *buf_priv)
 	if (!refcount_dec_and_test(&buf->refcount))
 		return;
 
-	if (buf->dma_sgt) {
+	if (buf->dma_sgt)
 		sg_free_table(buf->dma_sgt);
-		kfree(buf->dma_sgt);
-	}
+
 	dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
 		       buf->attrs);
 	put_device(buf->dev);
@@ -364,26 +364,6 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
 	.release = vb2_dc_dmabuf_ops_release,
 };
 
-static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
-{
-	int ret;
-	struct sg_table *sgt;
-
-	sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-	if (!sgt)
-		return NULL;
-
-	ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
-		buf->size, buf->attrs);
-	if (ret < 0) {
-		dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
-		kfree(sgt);
-		return NULL;
-	}
-
-	return sgt;
-}
-
 static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 {
 	struct vb2_dc_buf *buf = buf_priv;
@@ -395,11 +375,19 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 	exp_info.flags = flags;
 	exp_info.priv = buf;
 
-	if (!buf->dma_sgt)
-		buf->dma_sgt = vb2_dc_get_base_sgt(buf);
+	if (!buf->dma_sgt) {
+		int ret;
 
-	if (!buf->dma_sgt)
-		return NULL;
+		ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt,
+					    buf->cookie, buf->dma_addr,
+					    buf->size, buf->attrs);
+		if (ret < 0) {
+			dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
+			return NULL;
+		}
+
+		buf->dma_sgt = &buf->__dma_sgt;
+	}
 
 	dbuf = dma_buf_export(&exp_info);
 	if (IS_ERR(dbuf))
@@ -435,7 +423,6 @@ static void vb2_dc_put_userptr(void *buf_priv)
 		for (i = 0; i < frame_vector_count(buf->vec); i++)
 			set_page_dirty_lock(pages[i]);
 		sg_free_table(sgt);
-		kfree(sgt);
 	}
 	vb2_destroy_framevec(buf->vec);
 	kfree(buf);
@@ -481,7 +468,6 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	unsigned long offset;
 	int n_pages, i;
 	int ret = 0;
-	struct sg_table *sgt;
 	unsigned long contig_size;
 	unsigned long dma_align = dma_get_cache_alignment();
 
@@ -529,33 +515,31 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 		goto out;
 	}
 
-	sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
-	if (!sgt) {
-		pr_err("failed to allocate sg table\n");
-		ret = -ENOMEM;
-		goto fail_pfnvec;
-	}
-
-	ret = sg_alloc_table_from_pages(sgt, frame_vector_pages(vec), n_pages,
-		offset, size, GFP_KERNEL);
+	ret = sg_alloc_table_from_pages(&buf->__dma_sgt,
+					frame_vector_pages(vec), n_pages,
+					offset, size, GFP_KERNEL);
 	if (ret) {
 		pr_err("failed to initialize sg table\n");
-		goto fail_sgt;
+		goto fail_pfnvec;
 	}
 
+	buf->dma_sgt = &buf->__dma_sgt;
+
 	/*
 	 * No need to sync to the device, this will happen later when the
 	 * prepare() memop is called.
 	 */
-	sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				      buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-	if (sgt->nents <= 0) {
+	buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
+					       buf->dma_sgt->orig_nents,
+					       buf->dma_dir,
+					       DMA_ATTR_SKIP_CPU_SYNC);
+	if (buf->dma_sgt->nents <= 0) {
 		pr_err("failed to map scatterlist\n");
 		ret = -EIO;
 		goto fail_sgt_init;
 	}
 
-	contig_size = vb2_dc_get_contiguous_size(sgt);
+	contig_size = vb2_dc_get_contiguous_size(buf->dma_sgt);
 	if (contig_size < size) {
 		pr_err("contiguous mapping is too small %lu/%lu\n",
 			contig_size, size);
@@ -563,22 +547,19 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 		goto fail_map_sg;
 	}
 
-	buf->dma_addr = sg_dma_address(sgt->sgl);
-	buf->dma_sgt = sgt;
+	buf->dma_addr = sg_dma_address(buf->dma_sgt->sgl);
 out:
 	buf->size = size;
 
 	return buf;
 
 fail_map_sg:
-	dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-			   buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+	dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
+			   buf->dma_sgt->orig_nents, buf->dma_dir,
+			   DMA_ATTR_SKIP_CPU_SYNC);
 
 fail_sgt_init:
-	sg_free_table(sgt);
-
-fail_sgt:
-	kfree(sgt);
+	sg_free_table(buf->dma_sgt);
 
 fail_pfnvec:
 	vb2_destroy_framevec(vec);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
@ 2017-05-08 15:03   ` Sakari Ailus
  2017-05-08 15:03   ` Sakari Ailus
                     ` (17 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Patch ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA
attrs") added support for driver specific DMA attributes to
videobuf2-dma-contig but it had several issues in it.

In particular,

- cache operations were only performed on USERPTR buffers,

- DMA attributes were set only for MMAP buffers and

- it did not provide begin_cpu_access() and end_cpu_access() dma_buf_ops
  callbacks for cache syncronisation on exported MMAP buffers.

This patch corrects these issues.

Also arrange the header files alphabetically.

Fixes: ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA attrs")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 94 ++++++++++++++++++++------
 1 file changed, 72 insertions(+), 22 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 0afc3da..8b0298a 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -11,12 +11,12 @@
  */
 
 #include <linux/dma-buf.h>
+#include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/refcount.h>
 #include <linux/scatterlist.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
-#include <linux/dma-mapping.h>
 
 #include <media/videobuf2-v4l2.h>
 #include <media/videobuf2-dma-contig.h>
@@ -97,12 +97,13 @@ static void vb2_dc_prepare(void *buf_priv)
 	struct vb2_dc_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (!buf->vec)
-		return;
-
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-			       buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
+				       buf->dma_dir);
 }
 
 static void vb2_dc_finish(void *buf_priv)
@@ -110,11 +111,13 @@ static void vb2_dc_finish(void *buf_priv)
 	struct vb2_dc_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (!buf->vec)
-		return;
-
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
+				    buf->dma_dir);
 }
 
 /*********************************************/
@@ -142,6 +145,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 			  gfp_t gfp_flags)
 {
 	struct vb2_dc_buf *buf;
+	int ret;
 
 	if (WARN_ON(!dev))
 		return ERR_PTR(-EINVAL);
@@ -152,9 +156,9 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 
 	buf->attrs = attrs;
 	buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
-					GFP_KERNEL | gfp_flags, buf->attrs);
+				      GFP_KERNEL | gfp_flags, buf->attrs);
 	if (!buf->cookie) {
-		dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
+		dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
 		kfree(buf);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -167,6 +171,16 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 	buf->size = size;
 	buf->dma_dir = dma_dir;
 
+	ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt, buf->cookie,
+				    buf->dma_addr, buf->size, buf->attrs);
+	if (ret < 0) {
+		dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
+			       buf->attrs);
+		put_device(dev);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	buf->dma_sgt = &buf->__dma_sgt;
 	buf->handler.refcount = &buf->refcount;
 	buf->handler.put = vb2_dc_put;
 	buf->handler.arg = buf;
@@ -339,6 +353,40 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum)
 	return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
 }
 
+static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
+					      enum dma_data_direction direction)
+{
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
+				    buf->dma_dir);
+
+	return 0;
+}
+
+static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
+					    enum dma_data_direction direction)
+{
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
+				       buf->dma_dir);
+
+	return 0;
+}
+
 static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
 {
 	struct vb2_dc_buf *buf = dbuf->priv;
@@ -359,6 +407,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
 	.unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
 	.kmap = vb2_dc_dmabuf_ops_kmap,
 	.kmap_atomic = vb2_dc_dmabuf_ops_kmap,
+	.begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
+	.end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
 	.vmap = vb2_dc_dmabuf_ops_vmap,
 	.mmap = vb2_dc_dmabuf_ops_mmap,
 	.release = vb2_dc_dmabuf_ops_release,
@@ -412,11 +462,12 @@ static void vb2_dc_put_userptr(void *buf_priv)
 
 	if (sgt) {
 		/*
-		 * No need to sync to CPU, it's already synced to the CPU
-		 * since the finish() memop will have been called before this.
+		 * Don't ask to skip cache sync in case if the user
+		 * did ask to skip cache flush the last time the
+		 * buffer was dequeued.
 		 */
 		dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				   buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+				   buf->dma_dir, buf->attrs);
 		pages = frame_vector_pages(buf->vec);
 		/* sgt should exist only if vector contains pages... */
 		BUG_ON(IS_ERR(pages));
@@ -491,6 +542,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
+	buf->attrs = attrs;
 
 	offset = vaddr & ~PAGE_MASK;
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
@@ -526,13 +578,11 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dma_sgt = &buf->__dma_sgt;
 
 	/*
-	 * No need to sync to the device, this will happen later when the
-	 * prepare() memop is called.
+	 * Sync the cache now; the user might not ever ask for it.
 	 */
 	buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
 					       buf->dma_sgt->orig_nents,
-					       buf->dma_dir,
-					       DMA_ATTR_SKIP_CPU_SYNC);
+					       buf->dma_dir, buf->attrs);
 	if (buf->dma_sgt->nents <= 0) {
 		pr_err("failed to map scatterlist\n");
 		ret = -EIO;
@@ -556,7 +606,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 fail_map_sg:
 	dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
 			   buf->dma_sgt->orig_nents, buf->dma_dir,
-			   DMA_ATTR_SKIP_CPU_SYNC);
+			   buf->attrs);
 
 fail_sgt_init:
 	sg_free_table(buf->dma_sgt);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management
@ 2017-05-08 15:03   ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: daniel.vetter, dri-devel, hverkuil, kyungmin.park,
	laurent.pinchart, posciak, m.szyprowski

Patch ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA
attrs") added support for driver specific DMA attributes to
videobuf2-dma-contig but it had several issues in it.

In particular,

- cache operations were only performed on USERPTR buffers,

- DMA attributes were set only for MMAP buffers and

- it did not provide begin_cpu_access() and end_cpu_access() dma_buf_ops
  callbacks for cache syncronisation on exported MMAP buffers.

This patch corrects these issues.

Also arrange the header files alphabetically.

Fixes: ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA attrs")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 94 ++++++++++++++++++++------
 1 file changed, 72 insertions(+), 22 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 0afc3da..8b0298a 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -11,12 +11,12 @@
  */
 
 #include <linux/dma-buf.h>
+#include <linux/dma-mapping.h>
 #include <linux/module.h>
 #include <linux/refcount.h>
 #include <linux/scatterlist.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
-#include <linux/dma-mapping.h>
 
 #include <media/videobuf2-v4l2.h>
 #include <media/videobuf2-dma-contig.h>
@@ -97,12 +97,13 @@ static void vb2_dc_prepare(void *buf_priv)
 	struct vb2_dc_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (!buf->vec)
-		return;
-
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-			       buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
+				       buf->dma_dir);
 }
 
 static void vb2_dc_finish(void *buf_priv)
@@ -110,11 +111,13 @@ static void vb2_dc_finish(void *buf_priv)
 	struct vb2_dc_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (!buf->vec)
-		return;
-
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
+				    buf->dma_dir);
 }
 
 /*********************************************/
@@ -142,6 +145,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 			  gfp_t gfp_flags)
 {
 	struct vb2_dc_buf *buf;
+	int ret;
 
 	if (WARN_ON(!dev))
 		return ERR_PTR(-EINVAL);
@@ -152,9 +156,9 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 
 	buf->attrs = attrs;
 	buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
-					GFP_KERNEL | gfp_flags, buf->attrs);
+				      GFP_KERNEL | gfp_flags, buf->attrs);
 	if (!buf->cookie) {
-		dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
+		dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
 		kfree(buf);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -167,6 +171,16 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
 	buf->size = size;
 	buf->dma_dir = dma_dir;
 
+	ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt, buf->cookie,
+				    buf->dma_addr, buf->size, buf->attrs);
+	if (ret < 0) {
+		dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
+			       buf->attrs);
+		put_device(dev);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	buf->dma_sgt = &buf->__dma_sgt;
 	buf->handler.refcount = &buf->refcount;
 	buf->handler.put = vb2_dc_put;
 	buf->handler.arg = buf;
@@ -339,6 +353,40 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum)
 	return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
 }
 
+static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
+					      enum dma_data_direction direction)
+{
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
+				    buf->dma_dir);
+
+	return 0;
+}
+
+static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
+					    enum dma_data_direction direction)
+{
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
+				       buf->dma_dir);
+
+	return 0;
+}
+
 static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
 {
 	struct vb2_dc_buf *buf = dbuf->priv;
@@ -359,6 +407,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
 	.unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
 	.kmap = vb2_dc_dmabuf_ops_kmap,
 	.kmap_atomic = vb2_dc_dmabuf_ops_kmap,
+	.begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
+	.end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
 	.vmap = vb2_dc_dmabuf_ops_vmap,
 	.mmap = vb2_dc_dmabuf_ops_mmap,
 	.release = vb2_dc_dmabuf_ops_release,
@@ -412,11 +462,12 @@ static void vb2_dc_put_userptr(void *buf_priv)
 
 	if (sgt) {
 		/*
-		 * No need to sync to CPU, it's already synced to the CPU
-		 * since the finish() memop will have been called before this.
+		 * Don't ask to skip cache sync in case if the user
+		 * did ask to skip cache flush the last time the
+		 * buffer was dequeued.
 		 */
 		dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				   buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+				   buf->dma_dir, buf->attrs);
 		pages = frame_vector_pages(buf->vec);
 		/* sgt should exist only if vector contains pages... */
 		BUG_ON(IS_ERR(pages));
@@ -491,6 +542,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
+	buf->attrs = attrs;
 
 	offset = vaddr & ~PAGE_MASK;
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
@@ -526,13 +578,11 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dma_sgt = &buf->__dma_sgt;
 
 	/*
-	 * No need to sync to the device, this will happen later when the
-	 * prepare() memop is called.
+	 * Sync the cache now; the user might not ever ask for it.
 	 */
 	buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
 					       buf->dma_sgt->orig_nents,
-					       buf->dma_dir,
-					       DMA_ATTR_SKIP_CPU_SYNC);
+					       buf->dma_dir, buf->attrs);
 	if (buf->dma_sgt->nents <= 0) {
 		pr_err("failed to map scatterlist\n");
 		ret = -EIO;
@@ -556,7 +606,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 fail_map_sg:
 	dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
 			   buf->dma_sgt->orig_nents, buf->dma_dir,
-			   DMA_ATTR_SKIP_CPU_SYNC);
+			   buf->attrs);
 
 fail_sgt_init:
 	sg_free_table(buf->dma_sgt);
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 11/18] vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (9 preceding siblings ...)
  2017-05-08 15:03   ` Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 12/18] vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs Sakari Ailus
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The scatterlist should always be present when the cache would need to be
flushed. Each buffer type has its own means to provide that. Add
WARN_ON_ONCE() to check the scatterist exists.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 8b0298a..f572911 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -101,7 +101,7 @@ static void vb2_dc_prepare(void *buf_priv)
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
 		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
 				       buf->dma_dir);
 }
@@ -115,7 +115,7 @@ static void vb2_dc_finish(void *buf_priv)
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
 		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
 				    buf->dma_dir);
 }
@@ -363,7 +363,7 @@ static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
 		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
 				    buf->dma_dir);
 
@@ -380,7 +380,7 @@ static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
 		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
 				       buf->dma_dir);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 12/18] vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (10 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 11/18] vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested Sakari Ailus
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The desirable DMA attributes are not generic for all devices using
Videobuf2 scatter-gather DMA ops. Let the drivers decide.

As a result, also the DMA-BUF exporter must provide ops for synchronising
the cache. This adds begin_cpu_access and end_cpu_access ops to
vb2_dc_dmabuf_ops.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-sg.c | 81 +++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 19 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c b/drivers/media/v4l2-core/videobuf2-dma-sg.c
index 102ddb2..5662f00 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
@@ -38,6 +38,7 @@ struct vb2_dma_sg_buf {
 	struct frame_vector		*vec;
 	int				offset;
 	enum dma_data_direction		dma_dir;
+	unsigned long			dma_attrs;
 	struct sg_table			sg_table;
 	/*
 	 * This will point to sg_table when used with the MMAP or USERPTR
@@ -114,6 +115,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
 
 	buf->vaddr = NULL;
 	buf->dma_dir = dma_dir;
+	buf->dma_attrs = dma_attrs;
 	buf->offset = 0;
 	buf->size = size;
 	/* size is already page aligned */
@@ -143,7 +145,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
 	 * prepare() memop is called.
 	 */
 	sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				      buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+				      buf->dma_dir, dma_attrs);
 	if (!sgt->nents)
 		goto fail_map;
 
@@ -181,7 +183,7 @@ static void vb2_dma_sg_put(void *buf_priv)
 		dprintk(1, "%s: Freeing buffer of %d pages\n", __func__,
 			buf->num_pages);
 		dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				   buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+				   buf->dma_dir, buf->dma_attrs);
 		if (buf->vaddr)
 			vm_unmap_ram(buf->vaddr, buf->num_pages);
 		sg_free_table(buf->dma_sgt);
@@ -198,12 +200,13 @@ static void vb2_dma_sg_prepare(void *buf_priv)
 	struct vb2_dma_sg_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (buf->db_attach)
-		return;
-
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-			       buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
+				       buf->dma_dir);
 }
 
 static void vb2_dma_sg_finish(void *buf_priv)
@@ -211,11 +214,13 @@ static void vb2_dma_sg_finish(void *buf_priv)
 	struct vb2_dma_sg_buf *buf = buf_priv;
 	struct sg_table *sgt = buf->dma_sgt;
 
-	/* DMABUF exporter will flush the cache for us */
-	if (buf->db_attach)
-		return;
-
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
+				    buf->dma_dir);
 }
 
 static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,
@@ -237,6 +242,7 @@ static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->vaddr = NULL;
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;
+	buf->dma_attrs = dma_attrs;
 	buf->offset = vaddr & ~PAGE_MASK;
 	buf->size = size;
 	buf->dma_sgt = &buf->sg_table;
@@ -260,7 +266,7 @@ static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,
 	 * prepare() memop is called.
 	 */
 	sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-				      buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+				      buf->dma_dir, dma_attrs);
 	if (!sgt->nents)
 		goto userptr_fail_map;
 
@@ -288,7 +294,7 @@ static void vb2_dma_sg_put_userptr(void *buf_priv)
 	dprintk(1, "%s: Releasing userspace buffer of %d pages\n",
 	       __func__, buf->num_pages);
 	dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir,
-			   DMA_ATTR_SKIP_CPU_SYNC);
+			   buf->dma_attrs);
 	if (buf->vaddr)
 		vm_unmap_ram(buf->vaddr, buf->num_pages);
 	sg_free_table(buf->dma_sgt);
@@ -433,6 +439,7 @@ static struct sg_table *vb2_dma_sg_dmabuf_ops_map(
 	struct dma_buf_attachment *db_attach, enum dma_data_direction dma_dir)
 {
 	struct vb2_dma_sg_attachment *attach = db_attach->priv;
+	struct vb2_dma_sg_buf *buf = db_attach->dmabuf->priv;
 	/* stealing dmabuf mutex to serialize map/unmap operations */
 	struct mutex *lock = &db_attach->dmabuf->lock;
 	struct sg_table *sgt;
@@ -448,14 +455,14 @@ static struct sg_table *vb2_dma_sg_dmabuf_ops_map(
 
 	/* release any previous cache */
 	if (attach->dma_dir != DMA_NONE) {
-		dma_unmap_sg(db_attach->dev, sgt->sgl, sgt->orig_nents,
-			attach->dma_dir);
+		dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
+				   attach->dma_dir, buf->dma_attrs);
 		attach->dma_dir = DMA_NONE;
 	}
 
 	/* mapping to the client with new direction */
-	sgt->nents = dma_map_sg(db_attach->dev, sgt->sgl, sgt->orig_nents,
-				dma_dir);
+	sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
+				      dma_dir, buf->dma_attrs);
 	if (!sgt->nents) {
 		pr_err("failed to map scatterlist\n");
 		mutex_unlock(lock);
@@ -488,6 +495,40 @@ static void *vb2_dma_sg_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnu
 	return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
 }
 
+static int vb2_dma_sg_dmabuf_ops_begin_cpu_access(
+	struct dma_buf *dbuf, enum dma_data_direction direction)
+{
+	struct vb2_dma_sg_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
+				    buf->dma_dir);
+
+	return 0;
+}
+
+static int vb2_dma_sg_dmabuf_ops_end_cpu_access(
+	struct dma_buf *dbuf, enum dma_data_direction direction)
+{
+	struct vb2_dma_sg_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	/*
+	 * DMABUF exporter will flush the cache for us; only USERPTR
+	 * and MMAP buffers with non-coherent memory will be flushed.
+	 */
+	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
+		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
+				       buf->dma_dir);
+
+	return 0;
+}
+
 static void *vb2_dma_sg_dmabuf_ops_vmap(struct dma_buf *dbuf)
 {
 	struct vb2_dma_sg_buf *buf = dbuf->priv;
@@ -508,6 +549,8 @@ static struct dma_buf_ops vb2_dma_sg_dmabuf_ops = {
 	.unmap_dma_buf = vb2_dma_sg_dmabuf_ops_unmap,
 	.kmap = vb2_dma_sg_dmabuf_ops_kmap,
 	.kmap_atomic = vb2_dma_sg_dmabuf_ops_kmap,
+	.begin_cpu_access = vb2_dma_sg_dmabuf_ops_begin_cpu_access,
+	.end_cpu_access = vb2_dma_sg_dmabuf_ops_end_cpu_access,
 	.vmap = vb2_dma_sg_dmabuf_ops_vmap,
 	.mmap = vb2_dma_sg_dmabuf_ops_mmap,
 	.release = vb2_dma_sg_dmabuf_ops_release,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (11 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 12/18] vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-10 11:00   ` Tomasz Figa
  2017-05-08 15:03 ` [RFC v4 14/18] vb2: Improve struct vb2_mem_ops documentation; alloc and put are for MMAP Sakari Ailus
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart, Samu Onkalo

From: Samu Onkalo <samu.onkalo@intel.com>

The user may request to the driver (vb2) to skip the cache maintenance
operations in case the buffer does not need cache synchronisation, e.g. in
cases where the buffer is passed between hardware blocks without it being
touched by the CPU.

Also document that the prepare and finish vb2_mem_ops might not get called
every time the buffer ownership changes between the kernel and the user
space.

Signed-off-by: Samu Onkalo <samu.onkalo@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 101 +++++++++++++++++++++----------
 drivers/media/v4l2-core/videobuf2-v4l2.c |  14 ++++-
 include/media/videobuf2-core.h           |  23 ++++---
 3 files changed, 97 insertions(+), 41 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index c659b64..fbe0db1 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -189,6 +189,28 @@ static void __vb2_queue_cancel(struct vb2_queue *q);
 static void __enqueue_in_driver(struct vb2_buffer *vb);
 
 /**
+ * __mem_prepare_planes() - call finish mem op for all planes of the buffer
+ */
+static void __mem_prepare_planes(struct vb2_buffer *vb)
+{
+	unsigned int plane;
+
+	for (plane = 0; plane < vb->num_planes; ++plane)
+		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
+}
+
+/**
+ * __mem_finish_planes() - call finish mem op for all planes of the buffer
+ */
+static void __mem_finish_planes(struct vb2_buffer *vb)
+{
+	unsigned int plane;
+
+	for (plane = 0; plane < vb->num_planes; ++plane)
+		call_void_memop(vb, finish, vb->planes[plane].mem_priv);
+}
+
+/**
  * __vb2_buf_mem_alloc() - allocate video memory for the given buffer
  */
 static int __vb2_buf_mem_alloc(struct vb2_buffer *vb)
@@ -953,20 +975,29 @@ EXPORT_SYMBOL_GPL(vb2_discard_done);
 /**
  * __prepare_mmap() - prepare an MMAP buffer
  */
-static int __prepare_mmap(struct vb2_buffer *vb, const void *pb)
+static int __prepare_mmap(struct vb2_buffer *vb, const void *pb,
+			  bool no_cache_sync)
 {
-	int ret = 0;
+	int ret;
 
-	if (pb)
+	if (pb) {
 		ret = call_bufop(vb->vb2_queue, fill_vb2_buffer,
 				 vb, pb, vb->planes);
-	return ret ? ret : call_vb_qop(vb, buf_prepare, vb);
+		if (ret)
+			return ret;
+	}
+
+	if (!no_cache_sync)
+		__mem_prepare_planes(vb);
+
+	return call_vb_qop(vb, buf_prepare, vb);
 }
 
 /**
  * __prepare_userptr() - prepare a USERPTR buffer
  */
-static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
+static int __prepare_userptr(struct vb2_buffer *vb, const void *pb,
+			     bool no_cache_sync)
 {
 	struct vb2_plane planes[VB2_MAX_PLANES];
 	struct vb2_queue *q = vb->vb2_queue;
@@ -1057,6 +1088,11 @@ static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
 			dprintk(1, "buffer initialization failed\n");
 			goto err;
 		}
+
+		/* This is new buffer memory --- always synchronise cache. */
+		__mem_prepare_planes(vb);
+	} else if (!no_cache_sync) {
+		__mem_prepare_planes(vb);
 	}
 
 	ret = call_vb_qop(vb, buf_prepare, vb);
@@ -1084,7 +1120,8 @@ static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
 /**
  * __prepare_dmabuf() - prepare a DMABUF buffer
  */
-static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
+static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb,
+			    bool no_cache_sync)
 {
 	struct vb2_plane planes[VB2_MAX_PLANES];
 	struct vb2_queue *q = vb->vb2_queue;
@@ -1199,6 +1236,11 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
 			dprintk(1, "buffer initialization failed\n");
 			goto err;
 		}
+
+		/* This is new buffer memory --- always synchronise cache. */
+		__mem_prepare_planes(vb);
+	} else if (!no_cache_sync) {
+		__mem_prepare_planes(vb);
 	}
 
 	ret = call_vb_qop(vb, buf_prepare, vb);
@@ -1231,10 +1273,10 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
 	call_void_vb_qop(vb, buf_queue, vb);
 }
 
-static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
+static int __buf_prepare(struct vb2_buffer *vb, const void *pb,
+			 bool no_cache_sync)
 {
 	struct vb2_queue *q = vb->vb2_queue;
-	unsigned int plane;
 	int ret;
 
 	if (q->error) {
@@ -1246,13 +1288,13 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 
 	switch (q->memory) {
 	case VB2_MEMORY_MMAP:
-		ret = __prepare_mmap(vb, pb);
+		ret = __prepare_mmap(vb, pb, no_cache_sync);
 		break;
 	case VB2_MEMORY_USERPTR:
-		ret = __prepare_userptr(vb, pb);
+		ret = __prepare_userptr(vb, pb, no_cache_sync);
 		break;
 	case VB2_MEMORY_DMABUF:
-		ret = __prepare_dmabuf(vb, pb);
+		ret = __prepare_dmabuf(vb, pb, no_cache_sync);
 		break;
 	default:
 		WARN(1, "Invalid queue type\n");
@@ -1265,16 +1307,13 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 		return ret;
 	}
 
-	/* sync buffers */
-	for (plane = 0; plane < vb->num_planes; ++plane)
-		call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
-
 	vb->state = VB2_BUF_STATE_PREPARED;
 
 	return 0;
 }
 
-int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
+int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb,
+			 bool no_cache_sync)
 {
 	struct vb2_buffer *vb;
 	int ret;
@@ -1286,7 +1325,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
 		return -EINVAL;
 	}
 
-	ret = __buf_prepare(vb, pb);
+	ret = __buf_prepare(vb, pb, no_cache_sync);
 	if (ret)
 		return ret;
 
@@ -1362,7 +1401,8 @@ static int vb2_start_streaming(struct vb2_queue *q)
 	return ret;
 }
 
-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+		  bool no_cache_sync)
 {
 	struct vb2_buffer *vb;
 	int ret;
@@ -1371,7 +1411,7 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
 
 	switch (vb->state) {
 	case VB2_BUF_STATE_DEQUEUED:
-		ret = __buf_prepare(vb, pb);
+		ret = __buf_prepare(vb, pb, no_cache_sync);
 		if (ret)
 			return ret;
 		break;
@@ -1557,7 +1597,7 @@ EXPORT_SYMBOL_GPL(vb2_wait_for_all_buffers);
 /**
  * __vb2_dqbuf() - bring back the buffer to the DEQUEUED state
  */
-static void __vb2_dqbuf(struct vb2_buffer *vb)
+static void __vb2_dqbuf(struct vb2_buffer *vb, bool no_cache_sync)
 {
 	struct vb2_queue *q = vb->vb2_queue;
 	unsigned int i;
@@ -1568,9 +1608,8 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)
 
 	vb->state = VB2_BUF_STATE_DEQUEUED;
 
-	/* sync buffers */
-	for (i = 0; i < vb->num_planes; ++i)
-		call_void_memop(vb, finish, vb->planes[i].mem_priv);
+	if (!no_cache_sync)
+		__mem_finish_planes(vb);
 
 	/* unmap DMABUF buffer */
 	if (q->memory == VB2_MEMORY_DMABUF)
@@ -1583,7 +1622,7 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)
 }
 
 int vb2_core_dqbuf(struct vb2_queue *q, unsigned int *pindex, void *pb,
-		   bool nonblocking)
+		   bool nonblocking, bool no_cache_sync)
 {
 	struct vb2_buffer *vb = NULL;
 	int ret;
@@ -1620,7 +1659,7 @@ int vb2_core_dqbuf(struct vb2_queue *q, unsigned int *pindex, void *pb,
 	trace_vb2_dqbuf(q, vb);
 
 	/* go back to dequeued state */
-	__vb2_dqbuf(vb);
+	__vb2_dqbuf(vb, no_cache_sync);
 
 	dprintk(1, "dqbuf of buffer %d, with state %d\n",
 			vb->index, vb->state);
@@ -1694,7 +1733,7 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
 			vb->state = VB2_BUF_STATE_PREPARED;
 			call_void_vb_qop(vb, buf_finish, vb);
 		}
-		__vb2_dqbuf(vb);
+		__vb2_dqbuf(vb, false);
 	}
 }
 
@@ -2242,7 +2281,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 		 * Queue all buffers.
 		 */
 		for (i = 0; i < q->num_buffers; i++) {
-			ret = vb2_core_qbuf(q, i, NULL);
+			ret = vb2_core_qbuf(q, i, NULL, false);
 			if (ret)
 				goto err_reqbufs;
 			fileio->bufs[i].queued = 1;
@@ -2345,7 +2384,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
 		/*
 		 * Call vb2_dqbuf to get buffer back.
 		 */
-		ret = vb2_core_dqbuf(q, &index, NULL, nonblock);
+		ret = vb2_core_dqbuf(q, &index, NULL, nonblock, false);
 		dprintk(5, "vb2_dqbuf result: %d\n", ret);
 		if (ret)
 			return ret;
@@ -2421,7 +2460,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
 
 		if (copy_timestamp)
 			b->timestamp = ktime_get_ns();
-		ret = vb2_core_qbuf(q, index, NULL);
+		ret = vb2_core_qbuf(q, index, NULL, false);
 		dprintk(5, "vb2_dbuf result: %d\n", ret);
 		if (ret)
 			return ret;
@@ -2507,7 +2546,7 @@ static int vb2_thread(void *data)
 		} else {
 			call_void_qop(q, wait_finish, q);
 			if (!threadio->stop)
-				ret = vb2_core_dqbuf(q, &index, NULL, 0);
+				ret = vb2_core_dqbuf(q, &index, NULL, 0, false);
 			call_void_qop(q, wait_prepare, q);
 			dprintk(5, "file io: vb2_dqbuf result: %d\n", ret);
 			if (!ret)
@@ -2524,7 +2563,7 @@ static int vb2_thread(void *data)
 		if (copy_timestamp)
 			vb->timestamp = ktime_get_ns();;
 		if (!threadio->stop)
-			ret = vb2_core_qbuf(q, vb->index, NULL);
+			ret = vb2_core_qbuf(q, vb->index, NULL, false);
 		call_void_qop(q, wait_prepare, q);
 		if (ret || threadio->stop)
 			break;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 0c06699..a0c4511 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -499,8 +499,11 @@ int vb2_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b)
 	}
 
 	ret = vb2_queue_or_prepare_buf(q, b, "prepare_buf");
+	if (ret)
+		return ret;
 
-	return ret ? ret : vb2_core_prepare_buf(q, b->index, b);
+	return vb2_core_prepare_buf(q, b->index, b,
+				    b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);
 }
 EXPORT_SYMBOL_GPL(vb2_prepare_buf);
 
@@ -568,7 +571,11 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
 	}
 
 	ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
-	return ret ? ret : vb2_core_qbuf(q, b->index, b);
+	if (ret)
+		return ret;
+
+	return vb2_core_qbuf(q, b->index, b,
+			     b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);
 }
 EXPORT_SYMBOL_GPL(vb2_qbuf);
 
@@ -586,7 +593,8 @@ int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking)
 		return -EINVAL;
 	}
 
-	ret = vb2_core_dqbuf(q, NULL, b, nonblocking);
+	ret = vb2_core_dqbuf(q, NULL, b, nonblocking,
+			     b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);
 
 	/*
 	 *  After calling the VIDIOC_DQBUF V4L2_BUF_FLAG_DONE must be
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 4172f6e..08f1d0e 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -83,10 +83,14 @@ struct vb2_threadio_data;
  *		dmabuf.
  * @unmap_dmabuf: releases access control to the dmabuf - allocator is notified
  *		  that this driver is done using the dmabuf for now.
- * @prepare:	called every time the buffer is passed from userspace to the
- *		driver, useful for cache synchronisation, optional.
- * @finish:	called every time the buffer is passed back from the driver
- *		to the userspace, also optional.
+ * @prepare:	Called on the plane when the buffer ownership is passed from
+ *		the user space to the kernel and the plane must be cache
+ *		syncronised. The V4L2_BUF_FLAG_NO_CACHE_SYNC buffer flag may
+ *		be used to skip this call. Optional.
+ * @finish:	Called on the plane when the buffer ownership is passed from
+ *		the kernel to the user space and the plane must be cache
+ *		syncronised. The V4L2_BUF_FLAG_NO_CACHE_SYNC buffer flag may
+ *		be used to skip this call. Optional.
  * @vaddr:	return a kernel virtual address to a given memory buffer
  *		associated with the passed private structure or NULL if no
  *		such mapping exists.
@@ -696,6 +700,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
  * @index:	id number of the buffer
  * @pb:		buffer structure passed from userspace to vidioc_prepare_buf
  *		handler in driver
+ * @no_cache_sync if true, skip cache synchronization
  *
  * Should be called from vidioc_prepare_buf ioctl handler of a driver.
  * The passed buffer should have been verified.
@@ -705,7 +710,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
  * The return values from this function are intended to be directly returned
  * from vidioc_prepare_buf handler in driver.
  */
-int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
+int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb,
+			 bool no_cache_sync);
 
 /**
  * vb2_core_qbuf() - Queue a buffer from userspace
@@ -714,6 +720,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
  * @index:	id number of the buffer
  * @pb:		buffer structure passed from userspace to vidioc_qbuf handler
  *		in driver
+ * @no_cache_sync if true, skip cache synchronization
  *
  * Should be called from vidioc_qbuf ioctl handler of a driver.
  * The passed buffer should have been verified.
@@ -728,7 +735,8 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
  * The return values from this function are intended to be directly returned
  * from vidioc_qbuf handler in driver.
  */
-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+		  bool no_cache_sync);
 
 /**
  * vb2_core_dqbuf() - Dequeue a buffer to the userspace
@@ -739,6 +747,7 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
  * @nonblocking: if true, this call will not sleep waiting for a buffer if no
  *		 buffers ready for dequeuing are present. Normally the driver
  *		 would be passing (file->f_flags & O_NONBLOCK) here
+ * @no_cache_sync if true, skip cache synchronization
  *
  * Should be called from vidioc_dqbuf ioctl handler of a driver.
  * The passed buffer should have been verified.
@@ -755,7 +764,7 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
  * from vidioc_dqbuf handler in driver.
  */
 int vb2_core_dqbuf(struct vb2_queue *q, unsigned int *pindex, void *pb,
-		   bool nonblocking);
+		   bool nonblocking, bool no_cache_sync);
 
 int vb2_core_streamon(struct vb2_queue *q, unsigned int type);
 int vb2_core_streamoff(struct vb2_queue *q, unsigned int type);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 14/18] vb2: Improve struct vb2_mem_ops documentation; alloc and put are for MMAP
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (12 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 15/18] vb2: Dma direction is always DMA_TO_DEVICE in buffer preparation Sakari Ailus
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The alloc() and put() ops are for MMAP buffers only. Document it.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 include/media/videobuf2-core.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 08f1d0e..dd67ae6 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -46,16 +46,16 @@ struct vb2_threadio_data;
 
 /**
  * struct vb2_mem_ops - memory handling/memory allocator operations
- * @alloc:	allocate video memory and, optionally, allocator private data,
- *		return ERR_PTR() on failure or a pointer to allocator private,
- *		per-buffer data on success; the returned private structure
- *		will then be passed as @buf_priv argument to other ops in this
- *		structure. Additional gfp_flags to use when allocating the
- *		are also passed to this operation. These flags are from the
- *		gfp_flags field of vb2_queue.
- * @put:	inform the allocator that the buffer will no longer be used;
- *		usually will result in the allocator freeing the buffer (if
- *		no other users of this buffer are present); the @buf_priv
+ * @alloc:	allocate video memory for an MMAP buffer and, optionally,
+ *		allocator private data, return ERR_PTR() on failure or a pointer
+ *		to allocator private, per-buffer data on success; the returned
+ *		private structure will then be passed as @buf_priv argument to
+ *		other ops in this structure. Additional gfp_flags to use when
+ *		allocating the memory are also passed to this operation. These
+ *		flags are from the gfp_flags field of vb2_queue.
+ * @put:	inform the allocator that the MMAP buffer will no longer be
+ *		used; usually will result in the allocator freeing the buffer
+ *		(if no other users of this buffer are present); the @buf_priv
  *		argument is the allocator private per-buffer structure
  *		previously returned from the alloc callback.
  * @get_dmabuf: acquire userspace memory for a hardware operation; used for
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 15/18] vb2: Dma direction is always DMA_TO_DEVICE in buffer preparation
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (13 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 14/18] vb2: Improve struct vb2_mem_ops documentation; alloc and put are for MMAP Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 16/18] vb2: Do sync plane cache only for CAPTURE buffers in finish memop Sakari Ailus
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

The patch changes the DMA direction from DMA_FROM_DEVICE to DMA_TO_DEVICE
for capture buffers.

The DMA API does not require that any synchronisation is done to the
buffer when it is passed to hardware for writing _but_ there's a caveat:
the user *must not* have written to the buffer.

The V4L2 API does however not require this. Instead, it requires that the
user does not access the buffer since it is queued to the device using
VIDIOC_QBUF IOCTL until the buffer is dequeued again using VIDIOC_DQBUF
IOCTL.

So in this case we want to ensure there will be no dirty cache lines that
could end up to memory possibly after the device has written to the same
memory area. What data gets written to the system memory from the cache is
not extremely important. Still, an for debugging purposes an application
capturing images could fill the buffer with a known pattern which will be
overwritten by the device, hence DMA_TO_DEVICE.

If an application can guarantee that it has not written to the buffer, it
can specify the V4L2_BUF_FLAG_NO_CACHE_SYNC flag to omit the sync
operation.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +-
 drivers/media/v4l2-core/videobuf2-dma-sg.c     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index f572911..320e53a 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -103,7 +103,7 @@ static void vb2_dc_prepare(void *buf_priv)
 	 */
 	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
 		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-				       buf->dma_dir);
+				       DMA_TO_DEVICE);
 }
 
 static void vb2_dc_finish(void *buf_priv)
diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c b/drivers/media/v4l2-core/videobuf2-dma-sg.c
index 5662f00..88b2530 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
@@ -206,7 +206,7 @@ static void vb2_dma_sg_prepare(void *buf_priv)
 	 */
 	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
 		dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-				       buf->dma_dir);
+				       DMA_TO_DEVICE);
 }
 
 static void vb2_dma_sg_finish(void *buf_priv)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 16/18] vb2: Do sync plane cache only for CAPTURE buffers in finish memop
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (14 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 15/18] vb2: Dma direction is always DMA_TO_DEVICE in buffer preparation Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 17/18] docs-rst: Document precise V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour Sakari Ailus
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

There is no need synchronise buffer cache for OUTPUT devices when the
buffer is dequeued as the hardware only reads from the buffer. Only sync
for capture buffers.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 5 +++--
 drivers/media/v4l2-core/videobuf2-dma-sg.c     | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 320e53a..41d5782 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -115,9 +115,10 @@ static void vb2_dc_finish(void *buf_priv)
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt))
+	if (buf->attrs & DMA_ATTR_NON_CONSISTENT && !WARN_ON_ONCE(!sgt) &&
+	    buf->dma_dir == DMA_FROM_DEVICE)
 		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
-				    buf->dma_dir);
+				    DMA_FROM_DEVICE);
 }
 
 /*********************************************/
diff --git a/drivers/media/v4l2-core/videobuf2-dma-sg.c b/drivers/media/v4l2-core/videobuf2-dma-sg.c
index 88b2530..e30c869 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-sg.c
@@ -218,9 +218,10 @@ static void vb2_dma_sg_finish(void *buf_priv)
 	 * DMABUF exporter will flush the cache for us; only USERPTR
 	 * and MMAP buffers with non-coherent memory will be flushed.
 	 */
-	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT)
+	if (buf->dma_attrs & DMA_ATTR_NON_CONSISTENT &&
+	    buf->dma_dir == DMA_FROM_DEVICE)
 		dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
-				    buf->dma_dir);
+				    DMA_FROM_DEVICE);
 }
 
 static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 17/18] docs-rst: Document precise V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (15 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 16/18] vb2: Do sync plane cache only for CAPTURE buffers in finish memop Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-05-08 15:03 ` [RFC v4 18/18] v4l: Use non-consistent DMA mappings for hardware that deserves it Sakari Ailus
  2017-06-07 17:13 ` [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Mauro Carvalho Chehab
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Document when should the user specify V4L2_BUF_FLAG_NO_CACHE_SYNC flag.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 Documentation/media/uapi/v4l/buffer.rst | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
index 9eb42bd..e1f93dd 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -570,6 +570,27 @@ Buffer Flags
 	then read by another one, in which case the flag should be set in both
 	:ref:`VIDIOC_QBUF` and :ref:`VIDIOC_DQBUF` ioctls. This flag has no
 	effect on some devices / architectures.
+
+	More specifically, this flag causes cache synchronisation to
+	be skipped for OUTPUT buffers when the buffer is queued. The
+	flag can be used if the buffer has been previously written to
+	by hardware but has not been written to by the CPU.
+
+	Additionally, if this flag is specified for a CAPTURE buffer
+	when it is queued, cache synchronisation is skipped. This
+	signals that the application can guarantee that it has not
+	written to the buffer memory since it was last dequeued from
+	the device.
+
+	Specifying this flag for a CAPTURE buffer when
+	dequeueing a buffer will skip cache maintenance for the buffer
+	memory. An application may not access the buffer memory in
+	that case but it may well be passed onwards to another device
+	in the system.
+
+	Specifying this flag has no effect when dequeuing an OUTPUT
+	buffer.
+
     * .. _`V4L2-BUF-FLAG-LAST`:
 
       - ``V4L2_BUF_FLAG_LAST``
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [RFC v4 18/18] v4l: Use non-consistent DMA mappings for hardware that deserves it
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (16 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 17/18] docs-rst: Document precise V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour Sakari Ailus
@ 2017-05-08 15:03 ` Sakari Ailus
  2017-06-07 17:13 ` [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Mauro Carvalho Chehab
  18 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-08 15:03 UTC (permalink / raw)
  To: linux-media
  Cc: dri-devel, posciak, m.szyprowski, kyungmin.park, hverkuil,
	sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Set DMA_ATTR_NON_CONSISTENT for hardware that uses non-consistent memory.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
---
 drivers/media/platform/omap3isp/ispvideo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/platform/omap3isp/ispvideo.c b/drivers/media/platform/omap3isp/ispvideo.c
index 218e6d7..b74444e 100644
--- a/drivers/media/platform/omap3isp/ispvideo.c
+++ b/drivers/media/platform/omap3isp/ispvideo.c
@@ -1333,6 +1333,7 @@ static int isp_video_open(struct file *file)
 	queue->buf_struct_size = sizeof(struct isp_buffer);
 	queue->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
 	queue->dev = video->isp->dev;
+	queue->dma_attrs = DMA_ATTR_NON_CONSISTENT;
 
 	ret = vb2_queue_init(&handle->queue);
 	if (ret < 0) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field
  2017-05-08 15:03 ` [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field Sakari Ailus
@ 2017-05-10 10:13     ` Tomasz Figa
  0 siblings, 0 replies; 33+ messages in thread
From: Tomasz Figa @ 2017-05-10 10:13 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-media, dri-devel, posciak, Marek Szyprowski, Kyungmin Park,
	Hans Verkuil, sumit.semwal, Rob Clark, Daniel Vetter, labbott,
	Laurent Pinchart

Hi Sakari,

Some comments inline.

On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
> The struct vb2_dc_buf contains two struct sg_table fields: sgt_base and
> dma_sgt. The former is used by DMA-BUF buffers whereas the latter is used
> by USERPTR.
>
> Unify the two, leaving dma_sgt.
>
> MMAP buffers do not need cache flushing since they have been allocated
> using dma_alloc_coherent().
>
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-contig.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index a8a46a8..ddbbcf0 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -31,12 +31,13 @@ struct vb2_dc_buf {
>         unsigned long                   attrs;
>         enum dma_data_direction         dma_dir;
>         struct sg_table                 *dma_sgt;
> -       struct frame_vector             *vec;
>
>         /* MMAP related */
>         struct vb2_vmarea_handler       handler;
>         refcount_t                      refcount;
> -       struct sg_table                 *sgt_base;
> +
> +       /* USERPTR related */
> +       struct frame_vector             *vec;
>
>         /* DMABUF related */
>         struct dma_buf_attachment       *db_attach;
> @@ -96,7 +97,7 @@ static void vb2_dc_prepare(void *buf_priv)
>         struct sg_table *sgt = buf->dma_sgt;
>
>         /* DMABUF exporter will flush the cache for us */
> -       if (!sgt || buf->db_attach)
> +       if (!buf->vec)

While at it, can we change the comment above to actually refer to what
this condition is checking? Maybe it's just me, but it's very
confusing, as the condition is actually (!USERPTR), while the comment
mentions DMABUF alone and not even mentioning about MMAP. Maybe we
could have something like this:

/*
 * Only USERPTR needs cache maintenance. DMABUF exporter will flush
 * the cache for us, while MMAP buffers are coherent by design.
 */

I guess it could be done as a separate patch after this series,
especially considering the message might actually change, since we are
going to allow cached MMAP buffers.

>                 return;
>
>         dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> @@ -109,7 +110,7 @@ static void vb2_dc_finish(void *buf_priv)
>         struct sg_table *sgt = buf->dma_sgt;
>
>         /* DMABUF exporter will flush the cache for us */

Here too.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field
@ 2017-05-10 10:13     ` Tomasz Figa
  0 siblings, 0 replies; 33+ messages in thread
From: Tomasz Figa @ 2017-05-10 10:13 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: Daniel Vetter, dri-devel, Hans Verkuil, Kyungmin Park,
	Laurent Pinchart, linux-media, posciak, Marek Szyprowski

Hi Sakari,

Some comments inline.

On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
> The struct vb2_dc_buf contains two struct sg_table fields: sgt_base and
> dma_sgt. The former is used by DMA-BUF buffers whereas the latter is used
> by USERPTR.
>
> Unify the two, leaving dma_sgt.
>
> MMAP buffers do not need cache flushing since they have been allocated
> using dma_alloc_coherent().
>
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-contig.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index a8a46a8..ddbbcf0 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -31,12 +31,13 @@ struct vb2_dc_buf {
>         unsigned long                   attrs;
>         enum dma_data_direction         dma_dir;
>         struct sg_table                 *dma_sgt;
> -       struct frame_vector             *vec;
>
>         /* MMAP related */
>         struct vb2_vmarea_handler       handler;
>         refcount_t                      refcount;
> -       struct sg_table                 *sgt_base;
> +
> +       /* USERPTR related */
> +       struct frame_vector             *vec;
>
>         /* DMABUF related */
>         struct dma_buf_attachment       *db_attach;
> @@ -96,7 +97,7 @@ static void vb2_dc_prepare(void *buf_priv)
>         struct sg_table *sgt = buf->dma_sgt;
>
>         /* DMABUF exporter will flush the cache for us */
> -       if (!sgt || buf->db_attach)
> +       if (!buf->vec)

While at it, can we change the comment above to actually refer to what
this condition is checking? Maybe it's just me, but it's very
confusing, as the condition is actually (!USERPTR), while the comment
mentions DMABUF alone and not even mentioning about MMAP. Maybe we
could have something like this:

/*
 * Only USERPTR needs cache maintenance. DMABUF exporter will flush
 * the cache for us, while MMAP buffers are coherent by design.
 */

I guess it could be done as a separate patch after this series,
especially considering the message might actually change, since we are
going to allow cached MMAP buffers.

>                 return;
>
>         dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> @@ -109,7 +110,7 @@ static void vb2_dc_finish(void *buf_priv)
>         struct sg_table *sgt = buf->dma_sgt;
>
>         /* DMABUF exporter will flush the cache for us */

Here too.

Best regards,
Tomasz
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management
  2017-05-08 15:03   ` Sakari Ailus
  (?)
@ 2017-05-10 10:32   ` Tomasz Figa
  2017-05-23 12:07       ` Sakari Ailus
  -1 siblings, 1 reply; 33+ messages in thread
From: Tomasz Figa @ 2017-05-10 10:32 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-media, dri-devel, posciak, Marek Szyprowski, Kyungmin Park,
	Hans Verkuil, sumit.semwal, Rob Clark, Daniel Vetter, labbott,
	Laurent Pinchart

Hi Sakari,

Some comments inline.

On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
> Patch ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA
> attrs") added support for driver specific DMA attributes to
> videobuf2-dma-contig but it had several issues in it.
>
> In particular,
>
> - cache operations were only performed on USERPTR buffers,
>
> - DMA attributes were set only for MMAP buffers and
>
> - it did not provide begin_cpu_access() and end_cpu_access() dma_buf_ops
>   callbacks for cache syncronisation on exported MMAP buffers.
>
> This patch corrects these issues.

I think the above are not really issues for the use cases the original
commit added the support for, i.e. disabling kernel mapping. There was
no intention of allowing cached MMAP buffers. Although I guess the
code added by it was not strict enough and didn't check the flags for
allowed ones.

>
> Also arrange the header files alphabetically.
>
> Fixes: ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA attrs")
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> ---
>  drivers/media/v4l2-core/videobuf2-dma-contig.c | 94 ++++++++++++++++++++------
>  1 file changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index 0afc3da..8b0298a 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -11,12 +11,12 @@
>   */
>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/module.h>
>  #include <linux/refcount.h>
>  #include <linux/scatterlist.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
> -#include <linux/dma-mapping.h>
>
>  #include <media/videobuf2-v4l2.h>
>  #include <media/videobuf2-dma-contig.h>
> @@ -97,12 +97,13 @@ static void vb2_dc_prepare(void *buf_priv)
>         struct vb2_dc_buf *buf = buf_priv;
>         struct sg_table *sgt = buf->dma_sgt;
>
> -       /* DMABUF exporter will flush the cache for us */

Ah, this annoying comment goes away here! Thanks and sorry for the
noise in previous patch.

> -       if (!buf->vec)
> -               return;
> -
> -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> -                              buf->dma_dir);
> +       /*
> +        * DMABUF exporter will flush the cache for us; only USERPTR
> +        * and MMAP buffers with non-coherent memory will be flushed.
> +        */

Sounds much better.

> +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)

Hmm, we change it for USERPTR to rely on presence of
DMA_ATTR_NON_CONSISTENT in buf->attrs, but I don't see it being set
for such anywhere by previous patches.

> +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> +                                      buf->dma_dir);
>  }
>
>  static void vb2_dc_finish(void *buf_priv)
> @@ -110,11 +111,13 @@ static void vb2_dc_finish(void *buf_priv)
>         struct vb2_dc_buf *buf = buf_priv;
>         struct sg_table *sgt = buf->dma_sgt;
>
> -       /* DMABUF exporter will flush the cache for us */
> -       if (!buf->vec)
> -               return;
> -
> -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
> +       /*
> +        * DMABUF exporter will flush the cache for us; only USERPTR
> +        * and MMAP buffers with non-coherent memory will be flushed.
> +        */
> +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)

Ditto.

> +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
> +                                   buf->dma_dir);
>  }
>
>  /*********************************************/
> @@ -142,6 +145,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
>                           gfp_t gfp_flags)
>  {
>         struct vb2_dc_buf *buf;
> +       int ret;
>
>         if (WARN_ON(!dev))
>                 return ERR_PTR(-EINVAL);
> @@ -152,9 +156,9 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
>
>         buf->attrs = attrs;
>         buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
> -                                       GFP_KERNEL | gfp_flags, buf->attrs);
> +                                     GFP_KERNEL | gfp_flags, buf->attrs);

White space change, not sure if intended.

>         if (!buf->cookie) {
> -               dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
> +               dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
>                 kfree(buf);
>                 return ERR_PTR(-ENOMEM);
>         }
> @@ -167,6 +171,16 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
>         buf->size = size;
>         buf->dma_dir = dma_dir;
>
> +       ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt, buf->cookie,
> +                                   buf->dma_addr, buf->size, buf->attrs);
> +       if (ret < 0) {
> +               dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
> +                              buf->attrs);
> +               put_device(dev);
> +               return ERR_PTR(-ENOMEM);
> +       }

Is this okay to always get this sgtable even for coherent MMAP
buffers? It will add some unnecessary memory overhead, which could be
simply avoided by checking for DMA_ATTR_NON_CONSISTENT in buf->attrs.

> +
> +       buf->dma_sgt = &buf->__dma_sgt;
>         buf->handler.refcount = &buf->refcount;
>         buf->handler.put = vb2_dc_put;
>         buf->handler.arg = buf;
> @@ -339,6 +353,40 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum)
>         return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
>  }
>
> +static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> +                                             enum dma_data_direction direction)
> +{
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       /*
> +        * DMABUF exporter will flush the cache for us; only USERPTR

Hmm, I think we can't end up in this function for a DMA-buf that was
exported by another exporter, but I might be missing something. In any
case, this comment is a bit confusing, as it's a part of DMABUF
exporter implementation. Also can we even export USERPTR buffers?

> +        * and MMAP buffers with non-coherent memory will be flushed.
> +        */
> +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)

Ditto.

> +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
> +                                   buf->dma_dir);
> +
> +       return 0;
> +}
> +
> +static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> +                                           enum dma_data_direction direction)
> +{
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       /*
> +        * DMABUF exporter will flush the cache for us; only USERPTR
> +        * and MMAP buffers with non-coherent memory will be flushed.
> +        */

Ditto.

> +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)

Ditto.

> +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
> +                                      buf->dma_dir);
> +
> +       return 0;
> +}
> +
>  static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
>  {
>         struct vb2_dc_buf *buf = dbuf->priv;
> @@ -359,6 +407,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
>         .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
>         .kmap = vb2_dc_dmabuf_ops_kmap,
>         .kmap_atomic = vb2_dc_dmabuf_ops_kmap,
> +       .begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
> +       .end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
>         .vmap = vb2_dc_dmabuf_ops_vmap,
>         .mmap = vb2_dc_dmabuf_ops_mmap,
>         .release = vb2_dc_dmabuf_ops_release,
> @@ -412,11 +462,12 @@ static void vb2_dc_put_userptr(void *buf_priv)
>
>         if (sgt) {
>                 /*
> -                * No need to sync to CPU, it's already synced to the CPU
> -                * since the finish() memop will have been called before this.
> +                * Don't ask to skip cache sync in case if the user
> +                * did ask to skip cache flush the last time the
> +                * buffer was dequeued.
>                  */
>                 dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
> -                                  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> +                                  buf->dma_dir, buf->attrs);

Hmm, what is "the user"?

Regardless of that, we need to set DMA_ATTR_SKIP_CPU_SYNC by default
for USERPTR to keep current behavior, but I don't see code doing that
anywhere in this or previous patches.

>                 pages = frame_vector_pages(buf->vec);
>                 /* sgt should exist only if vector contains pages... */
>                 BUG_ON(IS_ERR(pages));
> @@ -491,6 +542,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
>
>         buf->dev = dev;
>         buf->dma_dir = dma_dir;
> +       buf->attrs = attrs;
>
>         offset = vaddr & ~PAGE_MASK;
>         vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
> @@ -526,13 +578,11 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
>         buf->dma_sgt = &buf->__dma_sgt;
>
>         /*
> -        * No need to sync to the device, this will happen later when the
> -        * prepare() memop is called.
> +        * Sync the cache now; the user might not ever ask for it.

Ditto.

>          */
>         buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
>                                                buf->dma_sgt->orig_nents,
> -                                              buf->dma_dir,
> -                                              DMA_ATTR_SKIP_CPU_SYNC);
> +                                              buf->dma_dir, buf->attrs);
>         if (buf->dma_sgt->nents <= 0) {
>                 pr_err("failed to map scatterlist\n");
>                 ret = -EIO;
> @@ -556,7 +606,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
>  fail_map_sg:
>         dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
>                            buf->dma_sgt->orig_nents, buf->dma_dir,
> -                          DMA_ATTR_SKIP_CPU_SYNC);
> +                          buf->attrs);

Ditto.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested
  2017-05-08 15:03 ` [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested Sakari Ailus
@ 2017-05-10 11:00   ` Tomasz Figa
  2017-05-23 12:35     ` Sakari Ailus
  0 siblings, 1 reply; 33+ messages in thread
From: Tomasz Figa @ 2017-05-10 11:00 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-media, dri-devel, posciak, Marek Szyprowski, Kyungmin Park,
	Hans Verkuil, sumit.semwal, Rob Clark, Daniel Vetter, labbott,
	Laurent Pinchart, Samu Onkalo

Hi Sakari,

Few comments inline.

On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
> From: Samu Onkalo <samu.onkalo@intel.com>
>
> The user may request to the driver (vb2) to skip the cache maintenance
> operations in case the buffer does not need cache synchronisation, e.g. in
> cases where the buffer is passed between hardware blocks without it being
> touched by the CPU.
[snip]
> @@ -1199,6 +1236,11 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
>                         dprintk(1, "buffer initialization failed\n");
>                         goto err;
>                 }
> +
> +               /* This is new buffer memory --- always synchronise cache. */
> +               __mem_prepare_planes(vb);
> +       } else if (!no_cache_sync) {
> +               __mem_prepare_planes(vb);

Do we actually need this at all for DMABUF, given that respective
callbacks in both vb2_dc and vb2_sg actually bail out if so?

>         }
>
>         ret = call_vb_qop(vb, buf_prepare, vb);
[snip]
> @@ -568,7 +571,11 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>         }
>
>         ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> -       return ret ? ret : vb2_core_qbuf(q, b->index, b);
> +       if (ret)
> +               return ret;
> +
> +       return vb2_core_qbuf(q, b->index, b,
> +                            b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);

Can we really let the userspace alone control this? I believe there
are drivers that need to do some fixup in OUTPUT buffers before
sending to the hardware or in CAPTURE buffer after getting from the
hardware, respectively in buf_prepare or buf_finish. I believe such
driver needs to be able to override this behavior.

Actually I'm wondering if we really need this buffer flag at all.
Wouldn't the following work for typical use cases that we actually
care about performance of?

buffer_needs_cache_sync = (buffer_type_is_MMAP &&
buffer_is_non_coherent && (buffer_is_mmapped ||
buffer_has_kernel_mapping)) || buffer_is_USERPTR

The above should cover all the fast paths that are used only to
exchange data between devices, without the CPU involved, assuming that
drivers that don't need the fixups I mentioned before are properly
updated to request no kernel mapping for allocated buffers.

I've added (buffer_is_USERPTR) to the equation as it's really hard to
imagine a use case where there is no CPU access to the buffer, but
USERPTR needs to be used (instead of DMABUF). I might be missing
something, though.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management
  2017-05-10 10:32   ` Tomasz Figa
@ 2017-05-23 12:07       ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-23 12:07 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Sakari Ailus, linux-media, dri-devel, posciak, Marek Szyprowski,
	Kyungmin Park, Hans Verkuil, sumit.semwal, Rob Clark,
	Daniel Vetter, labbott, Laurent Pinchart

Hi Tomasz,

Thanks for the review!

Considering where things currently stand, I thing getting the three first
patches merged in the near future would make sense. The rest will need a bit
more work, and adding buffer flags that aren't used wouldn't really make
sense either. See below.

On Wed, May 10, 2017 at 06:32:43PM +0800, Tomasz Figa wrote:
> Hi Sakari,
> 
> Some comments inline.
> 
> On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
> <sakari.ailus@linux.intel.com> wrote:
> > Patch ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA
> > attrs") added support for driver specific DMA attributes to
> > videobuf2-dma-contig but it had several issues in it.
> >
> > In particular,
> >
> > - cache operations were only performed on USERPTR buffers,
> >
> > - DMA attributes were set only for MMAP buffers and
> >
> > - it did not provide begin_cpu_access() and end_cpu_access() dma_buf_ops
> >   callbacks for cache syncronisation on exported MMAP buffers.
> >
> > This patch corrects these issues.
> 
> I think the above are not really issues for the use cases the original
> commit added the support for, i.e. disabling kernel mapping. There was
> no intention of allowing cached MMAP buffers. Although I guess the
> code added by it was not strict enough and didn't check the flags for
> allowed ones.

Now that I've learned about the particularities of ARM DMA API
implementation, it seems that the exact implementation on ARM is well
reflected in videobuf2 mem ops. Videobuf2 isn't specific to ARM however.

> 
> >
> > Also arrange the header files alphabetically.
> >
> > Fixes: ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA attrs")
> > Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> > ---
> >  drivers/media/v4l2-core/videobuf2-dma-contig.c | 94 ++++++++++++++++++++------
> >  1 file changed, 72 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > index 0afc3da..8b0298a 100644
> > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > @@ -11,12 +11,12 @@
> >   */
> >
> >  #include <linux/dma-buf.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/module.h>
> >  #include <linux/refcount.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/sched.h>
> >  #include <linux/slab.h>
> > -#include <linux/dma-mapping.h>
> >
> >  #include <media/videobuf2-v4l2.h>
> >  #include <media/videobuf2-dma-contig.h>
> > @@ -97,12 +97,13 @@ static void vb2_dc_prepare(void *buf_priv)
> >         struct vb2_dc_buf *buf = buf_priv;
> >         struct sg_table *sgt = buf->dma_sgt;
> >
> > -       /* DMABUF exporter will flush the cache for us */
> 
> Ah, this annoying comment goes away here! Thanks and sorry for the
> noise in previous patch.
> 
> > -       if (!buf->vec)
> > -               return;
> > -
> > -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> > -                              buf->dma_dir);
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> 
> Sounds much better.
> 
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Hmm, we change it for USERPTR to rely on presence of
> DMA_ATTR_NON_CONSISTENT in buf->attrs, but I don't see it being set
> for such anywhere by previous patches.

When writing this patch, I assumed that all drivers that are expected to
work with non-coherent memory would specify that flag. They do not.

A workable approach could be to start with specifying that flag on ARM based
on the buffer type. The DMA API is a bit confusing though, as it assumes
that the DMA API implementation will have nop implementations of cache
synchronisation functions. At least the ARM implementation does not, but it
expects the user rather not call these functions to begin with. Probably
because there's no enough infomation available to figure that out.

> 
> > +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> > +                                      buf->dma_dir);
> >  }
> >
> >  static void vb2_dc_finish(void *buf_priv)
> > @@ -110,11 +111,13 @@ static void vb2_dc_finish(void *buf_priv)
> >         struct vb2_dc_buf *buf = buf_priv;
> >         struct sg_table *sgt = buf->dma_sgt;
> >
> > -       /* DMABUF exporter will flush the cache for us */
> > -       if (!buf->vec)
> > -               return;
> > -
> > -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
> > +                                   buf->dma_dir);
> >  }
> >
> >  /*********************************************/
> > @@ -142,6 +145,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >                           gfp_t gfp_flags)
> >  {
> >         struct vb2_dc_buf *buf;
> > +       int ret;
> >
> >         if (WARN_ON(!dev))
> >                 return ERR_PTR(-EINVAL);
> > @@ -152,9 +156,9 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >
> >         buf->attrs = attrs;
> >         buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
> > -                                       GFP_KERNEL | gfp_flags, buf->attrs);
> > +                                     GFP_KERNEL | gfp_flags, buf->attrs);
> 
> White space change, not sure if intended.

Well, partially. A whitespace fix a day keeps indent(1) away? :-)

> 
> >         if (!buf->cookie) {
> > -               dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
> > +               dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
> >                 kfree(buf);
> >                 return ERR_PTR(-ENOMEM);
> >         }
> > @@ -167,6 +171,16 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >         buf->size = size;
> >         buf->dma_dir = dma_dir;
> >
> > +       ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt, buf->cookie,
> > +                                   buf->dma_addr, buf->size, buf->attrs);
> > +       if (ret < 0) {
> > +               dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
> > +                              buf->attrs);
> > +               put_device(dev);
> > +               return ERR_PTR(-ENOMEM);
> > +       }
> 
> Is this okay to always get this sgtable even for coherent MMAP
> buffers? It will add some unnecessary memory overhead, which could be
> simply avoided by checking for DMA_ATTR_NON_CONSISTENT in buf->attrs.

Good point. I guess this can be dropped if DMA_ATTR_NON_CONSISTENT is not
set (assuming it is set for devices that need it).

As per DMA API documentation, explicit cache synchronisation is needed but I
presume the appropriate change would be to change the documentation.

> 
> > +
> > +       buf->dma_sgt = &buf->__dma_sgt;
> >         buf->handler.refcount = &buf->refcount;
> >         buf->handler.put = vb2_dc_put;
> >         buf->handler.arg = buf;
> > @@ -339,6 +353,40 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum)
> >         return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
> >  }
> >
> > +static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> > +                                             enum dma_data_direction direction)
> > +{
> > +       struct vb2_dc_buf *buf = dbuf->priv;
> > +       struct sg_table *sgt = buf->dma_sgt;
> > +
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> 
 > Hmm, I think we can't end up in this function for a DMA-buf that was
> exported by another exporter, but I might be missing something. In any
> case, this comment is a bit confusing, as it's a part of DMABUF
> exporter implementation. Also can we even export USERPTR buffers?

The comment needs to be revised, it has been copied from the finish and
prepare mem ops.

> 
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
> > +                                   buf->dma_dir);
> > +
> > +       return 0;
> > +}
> > +
> > +static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> > +                                           enum dma_data_direction direction)
> > +{
> > +       struct vb2_dc_buf *buf = dbuf->priv;
> > +       struct sg_table *sgt = buf->dma_sgt;
> > +
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> 
> Ditto.
> 
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
> > +                                      buf->dma_dir);
> > +
> > +       return 0;
> > +}
> > +
> >  static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
> >  {
> >         struct vb2_dc_buf *buf = dbuf->priv;
> > @@ -359,6 +407,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
> >         .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
> >         .kmap = vb2_dc_dmabuf_ops_kmap,
> >         .kmap_atomic = vb2_dc_dmabuf_ops_kmap,
> > +       .begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
> > +       .end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
> >         .vmap = vb2_dc_dmabuf_ops_vmap,
> >         .mmap = vb2_dc_dmabuf_ops_mmap,
> >         .release = vb2_dc_dmabuf_ops_release,
> > @@ -412,11 +462,12 @@ static void vb2_dc_put_userptr(void *buf_priv)
> >
> >         if (sgt) {
> >                 /*
> > -                * No need to sync to CPU, it's already synced to the CPU
> > -                * since the finish() memop will have been called before this.
> > +                * Don't ask to skip cache sync in case if the user
> > +                * did ask to skip cache flush the last time the
> > +                * buffer was dequeued.
> >                  */
> >                 dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
> > -                                  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +                                  buf->dma_dir, buf->attrs);
> 
> Hmm, what is "the user"?

The user space. I'll change the comment accordingly.

> 
> Regardless of that, we need to set DMA_ATTR_SKIP_CPU_SYNC by default
> for USERPTR to keep current behavior, but I don't see code doing that
> anywhere in this or previous patches.

Correct. We need to do that on ARM at least.

> 
> >                 pages = frame_vector_pages(buf->vec);
> >                 /* sgt should exist only if vector contains pages... */
> >                 BUG_ON(IS_ERR(pages));
> > @@ -491,6 +542,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >
> >         buf->dev = dev;
> >         buf->dma_dir = dma_dir;
> > +       buf->attrs = attrs;
> >
> >         offset = vaddr & ~PAGE_MASK;
> >         vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
> > @@ -526,13 +578,11 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >         buf->dma_sgt = &buf->__dma_sgt;
> >
> >         /*
> > -        * No need to sync to the device, this will happen later when the
> > -        * prepare() memop is called.
> > +        * Sync the cache now; the user might not ever ask for it.
> 
> Ditto.
> 
> >          */
> >         buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
> >                                                buf->dma_sgt->orig_nents,
> > -                                              buf->dma_dir,
> > -                                              DMA_ATTR_SKIP_CPU_SYNC);
> > +                                              buf->dma_dir, buf->attrs);
> >         if (buf->dma_sgt->nents <= 0) {
> >                 pr_err("failed to map scatterlist\n");
> >                 ret = -EIO;
> > @@ -556,7 +606,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >  fail_map_sg:
> >         dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
> >                            buf->dma_sgt->orig_nents, buf->dma_dir,
> > -                          DMA_ATTR_SKIP_CPU_SYNC);
> > +                          buf->attrs);
> 
> Ditto.
> 

-- 
Regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management
@ 2017-05-23 12:07       ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-23 12:07 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Daniel Vetter, dri-devel, Hans Verkuil, Kyungmin Park,
	Laurent Pinchart, Sakari Ailus, linux-media, posciak,
	Marek Szyprowski

Hi Tomasz,

Thanks for the review!

Considering where things currently stand, I thing getting the three first
patches merged in the near future would make sense. The rest will need a bit
more work, and adding buffer flags that aren't used wouldn't really make
sense either. See below.

On Wed, May 10, 2017 at 06:32:43PM +0800, Tomasz Figa wrote:
> Hi Sakari,
> 
> Some comments inline.
> 
> On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
> <sakari.ailus@linux.intel.com> wrote:
> > Patch ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA
> > attrs") added support for driver specific DMA attributes to
> > videobuf2-dma-contig but it had several issues in it.
> >
> > In particular,
> >
> > - cache operations were only performed on USERPTR buffers,
> >
> > - DMA attributes were set only for MMAP buffers and
> >
> > - it did not provide begin_cpu_access() and end_cpu_access() dma_buf_ops
> >   callbacks for cache syncronisation on exported MMAP buffers.
> >
> > This patch corrects these issues.
> 
> I think the above are not really issues for the use cases the original
> commit added the support for, i.e. disabling kernel mapping. There was
> no intention of allowing cached MMAP buffers. Although I guess the
> code added by it was not strict enough and didn't check the flags for
> allowed ones.

Now that I've learned about the particularities of ARM DMA API
implementation, it seems that the exact implementation on ARM is well
reflected in videobuf2 mem ops. Videobuf2 isn't specific to ARM however.

> 
> >
> > Also arrange the header files alphabetically.
> >
> > Fixes: ccc66e73 ("ARM: 8508/2: videobuf2-dc: Let drivers specify DMA attrs")
> > Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> > ---
> >  drivers/media/v4l2-core/videobuf2-dma-contig.c | 94 ++++++++++++++++++++------
> >  1 file changed, 72 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > index 0afc3da..8b0298a 100644
> > --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> > @@ -11,12 +11,12 @@
> >   */
> >
> >  #include <linux/dma-buf.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/module.h>
> >  #include <linux/refcount.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/sched.h>
> >  #include <linux/slab.h>
> > -#include <linux/dma-mapping.h>
> >
> >  #include <media/videobuf2-v4l2.h>
> >  #include <media/videobuf2-dma-contig.h>
> > @@ -97,12 +97,13 @@ static void vb2_dc_prepare(void *buf_priv)
> >         struct vb2_dc_buf *buf = buf_priv;
> >         struct sg_table *sgt = buf->dma_sgt;
> >
> > -       /* DMABUF exporter will flush the cache for us */
> 
> Ah, this annoying comment goes away here! Thanks and sorry for the
> noise in previous patch.
> 
> > -       if (!buf->vec)
> > -               return;
> > -
> > -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> > -                              buf->dma_dir);
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> 
> Sounds much better.
> 
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Hmm, we change it for USERPTR to rely on presence of
> DMA_ATTR_NON_CONSISTENT in buf->attrs, but I don't see it being set
> for such anywhere by previous patches.

When writing this patch, I assumed that all drivers that are expected to
work with non-coherent memory would specify that flag. They do not.

A workable approach could be to start with specifying that flag on ARM based
on the buffer type. The DMA API is a bit confusing though, as it assumes
that the DMA API implementation will have nop implementations of cache
synchronisation functions. At least the ARM implementation does not, but it
expects the user rather not call these functions to begin with. Probably
because there's no enough infomation available to figure that out.

> 
> > +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
> > +                                      buf->dma_dir);
> >  }
> >
> >  static void vb2_dc_finish(void *buf_priv)
> > @@ -110,11 +111,13 @@ static void vb2_dc_finish(void *buf_priv)
> >         struct vb2_dc_buf *buf = buf_priv;
> >         struct sg_table *sgt = buf->dma_sgt;
> >
> > -       /* DMABUF exporter will flush the cache for us */
> > -       if (!buf->vec)
> > -               return;
> > -
> > -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents,
> > +                                   buf->dma_dir);
> >  }
> >
> >  /*********************************************/
> > @@ -142,6 +145,7 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >                           gfp_t gfp_flags)
> >  {
> >         struct vb2_dc_buf *buf;
> > +       int ret;
> >
> >         if (WARN_ON(!dev))
> >                 return ERR_PTR(-EINVAL);
> > @@ -152,9 +156,9 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >
> >         buf->attrs = attrs;
> >         buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
> > -                                       GFP_KERNEL | gfp_flags, buf->attrs);
> > +                                     GFP_KERNEL | gfp_flags, buf->attrs);
> 
> White space change, not sure if intended.

Well, partially. A whitespace fix a day keeps indent(1) away? :-)

> 
> >         if (!buf->cookie) {
> > -               dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
> > +               dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
> >                 kfree(buf);
> >                 return ERR_PTR(-ENOMEM);
> >         }
> > @@ -167,6 +171,16 @@ static void *vb2_dc_alloc(struct device *dev, unsigned long attrs,
> >         buf->size = size;
> >         buf->dma_dir = dma_dir;
> >
> > +       ret = dma_get_sgtable_attrs(buf->dev, &buf->__dma_sgt, buf->cookie,
> > +                                   buf->dma_addr, buf->size, buf->attrs);
> > +       if (ret < 0) {
> > +               dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
> > +                              buf->attrs);
> > +               put_device(dev);
> > +               return ERR_PTR(-ENOMEM);
> > +       }
> 
> Is this okay to always get this sgtable even for coherent MMAP
> buffers? It will add some unnecessary memory overhead, which could be
> simply avoided by checking for DMA_ATTR_NON_CONSISTENT in buf->attrs.

Good point. I guess this can be dropped if DMA_ATTR_NON_CONSISTENT is not
set (assuming it is set for devices that need it).

As per DMA API documentation, explicit cache synchronisation is needed but I
presume the appropriate change would be to change the documentation.

> 
> > +
> > +       buf->dma_sgt = &buf->__dma_sgt;
> >         buf->handler.refcount = &buf->refcount;
> >         buf->handler.put = vb2_dc_put;
> >         buf->handler.arg = buf;
> > @@ -339,6 +353,40 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, unsigned long pgnum)
> >         return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
> >  }
> >
> > +static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> > +                                             enum dma_data_direction direction)
> > +{
> > +       struct vb2_dc_buf *buf = dbuf->priv;
> > +       struct sg_table *sgt = buf->dma_sgt;
> > +
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> 
 > Hmm, I think we can't end up in this function for a DMA-buf that was
> exported by another exporter, but I might be missing something. In any
> case, this comment is a bit confusing, as it's a part of DMABUF
> exporter implementation. Also can we even export USERPTR buffers?

The comment needs to be revised, it has been copied from the finish and
prepare mem ops.

> 
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents,
> > +                                   buf->dma_dir);
> > +
> > +       return 0;
> > +}
> > +
> > +static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> > +                                           enum dma_data_direction direction)
> > +{
> > +       struct vb2_dc_buf *buf = dbuf->priv;
> > +       struct sg_table *sgt = buf->dma_sgt;
> > +
> > +       /*
> > +        * DMABUF exporter will flush the cache for us; only USERPTR
> > +        * and MMAP buffers with non-coherent memory will be flushed.
> > +        */
> 
> Ditto.
> 
> > +       if (buf->attrs & DMA_ATTR_NON_CONSISTENT)
> 
> Ditto.
> 
> > +               dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents,
> > +                                      buf->dma_dir);
> > +
> > +       return 0;
> > +}
> > +
> >  static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
> >  {
> >         struct vb2_dc_buf *buf = dbuf->priv;
> > @@ -359,6 +407,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
> >         .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
> >         .kmap = vb2_dc_dmabuf_ops_kmap,
> >         .kmap_atomic = vb2_dc_dmabuf_ops_kmap,
> > +       .begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
> > +       .end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
> >         .vmap = vb2_dc_dmabuf_ops_vmap,
> >         .mmap = vb2_dc_dmabuf_ops_mmap,
> >         .release = vb2_dc_dmabuf_ops_release,
> > @@ -412,11 +462,12 @@ static void vb2_dc_put_userptr(void *buf_priv)
> >
> >         if (sgt) {
> >                 /*
> > -                * No need to sync to CPU, it's already synced to the CPU
> > -                * since the finish() memop will have been called before this.
> > +                * Don't ask to skip cache sync in case if the user
> > +                * did ask to skip cache flush the last time the
> > +                * buffer was dequeued.
> >                  */
> >                 dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
> > -                                  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +                                  buf->dma_dir, buf->attrs);
> 
> Hmm, what is "the user"?

The user space. I'll change the comment accordingly.

> 
> Regardless of that, we need to set DMA_ATTR_SKIP_CPU_SYNC by default
> for USERPTR to keep current behavior, but I don't see code doing that
> anywhere in this or previous patches.

Correct. We need to do that on ARM at least.

> 
> >                 pages = frame_vector_pages(buf->vec);
> >                 /* sgt should exist only if vector contains pages... */
> >                 BUG_ON(IS_ERR(pages));
> > @@ -491,6 +542,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >
> >         buf->dev = dev;
> >         buf->dma_dir = dma_dir;
> > +       buf->attrs = attrs;
> >
> >         offset = vaddr & ~PAGE_MASK;
> >         vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
> > @@ -526,13 +578,11 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >         buf->dma_sgt = &buf->__dma_sgt;
> >
> >         /*
> > -        * No need to sync to the device, this will happen later when the
> > -        * prepare() memop is called.
> > +        * Sync the cache now; the user might not ever ask for it.
> 
> Ditto.
> 
> >          */
> >         buf->dma_sgt->nents = dma_map_sg_attrs(buf->dev, buf->dma_sgt->sgl,
> >                                                buf->dma_sgt->orig_nents,
> > -                                              buf->dma_dir,
> > -                                              DMA_ATTR_SKIP_CPU_SYNC);
> > +                                              buf->dma_dir, buf->attrs);
> >         if (buf->dma_sgt->nents <= 0) {
> >                 pr_err("failed to map scatterlist\n");
> >                 ret = -EIO;
> > @@ -556,7 +606,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> >  fail_map_sg:
> >         dma_unmap_sg_attrs(buf->dev, buf->dma_sgt->sgl,
> >                            buf->dma_sgt->orig_nents, buf->dma_dir,
> > -                          DMA_ATTR_SKIP_CPU_SYNC);
> > +                          buf->attrs);
> 
> Ditto.
> 

-- 
Regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested
  2017-05-10 11:00   ` Tomasz Figa
@ 2017-05-23 12:35     ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-05-23 12:35 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Sakari Ailus, linux-media, dri-devel, posciak, Marek Szyprowski,
	Kyungmin Park, Hans Verkuil, sumit.semwal, Rob Clark,
	Daniel Vetter, labbott, Laurent Pinchart, Samu Onkalo

Hi Tomasz,

On Wed, May 10, 2017 at 07:00:10PM +0800, Tomasz Figa wrote:
> Hi Sakari,
> 
> Few comments inline.
> 
> On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
> <sakari.ailus@linux.intel.com> wrote:
> > From: Samu Onkalo <samu.onkalo@intel.com>
> >
> > The user may request to the driver (vb2) to skip the cache maintenance
> > operations in case the buffer does not need cache synchronisation, e.g. in
> > cases where the buffer is passed between hardware blocks without it being
> > touched by the CPU.
> [snip]
> > @@ -1199,6 +1236,11 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
> >                         dprintk(1, "buffer initialization failed\n");
> >                         goto err;
> >                 }
> > +
> > +               /* This is new buffer memory --- always synchronise cache. */
> > +               __mem_prepare_planes(vb);
> > +       } else if (!no_cache_sync) {
> > +               __mem_prepare_planes(vb);
> 
> Do we actually need this at all for DMABUF, given that respective
> callbacks in both vb2_dc and vb2_sg actually bail out if so?

I think the original purpose for the finish and prepare might have allowed
more than just cache synchronisation but that's all they've ever done. I
think the documentation should be changed to reflect this, and then we could
drop the call here.

There should be no need for this anyway.

> 
> >         }
> >
> >         ret = call_vb_qop(vb, buf_prepare, vb);
> [snip]
> > @@ -568,7 +571,11 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
> >         }
> >
> >         ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> > -       return ret ? ret : vb2_core_qbuf(q, b->index, b);
> > +       if (ret)
> > +               return ret;
> > +
> > +       return vb2_core_qbuf(q, b->index, b,
> > +                            b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);
> 
> Can we really let the userspace alone control this? I believe there
> are drivers that need to do some fixup in OUTPUT buffers before
> sending to the hardware or in CAPTURE buffer after getting from the
> hardware, respectively in buf_prepare or buf_finish. I believe such
> driver needs to be able to override this behavior.

Good point.

> 
> Actually I'm wondering if we really need this buffer flag at all.
> Wouldn't the following work for typical use cases that we actually
> care about performance of?
> 
> buffer_needs_cache_sync = (buffer_type_is_MMAP &&
> buffer_is_non_coherent && (buffer_is_mmapped ||
> buffer_has_kernel_mapping)) || buffer_is_USERPTR

Not in general case. The information the driver does not have currently is
whether or not the user has accessed the buffer (written to it) at various
points. I don't think there's another way to handle this than let the user
tell this to the kernel.

Even now, V4L2 does not require the application not to write to CAPTURE
buffers which means that cache synchronisation operations will need to be
used even when queueing such a buffer. This is where the flag helps, too.

DMA-BUFs are a different matter, this is not really addressed by the
patchset.

> 
> The above should cover all the fast paths that are used only to
> exchange data between devices, without the CPU involved, assuming that
> drivers that don't need the fixups I mentioned before are properly
> updated to request no kernel mapping for allocated buffers.
> 
> I've added (buffer_is_USERPTR) to the equation as it's really hard to
> imagine a use case where there is no CPU access to the buffer, but
> USERPTR needs to be used (instead of DMABUF). I might be missing
> something, though.

If you have a USERPTR buffer the backed memory of which you use with two
devices and don't touch the buffer memory, no cache synchronisation will be
needed.

DMA-BUF would benefit from improvements in the same area.

-- 
Regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency
  2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
                   ` (17 preceding siblings ...)
  2017-05-08 15:03 ` [RFC v4 18/18] v4l: Use non-consistent DMA mappings for hardware that deserves it Sakari Ailus
@ 2017-06-07 17:13 ` Mauro Carvalho Chehab
  2017-06-07 20:56   ` Sakari Ailus
  18 siblings, 1 reply; 33+ messages in thread
From: Mauro Carvalho Chehab @ 2017-06-07 17:13 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-media, dri-devel, posciak, m.szyprowski, kyungmin.park,
	hverkuil, sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Hi Sakari,

Em Mon,  8 May 2017 18:03:12 +0300
Sakari Ailus <sakari.ailus@linux.intel.com> escreveu:

> Hello,
> 
> This is a rebased and partially reworked version of the vb2 cache hints
> support patch series posted by first myself, then Laurent and then myself
> again.
> 
> I'm still posting this as RFC primarily because more testing and driver
> changes will be needed. In particular, a lot of platform drivers assume
> non-coherent memory but are not properly labelled as such.

The main issue I see is that, if the driver doesn't "annotate" if it
is requiring coherent or non-coherent memory, VB2 should be preserving
its old behavior, as, otherwise, it will risk causing regressions. 

So, perhaps instead of creating a single flag
V4L2_BUF_FLAG_NO_CACHE_SYNC, it would make sense to create two
flags, using either one of them on newer drivers. For old drivers
that won't set either, it would keep the current behavior.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency
  2017-06-07 17:13 ` [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Mauro Carvalho Chehab
@ 2017-06-07 20:56   ` Sakari Ailus
  0 siblings, 0 replies; 33+ messages in thread
From: Sakari Ailus @ 2017-06-07 20:56 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, dri-devel, posciak, m.szyprowski, kyungmin.park,
	hverkuil, sumit.semwal, robdclark, daniel.vetter, labbott,
	laurent.pinchart

Hi Mauro,

Mauro Carvalho Chehab wrote:
> Hi Sakari,
>
> Em Mon,  8 May 2017 18:03:12 +0300
> Sakari Ailus <sakari.ailus@linux.intel.com> escreveu:
>
>> Hello,
>>
>> This is a rebased and partially reworked version of the vb2 cache hints
>> support patch series posted by first myself, then Laurent and then myself
>> again.
>>
>> I'm still posting this as RFC primarily because more testing and driver
>> changes will be needed. In particular, a lot of platform drivers assume
>> non-coherent memory but are not properly labelled as such.
>
> The main issue I see is that, if the driver doesn't "annotate" if it
> is requiring coherent or non-coherent memory, VB2 should be preserving
> its old behavior, as, otherwise, it will risk causing regressions.

Some of the assumptions in VB2 mirror the particular design choices made 
in ARM DMA API implementation. This was found out during the review. The 
set requires further work in order to be mergeable to get around these 
issues, until then this remains in RFC stage.

I posted the three first patches separately --- these do not change how 
cache management works.

-- 
Kind regards,

Sakari Ailus
sakari.ailus@linux.intel.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler
  2017-05-08 15:03 ` [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler Sakari Ailus
@ 2018-10-05  4:34     ` Tomasz Figa
  0 siblings, 0 replies; 33+ messages in thread
From: Tomasz Figa @ 2018-10-05  4:34 UTC (permalink / raw)
  To: Sakari Ailus, Hans Verkuil
  Cc: Linux Media Mailing List, dri-devel, Pawel Osciak,
	Marek Szyprowski, Kyungmin Park, Sumit Semwal, Rob Clark,
	Daniel Vetter, labbott, Laurent Pinchart

Hi Sakari, Hans,

On Tue, May 9, 2017 at 12:05 AM Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
>
> The cache synchronisation may be a time consuming operation and thus not
> best performed in an interrupt which is a typical context for
> vb2_buffer_done() calls. This may consume up to tens of ms on some
> machines, depending on the buffer size.
>
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/v4l2-core/videobuf2-core.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index 8bf3369..e866115 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -889,7 +889,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>  {
>         struct vb2_queue *q = vb->vb2_queue;
>         unsigned long flags;
> -       unsigned int plane;
>
>         if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
>                 return;
> @@ -910,10 +909,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>         dprintk(4, "done processing on buffer %d, state: %d\n",
>                         vb->index, state);
>
> -       /* sync buffers */
> -       for (plane = 0; plane < vb->num_planes; ++plane)
> -               call_void_memop(vb, finish, vb->planes[plane].mem_priv);
> -
>         spin_lock_irqsave(&q->done_lock, flags);
>         if (state == VB2_BUF_STATE_QUEUED ||
>             state == VB2_BUF_STATE_REQUEUEING) {
> @@ -1573,6 +1568,10 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)
>
>         vb->state = VB2_BUF_STATE_DEQUEUED;
>
> +       /* sync buffers */
> +       for (i = 0; i < vb->num_planes; ++i)
> +               call_void_memop(vb, finish, vb->planes[i].mem_priv);
> +

Sorry for digging up this old patch. Posting for reference, in case
someone decides to use or take over this patch.

This piece of code seems to be executed after queue's .buf_finish()
callback. The latter is allowed to access the buffer contents, so it
looks like we're breaking it, because it would now access the buffer
before the cache is synchronized.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler
@ 2018-10-05  4:34     ` Tomasz Figa
  0 siblings, 0 replies; 33+ messages in thread
From: Tomasz Figa @ 2018-10-05  4:34 UTC (permalink / raw)
  To: Sakari Ailus, Hans Verkuil
  Cc: Daniel Vetter, dri-devel, Kyungmin Park, Laurent Pinchart,
	Linux Media Mailing List, Pawel Osciak, Marek Szyprowski

Hi Sakari, Hans,

On Tue, May 9, 2017 at 12:05 AM Sakari Ailus
<sakari.ailus@linux.intel.com> wrote:
>
> The cache synchronisation may be a time consuming operation and thus not
> best performed in an interrupt which is a typical context for
> vb2_buffer_done() calls. This may consume up to tens of ms on some
> machines, depending on the buffer size.
>
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
> ---
>  drivers/media/v4l2-core/videobuf2-core.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index 8bf3369..e866115 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -889,7 +889,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>  {
>         struct vb2_queue *q = vb->vb2_queue;
>         unsigned long flags;
> -       unsigned int plane;
>
>         if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
>                 return;
> @@ -910,10 +909,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>         dprintk(4, "done processing on buffer %d, state: %d\n",
>                         vb->index, state);
>
> -       /* sync buffers */
> -       for (plane = 0; plane < vb->num_planes; ++plane)
> -               call_void_memop(vb, finish, vb->planes[plane].mem_priv);
> -
>         spin_lock_irqsave(&q->done_lock, flags);
>         if (state == VB2_BUF_STATE_QUEUED ||
>             state == VB2_BUF_STATE_REQUEUEING) {
> @@ -1573,6 +1568,10 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)
>
>         vb->state = VB2_BUF_STATE_DEQUEUED;
>
> +       /* sync buffers */
> +       for (i = 0; i < vb->num_planes; ++i)
> +               call_void_memop(vb, finish, vb->planes[i].mem_priv);
> +

Sorry for digging up this old patch. Posting for reference, in case
someone decides to use or take over this patch.

This piece of code seems to be executed after queue's .buf_finish()
callback. The latter is allowed to access the buffer contents, so it
looks like we're breaking it, because it would now access the buffer
before the cache is synchronized.

Best regards,
Tomasz
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2018-10-05 11:31 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-08 15:03 [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Sakari Ailus
2017-05-08 15:03 ` [RFC v4 01/18] vb2: Rename confusingly named internal buffer preparation functions Sakari Ailus
2017-05-08 15:03 ` [RFC v4 02/18] vb2: Move buffer cache synchronisation to prepare from queue Sakari Ailus
2017-05-08 15:03   ` Sakari Ailus
2017-05-08 15:03 ` [RFC v4 03/18] vb2: Move cache synchronisation from buffer done to dqbuf handler Sakari Ailus
2018-10-05  4:34   ` Tomasz Figa
2018-10-05  4:34     ` Tomasz Figa
2017-05-08 15:03 ` [RFC v4 04/18] v4l: Unify cache management hint buffer flags Sakari Ailus
2017-05-08 15:03 ` [RFC v4 05/18] vb2: Anticipate queue specific DMA attributes for USERPTR buffers Sakari Ailus
2017-05-08 15:03 ` [RFC v4 06/18] vb2: dma-contig: Assign DMA attrs for a buffer unconditionally Sakari Ailus
2017-05-08 15:03 ` [RFC v4 07/18] vb2: dma-contig: Remove redundant sgt_base field Sakari Ailus
2017-05-10 10:13   ` Tomasz Figa
2017-05-10 10:13     ` Tomasz Figa
2017-05-08 15:03 ` [RFC v4 08/18] vb2: dma-contig: Don't warn on failure in obtaining scatterlist Sakari Ailus
2017-05-08 15:03   ` Sakari Ailus
2017-05-08 15:03 ` [RFC v4 09/18] vb2: dma-contig: Allocate sgt as part of struct vb2_dc_buf Sakari Ailus
2017-05-08 15:03 ` [RFC v4 10/18] vb2: dma-contig: Fix DMA attribute and cache management Sakari Ailus
2017-05-08 15:03   ` Sakari Ailus
2017-05-10 10:32   ` Tomasz Figa
2017-05-23 12:07     ` Sakari Ailus
2017-05-23 12:07       ` Sakari Ailus
2017-05-08 15:03 ` [RFC v4 11/18] vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs Sakari Ailus
2017-05-08 15:03 ` [RFC v4 12/18] vb2: dma-sg: Let drivers decide DMA attrs of MMAP and USERPTR bufs Sakari Ailus
2017-05-08 15:03 ` [RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested Sakari Ailus
2017-05-10 11:00   ` Tomasz Figa
2017-05-23 12:35     ` Sakari Ailus
2017-05-08 15:03 ` [RFC v4 14/18] vb2: Improve struct vb2_mem_ops documentation; alloc and put are for MMAP Sakari Ailus
2017-05-08 15:03 ` [RFC v4 15/18] vb2: Dma direction is always DMA_TO_DEVICE in buffer preparation Sakari Ailus
2017-05-08 15:03 ` [RFC v4 16/18] vb2: Do sync plane cache only for CAPTURE buffers in finish memop Sakari Ailus
2017-05-08 15:03 ` [RFC v4 17/18] docs-rst: Document precise V4L2_BUF_FLAG_NO_CACHE_SYNC flag behaviour Sakari Ailus
2017-05-08 15:03 ` [RFC v4 18/18] v4l: Use non-consistent DMA mappings for hardware that deserves it Sakari Ailus
2017-06-07 17:13 ` [RFC v4 00/18] vb2: Handle user cache hints, allow drivers to choose cache coherency Mauro Carvalho Chehab
2017-06-07 20:56   ` Sakari Ailus

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.