All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] [media] vb2: defer part of vb2_buffer_done() and move dma-buf unmap from DQBUF
@ 2016-08-17 18:28 Javier Martinez Canillas
  2016-08-17 18:28 ` [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue Javier Martinez Canillas
  2016-08-17 18:28 ` [RFC PATCH 2/2] [media] vb2: move dma-buf unmap from __vb2_dqbuf() to vb2_done_work() Javier Martinez Canillas
  0 siblings, 2 replies; 5+ messages in thread
From: Javier Martinez Canillas @ 2016-08-17 18:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hans Verkuil, Sakari Ailus, Javier Martinez Canillas,
	Mauro Carvalho Chehab, Marek Szyprowski, Kyungmin Park,
	Pawel Osciak, linux-media

Hello,

This patch series attempt to do the dma-buf unmap as soon as possible, once
the driver has finished using the buffer. Instead of waiting until DQBUF to
do the unmap.

Patch #1 splits vb2_buffer_done() and moves part of its logic to a workqueue
to avoid calling the vb2 .finish mem ops. Since doing a buffer sync can take
a lot of time so isn't suitable for interrupt context. This was suggested by
Hans Verkuil on a previous patch [0].

Patch #2 then moves the dma-buf unmap out of DQBUF to vb2_done_work() now that
this is executed in process context since the dmabuf unmap operation can sleep.

[0]: https://lkml.org/lkml/2016/8/13/36

Best regards,
Javier


Javier Martinez Canillas (2):
  [media] vb2: defer sync buffers from vb2_buffer_done() with a
    workqueue
  [media] vb2: move dma-buf unmap from __vb2_dqbuf() to vb2_done_work()

 drivers/media/v4l2-core/videobuf2-core.c | 114 ++++++++++++++++++++-----------
 include/media/videobuf2-core.h           |   5 ++
 2 files changed, 79 insertions(+), 40 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue
  2016-08-17 18:28 [RFC PATCH 0/2] [media] vb2: defer part of vb2_buffer_done() and move dma-buf unmap from DQBUF Javier Martinez Canillas
@ 2016-08-17 18:28 ` Javier Martinez Canillas
  2016-08-17 19:50   ` Sakari Ailus
  2016-08-17 18:28 ` [RFC PATCH 2/2] [media] vb2: move dma-buf unmap from __vb2_dqbuf() to vb2_done_work() Javier Martinez Canillas
  1 sibling, 1 reply; 5+ messages in thread
From: Javier Martinez Canillas @ 2016-08-17 18:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hans Verkuil, Sakari Ailus, Javier Martinez Canillas,
	Mauro Carvalho Chehab, Marek Szyprowski, Kyungmin Park,
	Pawel Osciak, linux-media

The vb2_buffer_done() function can be called from interrupt context but it
currently calls the vb2 memory allocator .finish operation to sync buffers
and this can take a long time, so it's not suitable to be done there.

This patch defers part of the vb2_buffer_done() logic to a worker thread
to avoid doing the time consuming operation in interrupt context.

This will also allow to unmap the dmabuf as soon as possible once the buf
has been processed by the driver instead of waiting until user-space call
VIDIOC_DQBUF. Since the dmabuf unmap can sleep, it couldn't be done from
the vb2_buffer_done() function as it was before.

Suggested-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
---

 drivers/media/v4l2-core/videobuf2-core.c | 81 +++++++++++++++++++++-----------
 include/media/videobuf2-core.h           |  5 ++
 2 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 1dbd7beb71f0..14bed8acf3cf 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -315,6 +315,45 @@ static void __setup_offsets(struct vb2_buffer *vb)
 	}
 }
 
+static void vb2_done_work(struct work_struct *work)
+{
+	struct vb2_buffer *vb = container_of(work, struct vb2_buffer,
+					     done_work);
+	struct vb2_queue *q = vb->vb2_queue;
+	unsigned long flags;
+	unsigned int plane;
+
+	/* sync buffers */
+	for (plane = 0; plane < vb->num_planes; ++plane)
+		call_void_memop(vb, finish, vb->planes[plane].mem_priv);
+
+	spin_lock_irqsave(&q->done_lock, flags);
+	if (vb->state == VB2_BUF_STATE_QUEUED ||
+	    vb->state == VB2_BUF_STATE_REQUEUEING) {
+		vb->state = VB2_BUF_STATE_QUEUED;
+	} else {
+		/* Add the buffer to the done buffers list */
+		list_add_tail(&vb->done_entry, &q->done_list);
+	}
+	atomic_dec(&q->owned_by_drv_count);
+	spin_unlock_irqrestore(&q->done_lock, flags);
+
+	trace_vb2_buf_done(q, vb);
+
+	switch (vb->state) {
+	case VB2_BUF_STATE_QUEUED:
+		break;
+	case VB2_BUF_STATE_REQUEUEING:
+		if (q->start_streaming_called)
+			__enqueue_in_driver(vb);
+		break;
+	default:
+		/* Inform any processes that may be waiting for buffers */
+		wake_up(&q->done_wq);
+		break;
+	}
+}
+
 /**
  * __vb2_queue_alloc() - allocate videobuf buffer structures and (for MMAP type)
  * video buffer memory for all buffers/planes on the queue and initializes the
@@ -330,6 +369,10 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
 	struct vb2_buffer *vb;
 	int ret;
 
+	q->vb2_workqueue = alloc_ordered_workqueue("vb2_wq", 0);
+	if (!q->vb2_workqueue)
+		return buffer;
+
 	for (buffer = 0; buffer < num_buffers; ++buffer) {
 		/* Allocate videobuf buffer structures */
 		vb = kzalloc(q->buf_struct_size, GFP_KERNEL);
@@ -344,6 +387,8 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
 		vb->index = q->num_buffers + buffer;
 		vb->type = q->type;
 		vb->memory = memory;
+		INIT_WORK(&vb->done_work, vb2_done_work);
+
 		for (plane = 0; plane < num_planes; ++plane) {
 			vb->planes[plane].length = plane_sizes[plane];
 			vb->planes[plane].min_length = plane_sizes[plane];
@@ -982,7 +1027,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
 {
 	struct vb2_queue *q = vb->vb2_queue;
 	unsigned long flags;
-	unsigned int plane;
 
 	if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
 		return;
@@ -1003,36 +1047,11 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
 	dprintk(4, "done processing on buffer %d, state: %d\n",
 			vb->index, state);
 
-	/* sync buffers */
-	for (plane = 0; plane < vb->num_planes; ++plane)
-		call_void_memop(vb, finish, vb->planes[plane].mem_priv);
-
 	spin_lock_irqsave(&q->done_lock, flags);
-	if (state == VB2_BUF_STATE_QUEUED ||
-	    state == VB2_BUF_STATE_REQUEUEING) {
-		vb->state = VB2_BUF_STATE_QUEUED;
-	} else {
-		/* Add the buffer to the done buffers list */
-		list_add_tail(&vb->done_entry, &q->done_list);
-		vb->state = state;
-	}
-	atomic_dec(&q->owned_by_drv_count);
+	vb->state = state;
 	spin_unlock_irqrestore(&q->done_lock, flags);
 
-	trace_vb2_buf_done(q, vb);
-
-	switch (state) {
-	case VB2_BUF_STATE_QUEUED:
-		return;
-	case VB2_BUF_STATE_REQUEUEING:
-		if (q->start_streaming_called)
-			__enqueue_in_driver(vb);
-		return;
-	default:
-		/* Inform any processes that may be waiting for buffers */
-		wake_up(&q->done_wq);
-		break;
-	}
+	queue_work(q->vb2_workqueue, &vb->done_work);
 }
 EXPORT_SYMBOL_GPL(vb2_buffer_done);
 
@@ -1812,6 +1831,12 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
 	if (q->start_streaming_called)
 		call_void_qop(q, stop_streaming, q);
 
+	if (q->vb2_workqueue) {
+		flush_workqueue(q->vb2_workqueue);
+		destroy_workqueue(q->vb2_workqueue);
+		q->vb2_workqueue = NULL;
+	}
+
 	/*
 	 * If you see this warning, then the driver isn't cleaning up properly
 	 * in stop_streaming(). See the stop_streaming() documentation in
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index a4a9a55a0c42..dd765ef06cce 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -242,6 +242,9 @@ struct vb2_buffer {
 
 	struct list_head	queued_entry;
 	struct list_head	done_entry;
+
+	struct work_struct	done_work;
+
 #ifdef CONFIG_VIDEO_ADV_DEBUG
 	/*
 	 * Counters for how often these buffer-related ops are
@@ -523,6 +526,8 @@ struct vb2_queue {
 	struct vb2_fileio_data		*fileio;
 	struct vb2_threadio_data	*threadio;
 
+	struct workqueue_struct		*vb2_workqueue;
+
 #ifdef CONFIG_VIDEO_ADV_DEBUG
 	/*
 	 * Counters for how often these queue-related ops are
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH 2/2] [media] vb2: move dma-buf unmap from __vb2_dqbuf() to vb2_done_work()
  2016-08-17 18:28 [RFC PATCH 0/2] [media] vb2: defer part of vb2_buffer_done() and move dma-buf unmap from DQBUF Javier Martinez Canillas
  2016-08-17 18:28 ` [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue Javier Martinez Canillas
@ 2016-08-17 18:28 ` Javier Martinez Canillas
  1 sibling, 0 replies; 5+ messages in thread
From: Javier Martinez Canillas @ 2016-08-17 18:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hans Verkuil, Sakari Ailus, Javier Martinez Canillas,
	Mauro Carvalho Chehab, Marek Szyprowski, Kyungmin Park,
	Pawel Osciak, linux-media

Currently the dma-buf is unmapped when the buffer is dequeued by userspace
but it's not used anymore after the driver finished processing the buffer.

So instead of doing the dma-buf unmapping in __vb2_dqbuf(), it can be made
in vb2_done_work() after the driver notified that has finished processing.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>

---

 drivers/media/v4l2-core/videobuf2-core.c | 33 ++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 14bed8acf3cf..5f930dbedfa4 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -315,6 +315,21 @@ static void __setup_offsets(struct vb2_buffer *vb)
 	}
 }
 
+ /**
+ * __vb2_unmap_dmabuf() - unmap dma-buf attached to buffer planes
+ */
+static void __vb2_unmap_dmabuf(struct vb2_buffer *vb)
+{
+	int i;
+
+	for (i = 0; i < vb->num_planes; ++i) {
+		if (!vb->planes[i].dbuf_mapped)
+			continue;
+		call_void_memop(vb, unmap_dmabuf, vb->planes[i].mem_priv);
+		vb->planes[i].dbuf_mapped = 0;
+	}
+}
+
 static void vb2_done_work(struct work_struct *work)
 {
 	struct vb2_buffer *vb = container_of(work, struct vb2_buffer,
@@ -348,6 +363,9 @@ static void vb2_done_work(struct work_struct *work)
 			__enqueue_in_driver(vb);
 		break;
 	default:
+		if (q->memory == VB2_MEMORY_DMABUF)
+			__vb2_unmap_dmabuf(vb);
+
 		/* Inform any processes that may be waiting for buffers */
 		wake_up(&q->done_wq);
 		break;
@@ -1725,23 +1743,11 @@ EXPORT_SYMBOL_GPL(vb2_wait_for_all_buffers);
  */
 static void __vb2_dqbuf(struct vb2_buffer *vb)
 {
-	struct vb2_queue *q = vb->vb2_queue;
-	unsigned int i;
-
 	/* nothing to do if the buffer is already dequeued */
 	if (vb->state == VB2_BUF_STATE_DEQUEUED)
 		return;
 
 	vb->state = VB2_BUF_STATE_DEQUEUED;
-
-	/* unmap DMABUF buffer */
-	if (q->memory == VB2_MEMORY_DMABUF)
-		for (i = 0; i < vb->num_planes; ++i) {
-			if (!vb->planes[i].dbuf_mapped)
-				continue;
-			call_void_memop(vb, unmap_dmabuf, vb->planes[i].mem_priv);
-			vb->planes[i].dbuf_mapped = 0;
-		}
 }
 
 /**
@@ -1885,6 +1891,9 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
 			call_void_vb_qop(vb, buf_finish, vb);
 		}
 		__vb2_dqbuf(vb);
+
+		if (q->memory == VB2_MEMORY_DMABUF)
+			__vb2_unmap_dmabuf(vb);
 	}
 }
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue
  2016-08-17 18:28 ` [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue Javier Martinez Canillas
@ 2016-08-17 19:50   ` Sakari Ailus
  2016-08-17 21:21     ` Javier Martinez Canillas
  0 siblings, 1 reply; 5+ messages in thread
From: Sakari Ailus @ 2016-08-17 19:50 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: linux-kernel, Hans Verkuil, Sakari Ailus, Mauro Carvalho Chehab,
	Marek Szyprowski, Kyungmin Park, Pawel Osciak, linux-media

Hi Javier,

On Wed, Aug 17, 2016 at 02:28:56PM -0400, Javier Martinez Canillas wrote:
> The vb2_buffer_done() function can be called from interrupt context but it
> currently calls the vb2 memory allocator .finish operation to sync buffers
> and this can take a long time, so it's not suitable to be done there.
> 
> This patch defers part of the vb2_buffer_done() logic to a worker thread
> to avoid doing the time consuming operation in interrupt context.

I agree the interrupt handler is not the best place to perform the work in
vb2_buffer_done() (including cache flushing), but is a work queue an ideal
solution?

The work queue task is a regular kernel thread not subject to
sched_setscheduler(2) and alike, which user space programs can and do use to
change how the scheduler treats these processes. Requiring a work queue to
be run between the interrupt arriving from the hardware and the user space
process being able to dequeue the related buffer would hurt use cases where
strict time limits are crucial.

Neither I propose making the work queue to have real time priority either,
albeit I think might still be marginally better.

Additionally, the work queue brings another context switch per dequeued
buffer. This would also be undesirable on IoT and mobile systems that often
handle multiple buffer queues simultaneously.

Performing this task in the context of the process that actually dequeues
the buffer avoids both of these problem entirely as there are no other
processes involved.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue
  2016-08-17 19:50   ` Sakari Ailus
@ 2016-08-17 21:21     ` Javier Martinez Canillas
  0 siblings, 0 replies; 5+ messages in thread
From: Javier Martinez Canillas @ 2016-08-17 21:21 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-kernel, Hans Verkuil, Sakari Ailus, Mauro Carvalho Chehab,
	Marek Szyprowski, Kyungmin Park, Pawel Osciak, linux-media

Hello Sakari,

Thanks a lot for your feedback.

On 08/17/2016 03:50 PM, Sakari Ailus wrote:
> Hi Javier,
> 
> On Wed, Aug 17, 2016 at 02:28:56PM -0400, Javier Martinez Canillas wrote:
>> The vb2_buffer_done() function can be called from interrupt context but it
>> currently calls the vb2 memory allocator .finish operation to sync buffers
>> and this can take a long time, so it's not suitable to be done there.
>>
>> This patch defers part of the vb2_buffer_done() logic to a worker thread
>> to avoid doing the time consuming operation in interrupt context.
> 
> I agree the interrupt handler is not the best place to perform the work in
> vb2_buffer_done() (including cache flushing), but is a work queue an ideal
> solution?
>

I would also like to know Hans opinions since he suggested deferring the buffer
sync to be done in a worker thread (if I understood his suggestions correctly).
 
> The work queue task is a regular kernel thread not subject to
> sched_setscheduler(2) and alike, which user space programs can and do use to
> change how the scheduler treats these processes. Requiring a work queue to
> be run between the interrupt arriving from the hardware and the user space
> process being able to dequeue the related buffer would hurt use cases where
> strict time limits are crucial.
> 
> Neither I propose making the work queue to have real time priority either,
> albeit I think might still be marginally better.
>
> Additionally, the work queue brings another context switch per dequeued
> buffer. This would also be undesirable on IoT and mobile systems that often
> handle multiple buffer queues simultaneously.
>

Yes, I agree with you that this change might increase the latency.
 
> Performing this task in the context of the process that actually dequeues
> the buffer avoids both of these problem entirely as there are no other
> processes involved.
> 

You already mentioned in the other thread that you prefer to move the buffer
sync to DQBUF. But as I explained there, the reason why I want to move the
dma-buf unmapping out of DQBUF is to allow other drivers that share the same
DMA buffer to access the buffer even when a DQBUF has not been called yet.

This may be possible if vb2 supports implicit dma-buf fences and in this case
user-space doesn't even need to call DQBUF/QBUF, since the fences can be used
to serialize the access to the shared DMA buffer. Instead of using DQBUF/QBUF
as a serialization mechanism like is the case for the current non-fences case.

It would be possible to move the cache flushing to DQBUF and leave the dma-buf
unmap there, and do these operations when the driver calls vb2_buffer_done()
only when implicit fences are used. But it would simplify the vb2-core if this
is consistent between the fences and non-fences cases.

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-17 21:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-17 18:28 [RFC PATCH 0/2] [media] vb2: defer part of vb2_buffer_done() and move dma-buf unmap from DQBUF Javier Martinez Canillas
2016-08-17 18:28 ` [RFC PATCH 1/2] [media] vb2: defer sync buffers from vb2_buffer_done() with a workqueue Javier Martinez Canillas
2016-08-17 19:50   ` Sakari Ailus
2016-08-17 21:21     ` Javier Martinez Canillas
2016-08-17 18:28 ` [RFC PATCH 2/2] [media] vb2: move dma-buf unmap from __vb2_dqbuf() to vb2_done_work() Javier Martinez Canillas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.