linux-media.vger.kernel.org archive mirror
* [RFC v5 00/11] V4L2 Explicit Synchronization
@ 2017-11-15 17:10 Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi Gustavo Padovan
                   ` (11 more replies)
  0 siblings, 12 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

Hi,

After the comments received on the last patchset [1] and
during the media summit [2], here is the new and improved version
of the patchset. The implementation is simpler, smaller and covers
a lot more cases.

Compared to the last patchset, I got rid of a few things; the main
one is the OUT_FENCE event. One thing that we decided in Prague was
that, when using fences, we would keep the ordering of all buffers queued
to vb2. That means they are queued to the driver in the same order
that the QBUF calls happen, just like it already happens when not using
fences. Fences can signal in whatever order, so we need this guarantee
here. Drivers, however, are not required to keep the ordering when
processing the buffers.

But there is one conclusion from that which we didn't reach at the
summit, maybe because of the order in which we discussed things: we do
not need the OUT_FENCE event anymore, because now at QBUF call time
we *always* know the order in which the buffers will be queued to the
v4l2 driver. So the out-fence fd is now returned in the fence_fd
field as a return argument, and the event is not necessary anymore.

The fence_fd field is now used to communicate both in-fences and
out-fences, just like we do for GPU drivers. We pass in-fences as input
arguments and get out-fences as return arguments on the QBUF call.
The approach is documented.
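
As a rough illustration of the userspace side, here is a minimal sketch of
how the fence fields could be filled before QBUF and read back afterwards.
The flag values are the ones proposed in patch 06; struct qbuf_fence_args
and both helpers are hypothetical stand-ins for the flags and fence_fd
fields of struct v4l2_buffer, so no ioctl is issued here.

```c
#include <stdint.h>

/* Flag values as proposed in patch 06 of this series. */
#define V4L2_BUF_FLAG_IN_FENCE  0x00200000u
#define V4L2_BUF_FLAG_OUT_FENCE 0x00400000u

/* Hypothetical stand-in for the two struct v4l2_buffer fields involved. */
struct qbuf_fence_args {
	uint32_t flags;
	int32_t  fence_fd;
};

/*
 * Fill the fence-related fields before calling VIDIOC_QBUF.
 * in_fence_fd < 0 means "no in-fence".
 */
static void qbuf_fence_prepare(struct qbuf_fence_args *b, int in_fence_fd,
			       int want_out_fence)
{
	b->flags &= ~(V4L2_BUF_FLAG_IN_FENCE | V4L2_BUF_FLAG_OUT_FENCE);
	b->fence_fd = -1;

	if (in_fence_fd >= 0) {
		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
		b->fence_fd = in_fence_fd;
	}
	if (want_out_fence)
		b->flags |= V4L2_BUF_FLAG_OUT_FENCE;
}

/*
 * After QBUF returns, fence_fd is a return argument carrying the
 * out-fence fd. Returns -1 if no out-fence was requested.
 */
static int qbuf_fence_result(const struct qbuf_fence_args *b)
{
	if (!(b->flags & V4L2_BUF_FLAG_OUT_FENCE))
		return -1;
	return b->fence_fd;
}
```

In a real application the same field carries the in-fence into the QBUF
call and the out-fence out of it, so the in-fence fd has to be saved
elsewhere if it is still needed after the call.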

I also added a capability flag, V4L2_CAP_ORDERED, to tell userspace if
the v4l2 drivers keep the buffers ordered or not. 

We still have the 'ordered_in_driver' property for queues, but its
meaning has changed. When set videobuf2 will know that the driver can
keep the order of the buffers, thus videobuf2 can use the same fence
context for all out-fences. Fences inside the same context should signal
in order, so 'ordered_in_driver' is an optimization for that case.
When not set, a context for each out-fence is created.
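
The context selection described above can be sketched roughly as follows.
This is a standalone model, not the code from the patches: context_alloc()
is a hypothetical stand-in for the kernel's dma_fence_context_alloc(), and
struct sketch_queue only carries the two fields needed for the illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's fence context allocator. */
static uint64_t next_context = 1;

static uint64_t context_alloc(void)
{
	return next_context++;
}

/* Hypothetical, reduced view of a vb2 queue. */
struct sketch_queue {
	int      ordered_in_driver;  /* set by the driver */
	uint64_t out_fence_context;  /* shared context, 0 until allocated */
};

/* Pick the fence context for the next out-fence of this queue. */
static uint64_t out_fence_context_for(struct sketch_queue *q)
{
	/* Unordered driver: every out-fence gets its own context. */
	if (!q->ordered_in_driver)
		return context_alloc();

	/* Ordered driver: allocate once, then reuse for all out-fences. */
	if (!q->out_fence_context)
		q->out_fence_context = context_alloc();
	return q->out_fence_context;
}
```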

So now explicit synchronization also works for drivers that do not keep
the ordering of buffers.

Another thing is that we do not allow videobuf2 to requeue buffers
internally when using fences: each buffer has a fence associated with it
and we need to finish the job on it, i.e., signal the fence, even if an
error happened.

The rest of the changes are documented in each patch separately.

There is a test app at:

https://gitlab.collabora.com/padovan/v4l2-fences-test

Among my next steps is to create a v4l2->drm test app using fences as a
PoC, and also look into how to support it in ChromeOS.

Open Questions
--------------

* Do drivers reorder buffers internally? How to handle that with fences?

* How to handle audio/video synchronization? Fences aren't enough, we need
  to know things like the start of capture timestamp.

Regards,

Gustavo
--

[1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1518928.html
[2] http://muistio.tieke.fi/p/media-summit-2017

Gustavo Padovan (10):
  [media] v4l: add V4L2_CAP_ORDERED to the uapi
  [media] vivid: add the V4L2_CAP_ORDERED capability
  [media] vb2: add 'ordered_in_driver' property to queues
  [media] vivid: mark vivid queues as ordered_in_driver
  [media] vb2: check earlier if stream can be started
  [media] vb2: add explicit fence user API
  [media] vb2: add in-fence support to QBUF
  [media] vb2: add infrastructure to support out-fences
  [media] vb2: add out-fence support to QBUF
  [media] v4l: Document explicit synchronization behavior

Javier Martinez Canillas (1):
  [media] vb2: add videobuf2 dma-buf fence helpers

 Documentation/media/uapi/v4l/buffer.rst          |  15 ++
 Documentation/media/uapi/v4l/vidioc-qbuf.rst     |  42 +++-
 Documentation/media/uapi/v4l/vidioc-querybuf.rst |   9 +-
 Documentation/media/uapi/v4l/vidioc-querycap.rst |   3 +
 drivers/media/platform/vivid/vivid-core.c        |  24 +-
 drivers/media/usb/cpia2/cpia2_v4l.c              |   2 +-
 drivers/media/v4l2-core/Kconfig                  |   1 +
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c    |   4 +-
 drivers/media/v4l2-core/videobuf2-core.c         | 274 +++++++++++++++++++++--
 drivers/media/v4l2-core/videobuf2-v4l2.c         |  48 +++-
 include/media/videobuf2-core.h                   |  44 +++-
 include/media/videobuf2-fence.h                  |  48 ++++
 include/uapi/linux/videodev2.h                   |   8 +-
 13 files changed, 485 insertions(+), 37 deletions(-)
 create mode 100644 include/media/videobuf2-fence.h

-- 
2.13.6


* [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17 11:57   ` Mauro Carvalho Chehab
  2017-11-15 17:10 ` [RFC v5 02/11] [media] vivid: add the V4L2_CAP_ORDERED capability Gustavo Padovan
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

When using explicit synchronization userspace needs to know if
the queue can deliver everything back in the same order, so we added
a new capability that drivers can use to report that they are capable
of keeping ordering.

In videobuf2 core when using fences we also make sure to keep the ordering
of buffers, so if the driver guarantees it too the whole pipeline inside
V4L2 will be ordered and the V4L2_CAP_ORDERED should be used.
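
For illustration, a userspace check for the new capability could look
roughly like the sketch below. It works on the capabilities and device_caps
values returned by VIDIOC_QUERYCAP; queue_is_ordered() is a hypothetical
helper, and the V4L2_CAP_ORDERED value is the one proposed in this patch.

```c
#include <stdint.h>

#define V4L2_CAP_ORDERED     0x20000000u  /* value proposed in this patch */
#define V4L2_CAP_DEVICE_CAPS 0x80000000u  /* driver fills device_caps */

/*
 * Returns 1 if the opened device node reports an ordered queue.
 * When V4L2_CAP_DEVICE_CAPS is set, device_caps describes this
 * particular node, so prefer it over the driver-wide capabilities.
 */
static int queue_is_ordered(uint32_t capabilities, uint32_t device_caps)
{
	uint32_t caps = (capabilities & V4L2_CAP_DEVICE_CAPS) ?
			device_caps : capabilities;

	return (caps & V4L2_CAP_ORDERED) != 0;
}
```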

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 Documentation/media/uapi/v4l/vidioc-querycap.rst | 3 +++
 include/uapi/linux/videodev2.h                   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/Documentation/media/uapi/v4l/vidioc-querycap.rst b/Documentation/media/uapi/v4l/vidioc-querycap.rst
index 66fb1b3d6e6e..ed3daa814da9 100644
--- a/Documentation/media/uapi/v4l/vidioc-querycap.rst
+++ b/Documentation/media/uapi/v4l/vidioc-querycap.rst
@@ -254,6 +254,9 @@ specification the ioctl returns an ``EINVAL`` error code.
     * - ``V4L2_CAP_TOUCH``
       - 0x10000000
       - This is a touch device.
+    * - ``V4L2_CAP_ORDERED``
+      - 0x20000000
+      - The device queue is ordered.
     * - ``V4L2_CAP_DEVICE_CAPS``
       - 0x80000000
       - The driver fills the ``device_caps`` field. This capability can
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 185d6a0acc06..cd6fc1387f47 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -459,6 +459,7 @@ struct v4l2_capability {
 #define V4L2_CAP_STREAMING              0x04000000  /* streaming I/O ioctls */
 
 #define V4L2_CAP_TOUCH                  0x10000000  /* Is a touch device */
+#define V4L2_CAP_ORDERED                0x20000000  /* Is the device queue ordered */
 
 #define V4L2_CAP_DEVICE_CAPS            0x80000000  /* sets device capabilities field */
 
-- 
2.13.6


* [RFC v5 02/11] [media] vivid: add the V4L2_CAP_ORDERED capability
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues Gustavo Padovan
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

vivid guarantees the ordering of buffers when processing them, so add the
V4L2_CAP_ORDERED capability to inform userspace of that.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/platform/vivid/vivid-core.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/media/platform/vivid/vivid-core.c b/drivers/media/platform/vivid/vivid-core.c
index 5f316a5e38db..f19391fa2d6a 100644
--- a/drivers/media/platform/vivid/vivid-core.c
+++ b/drivers/media/platform/vivid/vivid-core.c
@@ -801,7 +801,8 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		dev->vid_cap_caps = dev->multiplanar ?
 			V4L2_CAP_VIDEO_CAPTURE_MPLANE :
 			V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OVERLAY;
-		dev->vid_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE;
+		dev->vid_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE |
+				     V4L2_CAP_ORDERED;
 		if (dev->has_audio_inputs)
 			dev->vid_cap_caps |= V4L2_CAP_AUDIO;
 		if (in_type_counter[TV])
@@ -814,7 +815,8 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 			V4L2_CAP_VIDEO_OUTPUT;
 		if (dev->has_fb)
 			dev->vid_out_caps |= V4L2_CAP_VIDEO_OUTPUT_OVERLAY;
-		dev->vid_out_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE;
+		dev->vid_out_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE |
+				     V4L2_CAP_ORDERED;
 		if (dev->has_audio_outputs)
 			dev->vid_out_caps |= V4L2_CAP_AUDIO;
 	}
@@ -822,7 +824,8 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		/* set up the capabilities of the vbi capture device */
 		dev->vbi_cap_caps = (dev->has_raw_vbi_cap ? V4L2_CAP_VBI_CAPTURE : 0) |
 				    (dev->has_sliced_vbi_cap ? V4L2_CAP_SLICED_VBI_CAPTURE : 0);
-		dev->vbi_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE;
+		dev->vbi_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE |
+				     V4L2_CAP_ORDERED;
 		if (dev->has_audio_inputs)
 			dev->vbi_cap_caps |= V4L2_CAP_AUDIO;
 		if (in_type_counter[TV])
@@ -832,24 +835,26 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		/* set up the capabilities of the vbi output device */
 		dev->vbi_out_caps = (dev->has_raw_vbi_out ? V4L2_CAP_VBI_OUTPUT : 0) |
 				    (dev->has_sliced_vbi_out ? V4L2_CAP_SLICED_VBI_OUTPUT : 0);
-		dev->vbi_out_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE;
+		dev->vbi_out_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE |
+				     V4L2_CAP_ORDERED;
 		if (dev->has_audio_outputs)
 			dev->vbi_out_caps |= V4L2_CAP_AUDIO;
 	}
 	if (dev->has_sdr_cap) {
 		/* set up the capabilities of the sdr capture device */
 		dev->sdr_cap_caps = V4L2_CAP_SDR_CAPTURE | V4L2_CAP_TUNER;
-		dev->sdr_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE;
+		dev->sdr_cap_caps |= V4L2_CAP_STREAMING | V4L2_CAP_READWRITE |
+				     V4L2_CAP_ORDERED;
 	}
 	/* set up the capabilities of the radio receiver device */
 	if (dev->has_radio_rx)
 		dev->radio_rx_caps = V4L2_CAP_RADIO | V4L2_CAP_RDS_CAPTURE |
 				     V4L2_CAP_HW_FREQ_SEEK | V4L2_CAP_TUNER |
-				     V4L2_CAP_READWRITE;
+				     V4L2_CAP_READWRITE | V4L2_CAP_ORDERED;
 	/* set up the capabilities of the radio transmitter device */
 	if (dev->has_radio_tx)
 		dev->radio_tx_caps = V4L2_CAP_RDS_OUTPUT | V4L2_CAP_MODULATOR |
-				     V4L2_CAP_READWRITE;
+				     V4L2_CAP_READWRITE | V4L2_CAP_ORDERED;
 
 	ret = -ENOMEM;
 	/* initialize the test pattern generator */
-- 
2.13.6


* [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 02/11] [media] vivid: add the V4L2_CAP_ORDERED capability Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17  5:56   ` Alexandre Courbot
  2017-11-17 12:15   ` Mauro Carvalho Chehab
  2017-11-15 17:10 ` [RFC v5 04/11] [media] vivid: mark vivid queues as ordered_in_driver Gustavo Padovan
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

We use the ordered_in_driver property to optimize for the case where
the driver can deliver the buffers in an ordered fashion. When it
is ordered we can use the same fence context for all fences, but
when it is not we need to create a new context for each out-fence.

So the ordered_in_driver flag will help us with identifying the queues
that can be optimized and use the same fence context.

v4: make the property a vector for optimization and not a mandatory thing
that drivers need to set if they want to use explicit synchronization.

v3: improve doc (Hans Verkuil)

v2: rename property to 'ordered_in_driver' to avoid confusion

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 include/media/videobuf2-core.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index ef9b64398c8c..38b9c8dd42c6 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -440,6 +440,12 @@ struct vb2_buf_ops {
  * @fileio_read_once:		report EOF after reading the first buffer
  * @fileio_write_immediately:	queue buffer after each write() call
  * @allow_zero_bytesused:	allow bytesused == 0 to be passed to the driver
+ * @ordered_in_driver: if the driver can guarantee that the queue will be
+ *		ordered or not, i.e., the buffers are dequeued from the driver
+ *		in the same order they are queued to the driver. The default
+ *		is not ordered unless the driver sets this flag. Setting it
+ *		when ordering can be guaranteed helps to optimize explicit
+ *		fences.
  * @quirk_poll_must_check_waiting_for_buffers: Return POLLERR at poll when QBUF
  *              has not been called. This is a vb1 idiom that has been adopted
  *              also by vb2.
@@ -510,6 +516,7 @@ struct vb2_queue {
 	unsigned			fileio_read_once:1;
 	unsigned			fileio_write_immediately:1;
 	unsigned			allow_zero_bytesused:1;
+	unsigned			ordered_in_driver:1;
 	unsigned		   quirk_poll_must_check_waiting_for_buffers:1;
 
 	struct mutex			*lock;
-- 
2.13.6


* [RFC v5 04/11] [media] vivid: mark vivid queues as ordered_in_driver
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (2 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 05/11] [media] vb2: check earlier if stream can be started Gustavo Padovan
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

ordered_in_driver is used to optimize the use of explicit synchronization:
when the driver guarantees ordering we can use the same fence context for
all out-fences. In this case userspace will know that the fences won't
signal out of order. vivid queues are already ordered by default, so no
changes are needed beyond setting the flag.

v2: rename 'ordered' to 'ordered_in_driver' to avoid confusion.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/platform/vivid/vivid-core.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/media/platform/vivid/vivid-core.c b/drivers/media/platform/vivid/vivid-core.c
index f19391fa2d6a..1b830ebe1cd8 100644
--- a/drivers/media/platform/vivid/vivid-core.c
+++ b/drivers/media/platform/vivid/vivid-core.c
@@ -1068,6 +1068,7 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		q->type = dev->multiplanar ? V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE :
 			V4L2_BUF_TYPE_VIDEO_CAPTURE;
 		q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_READ;
+		q->ordered_in_driver = 1;
 		q->drv_priv = dev;
 		q->buf_struct_size = sizeof(struct vivid_buffer);
 		q->ops = &vivid_vid_cap_qops;
@@ -1088,6 +1089,7 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		q->type = dev->multiplanar ? V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE :
 			V4L2_BUF_TYPE_VIDEO_OUTPUT;
 		q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_WRITE;
+		q->ordered_in_driver = 1;
 		q->drv_priv = dev;
 		q->buf_struct_size = sizeof(struct vivid_buffer);
 		q->ops = &vivid_vid_out_qops;
@@ -1108,6 +1110,7 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		q->type = dev->has_raw_vbi_cap ? V4L2_BUF_TYPE_VBI_CAPTURE :
 					      V4L2_BUF_TYPE_SLICED_VBI_CAPTURE;
 		q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_READ;
+		q->ordered_in_driver = 1;
 		q->drv_priv = dev;
 		q->buf_struct_size = sizeof(struct vivid_buffer);
 		q->ops = &vivid_vbi_cap_qops;
@@ -1128,6 +1131,7 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		q->type = dev->has_raw_vbi_out ? V4L2_BUF_TYPE_VBI_OUTPUT :
 					      V4L2_BUF_TYPE_SLICED_VBI_OUTPUT;
 		q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_WRITE;
+		q->ordered_in_driver = 1;
 		q->drv_priv = dev;
 		q->buf_struct_size = sizeof(struct vivid_buffer);
 		q->ops = &vivid_vbi_out_qops;
@@ -1147,6 +1151,7 @@ static int vivid_create_instance(struct platform_device *pdev, int inst)
 		q = &dev->vb_sdr_cap_q;
 		q->type = V4L2_BUF_TYPE_SDR_CAPTURE;
 		q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_READ;
+		q->ordered_in_driver = 1;
 		q->drv_priv = dev;
 		q->buf_struct_size = sizeof(struct vivid_buffer);
 		q->ops = &vivid_sdr_cap_qops;
-- 
2.13.6


* [RFC v5 05/11] [media] vb2: check earlier if stream can be started
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (3 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 04/11] [media] vivid: mark vivid queues as ordered_in_driver Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-15 17:10 ` [RFC v5 06/11] [media] vb2: add explicit fence user API Gustavo Padovan
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

To support explicit synchronization we need to run all operations that can
fail before we queue the buffer to the driver. With fences the queueing
will be delayed if the fence is not signaled yet, and it will be better if
such a callback does not fail.

For that we move the vb2_start_streaming() call ahead of the point where
the buffer may be queued to the driver.
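
The resulting decision in vb2_core_qbuf() can be modelled roughly as below.
This is only a sketch of the control flow in the diff that follows;
enum qbuf_action and qbuf_next_action() are hypothetical names, and the
parameters mirror the vb2_queue fields tested in the patch.

```c
/* Hypothetical labels for the three outcomes of the check. */
enum qbuf_action {
	QBUF_START_STREAMING,	/* call vb2_start_streaming(), may fail */
	QBUF_ENQUEUE,		/* hand the buffer to the driver */
	QBUF_HOLD,		/* keep it queued in vb2 for now */
};

static enum qbuf_action qbuf_next_action(int streaming,
					 int start_streaming_called,
					 unsigned int queued_count,
					 unsigned int min_buffers_needed)
{
	/* The step that can fail is evaluated first. */
	if (streaming && !start_streaming_called &&
	    queued_count >= min_buffers_needed)
		return QBUF_START_STREAMING;

	/* Already streaming: give the buffer to the driver. */
	if (start_streaming_called)
		return QBUF_ENQUEUE;

	/* Otherwise the buffer waits for a later QBUF or streamon. */
	return QBUF_HOLD;
}
```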

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index cb115ba6a1d2..60f8b582396a 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -1399,29 +1399,27 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
 	trace_vb2_qbuf(q, vb);
 
 	/*
-	 * If already streaming, give the buffer to driver for processing.
-	 * If not, the buffer will be given to driver on next streamon.
-	 */
-	if (q->start_streaming_called)
-		__enqueue_in_driver(vb);
-
-	/* Fill buffer information for the userspace */
-	if (pb)
-		call_void_bufop(q, fill_user_buffer, vb, pb);
-
-	/*
 	 * If streamon has been called, and we haven't yet called
 	 * start_streaming() since not enough buffers were queued, and
 	 * we now have reached the minimum number of queued buffers,
 	 * then we can finally call start_streaming().
+	 *
+	 * If already streaming, give the buffer to driver for processing.
+	 * If not, the buffer will be given to driver on next streamon.
 	 */
 	if (q->streaming && !q->start_streaming_called &&
 	    q->queued_count >= q->min_buffers_needed) {
 		ret = vb2_start_streaming(q);
 		if (ret)
 			return ret;
+	} else if (q->start_streaming_called) {
+		__enqueue_in_driver(vb);
 	}
 
+	/* Fill buffer information for the userspace */
+	if (pb)
+		call_void_bufop(q, fill_user_buffer, vb, pb);
+
 	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
 	return 0;
 }
-- 
2.13.6


* [RFC v5 06/11] [media] vb2: add explicit fence user API
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (4 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 05/11] [media] vb2: check earlier if stream can be started Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17 12:25   ` Mauro Carvalho Chehab
  2017-11-17 13:29   ` Hans Verkuil
  2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

Turn the reserved2 field into fence_fd that we will use to send
an in-fence to the kernel and return an out-fence from the kernel to
userspace.

Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
when sending a fence to the kernel to be waited on, and
V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.

v4:
	- make it a union with reserved2 and fence_fd (Hans Verkuil)

v3:
	- make the out_fence refer to the current buffer (Hans Verkuil)

v2: add documentation

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 Documentation/media/uapi/v4l/buffer.rst       | 15 +++++++++++++++
 drivers/media/usb/cpia2/cpia2_v4l.c           |  2 +-
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  4 ++--
 drivers/media/v4l2-core/videobuf2-v4l2.c      |  2 +-
 include/uapi/linux/videodev2.h                |  7 ++++++-
 5 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
index ae6ee73f151c..eeefbd2547e7 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -648,6 +648,21 @@ Buffer Flags
       - Start Of Exposure. The buffer timestamp has been taken when the
 	exposure of the frame has begun. This is only valid for the
 	``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
+    * .. _`V4L2-BUF-FLAG-IN-FENCE`:
+
+      - ``V4L2_BUF_FLAG_IN_FENCE``
+      - 0x00200000
+      - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
+	won't be queued to the driver until the fence signals.
+
+    * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
+
+      - ``V4L2_BUF_FLAG_OUT_FENCE``
+      - 0x00400000
+      - Request a fence to be attached to the buffer. The ``fence_fd``
+	field on
+	:ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
+	fd to userspace.
 
 
 
diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
index 3dedd83f0b19..6cde686bf44c 100644
--- a/drivers/media/usb/cpia2/cpia2_v4l.c
+++ b/drivers/media/usb/cpia2/cpia2_v4l.c
@@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
 	buf->sequence = cam->buffers[buf->index].seq;
 	buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
 	buf->length = cam->frame_size;
-	buf->reserved2 = 0;
+	buf->fence_fd = -1;
 	buf->reserved = 0;
 	memset(&buf->timecode, 0, sizeof(buf->timecode));
 
diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
index 821f2aa299ae..3a31d318df2a 100644
--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
+++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
@@ -370,7 +370,7 @@ struct v4l2_buffer32 {
 		__s32		fd;
 	} m;
 	__u32			length;
-	__u32			reserved2;
+	__s32			fence_fd;
 	__u32			reserved;
 };
 
@@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
 		put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
 		copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
 		put_user(kp->sequence, &up->sequence) ||
-		put_user(kp->reserved2, &up->reserved2) ||
+		put_user(kp->fence_fd, &up->fence_fd) ||
 		put_user(kp->reserved, &up->reserved) ||
 		put_user(kp->length, &up->length))
 			return -EFAULT;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 0c0669976bdc..110fb45fef6f 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
 	b->timestamp = ns_to_timeval(vb->timestamp);
 	b->timecode = vbuf->timecode;
 	b->sequence = vbuf->sequence;
-	b->reserved2 = 0;
+	b->fence_fd = -1;
 	b->reserved = 0;
 
 	if (q->is_multiplanar) {
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index cd6fc1387f47..311c3a331eda 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -925,7 +925,10 @@ struct v4l2_buffer {
 		__s32		fd;
 	} m;
 	__u32			length;
-	__u32			reserved2;
+	union {
+		__s32		fence_fd;
+		__u32		reserved2;
+	};
 	__u32			reserved;
 };
 
@@ -962,6 +965,8 @@ struct v4l2_buffer {
 #define V4L2_BUF_FLAG_TSTAMP_SRC_SOE		0x00010000
 /* mem2mem encoder/decoder */
 #define V4L2_BUF_FLAG_LAST			0x00100000
+#define V4L2_BUF_FLAG_IN_FENCE			0x00200000
+#define V4L2_BUF_FLAG_OUT_FENCE			0x00400000
 
 /**
  * struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor
-- 
2.13.6


* [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (5 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 06/11] [media] vb2: add explicit fence user API Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17  6:49   ` Alexandre Courbot
                     ` (2 more replies)
  2017-11-15 17:10 ` [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers Gustavo Padovan
                   ` (4 subsequent siblings)
  11 siblings, 3 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

Receive in-fences from userspace and add support for waiting on them
before queueing the buffer to the driver. A buffer can't be queued to the
driver before its fence signals, and a buffer can't be queued to the driver
out of the order in which it was queued from userspace. That means that even
if its fence signals, it must wait for the fences of all other buffers ahead
of it in the queue to signal first.

To make that possible we use fence_array to keep that ordering. Basically
we create a fence_array that contains both the current fence and the fence
from the previous buffer (which might be a fence array as well). The base
fence class for the fence_array becomes the new buffer fence, waiting on
that one guarantees that it won't be queued out of order.
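
The ordering rule above can be modelled in a few lines. This is a
standalone sketch, not the fence_array code from the patch: the input is a
hypothetical array with one entry per buffer in QBUF order, nonzero once
that buffer's own in-fence has signaled (or if it has no in-fence).

```c
#include <stddef.h>

/*
 * Count how many buffers at the head of the queue may be handed to
 * the driver. A buffer is only eligible once its own fence and the
 * fences of all buffers ahead of it have signaled, so counting stops
 * at the first unsignaled entry.
 */
static size_t buffers_ready_for_driver(const int *own_fence_signaled,
				       size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!own_fence_signaled[i])
			break;
	}
	return i;
}
```

With the input {1, 0, 1} only the first buffer is eligible: the third
buffer's fence has signaled, but it still has to wait for the second one.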

v6:
	- With fences always keep the order userspace queues the buffers.
	- Protect in_fence manipulation with a lock (Brian Starkey)
	- check if fences have the same context before adding a fence array
	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
	- Clean up fence if __set_in_fence() fails (Brian Starkey)
	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)

v5:	- use fence_array to keep buffers ordered in vb2 core when
	needed (Brian Starkey)
	- keep backward compat on the reserved2 field (Brian Starkey)
	- protect fence callback removal with lock (Brian Starkey)

v4:
	- Add a comment about dma_fence_add_callback() not returning an
	error (Hans)
	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
	vb2_start_streaming() (Hans)
	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
	- Queue buffers to the driver as soon as they are ready (Hans)
	- call fill_user_buffer() after queuing the buffer (Hans)
	- add err: label to clean up fence
	- add dma_fence_wait() before calling vb2_start_streaming()

v3:	- document fence parameter
	- remove ternary if at vb2_qbuf() return (Mauro)
	- do not change if conditions behaviour (Mauro)

v2:
	- fix vb2_queue_or_prepare_buf() ret check
	- remove check for VB2_MEMORY_DMABUF only (Javier)
	- check num of ready buffers to start streaming
	- when queueing, start from the first ready buffer
	- handle queue cancel

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/v4l2-core/Kconfig          |   1 +
 drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
 drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
 include/media/videobuf2-core.h           |  17 ++-
 4 files changed, 231 insertions(+), 18 deletions(-)

diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
index a35c33686abf..3f988c407c80 100644
--- a/drivers/media/v4l2-core/Kconfig
+++ b/drivers/media/v4l2-core/Kconfig
@@ -83,6 +83,7 @@ config VIDEOBUF_DVB
 # Used by drivers that need Videobuf2 modules
 config VIDEOBUF2_CORE
 	select DMA_SHARED_BUFFER
+	select SYNC_FILE
 	tristate
 
 config VIDEOBUF2_MEMOPS
diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 60f8b582396a..26de4c80717d 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -23,6 +23,7 @@
 #include <linux/sched.h>
 #include <linux/freezer.h>
 #include <linux/kthread.h>
+#include <linux/dma-fence-array.h>
 
 #include <media/videobuf2-core.h>
 #include <media/v4l2-mc.h>
@@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
 		vb->index = q->num_buffers + buffer;
 		vb->type = q->type;
 		vb->memory = memory;
+		spin_lock_init(&vb->fence_cb_lock);
 		for (plane = 0; plane < num_planes; ++plane) {
 			vb->planes[plane].length = plane_sizes[plane];
 			vb->planes[plane].min_length = plane_sizes[plane];
@@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
 {
 	struct vb2_queue *q = vb->vb2_queue;
 
+	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
+		return;
+
 	vb->state = VB2_BUF_STATE_ACTIVE;
 	atomic_inc(&q->owned_by_drv_count);
 
@@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 	return 0;
 }
 
+static int __get_num_ready_buffers(struct vb2_queue *q)
+{
+	struct vb2_buffer *vb;
+	int ready_count = 0;
+	unsigned long flags;
+
+	/* count num of buffers ready in front of the queued_list */
+	list_for_each_entry(vb, &q->queued_list, queued_entry) {
+		spin_lock_irqsave(&vb->fence_cb_lock, flags);
+		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
+			ready_count++;
+		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+	}
+
+	return ready_count;
+}
+
 int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
 {
 	struct vb2_buffer *vb;
@@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
 	return ret;
 }
 
-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
+static struct dma_fence *__set_in_fence(struct vb2_queue *q,
+					struct vb2_buffer *vb,
+					struct dma_fence *fence)
+{
+	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
+		dma_fence_put(q->last_fence);
+		q->last_fence = NULL;
+	}
+
+	/*
+	 * We always guarantee the ordering of buffers queued from
+	 * userspace to be the same it is queued to the driver. For that
+	 * we create a fence array with the fence from the last queued
+	 * buffer and this one, that way the fence for this buffer can't
+	 * signal before the last one.
+	 */
+	if (fence && q->last_fence) {
+		struct dma_fence **fences;
+		struct dma_fence_array *arr;
+
+		if (fence->context == q->last_fence->context) {
+			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
+				dma_fence_put(q->last_fence);
+				q->last_fence = dma_fence_get(fence);
+			} else {
+				dma_fence_put(fence);
+				fence = dma_fence_get(q->last_fence);
+			}
+			return fence;
+		}
+
+		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
+		if (!fences)
+			return ERR_PTR(-ENOMEM);
+
+		fences[0] = fence;
+		fences[1] = q->last_fence;
+
+		arr = dma_fence_array_create(2, fences,
+					     dma_fence_context_alloc(1),
+					     1, false);
+		if (!arr) {
+			kfree(fences);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		fence = &arr->base;
+
+		q->last_fence = dma_fence_get(fence);
+	} else if (!fence && q->last_fence) {
+		fence = dma_fence_get(q->last_fence);
+	}
+
+	return fence;
+}
+
+static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
+	struct vb2_queue *q = vb->vb2_queue;
+	unsigned long flags;
+
+	spin_lock_irqsave(&vb->fence_cb_lock, flags);
+	if (!vb->in_fence) {
+		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+		return;
+	}
+
+	dma_fence_put(vb->in_fence);
+	vb->in_fence = NULL;
+
+	if (q->start_streaming_called)
+		__enqueue_in_driver(vb);
+	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+}
+
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+		  struct dma_fence *fence)
 {
 	struct vb2_buffer *vb;
+	unsigned long flags;
 	int ret;
 
 	vb = q->bufs[index];
@@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
 	case VB2_BUF_STATE_DEQUEUED:
 		ret = __buf_prepare(vb, pb);
 		if (ret)
-			return ret;
+			goto err;
 		break;
 	case VB2_BUF_STATE_PREPARED:
 		break;
 	case VB2_BUF_STATE_PREPARING:
 		dprintk(1, "buffer still being prepared\n");
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err;
 	default:
 		dprintk(1, "invalid buffer state %d\n", vb->state);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err;
 	}
 
 	/*
@@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
 
 	trace_vb2_qbuf(q, vb);
 
+	vb->in_fence = __set_in_fence(q, vb, fence);
+	if (IS_ERR(vb->in_fence)) {
+		dma_fence_put(fence);
+		ret = PTR_ERR(vb->in_fence);
+		goto err;
+	}
+	fence = NULL;
+
+	/*
+	 * If it is time to call vb2_start_streaming() wait for the fence
+	 * to signal first. Of course, this happens only once per streaming.
+	 * We want to run any step that might fail before we set the callback
+	 * to queue the fence when it signals.
+	 */
+	if (vb->in_fence && !q->start_streaming_called &&
+	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
+		dma_fence_wait(vb->in_fence, true);
+
 	/*
 	 * If streamon has been called, and we haven't yet called
 	 * start_streaming() since not enough buffers were queued, and
 	 * we now have reached the minimum number of queued buffers,
 	 * then we can finally call start_streaming().
-	 *
-	 * If already streaming, give the buffer to driver for processing.
-	 * If not, the buffer will be given to driver on next streamon.
 	 */
 	if (q->streaming && !q->start_streaming_called &&
-	    q->queued_count >= q->min_buffers_needed) {
+	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {
 		ret = vb2_start_streaming(q);
 		if (ret)
-			return ret;
-	} else if (q->start_streaming_called) {
-		__enqueue_in_driver(vb);
+			goto err;
 	}
 
+	/*
+	 * For explicit synchronization: If the fence didn't signal
+	 * yet we setup a callback to queue the buffer once the fence
+	 * signals, and then, return successfully. But if the fence
+	 * already signaled we lose the reference we held and queue the
+	 * buffer to the driver.
+	 */
+	spin_lock_irqsave(&vb->fence_cb_lock, flags);
+	if (vb->in_fence) {
+		ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
+					     vb2_qbuf_fence_cb);
+		if (ret == -EINVAL) {
+			spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+			goto err;
+		} else if (!ret) {
+			goto fill;
+		}
+
+		dma_fence_put(vb->in_fence);
+		vb->in_fence = NULL;
+	}
+
+fill:
+	/*
+	 * If already streaming and there is no fence to wait on
+	 * give the buffer to driver for processing.
+	 */
+	if (q->start_streaming_called && !vb->in_fence)
+		__enqueue_in_driver(vb);
+	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+
 	/* Fill buffer information for the userspace */
 	if (pb)
 		call_void_bufop(q, fill_user_buffer, vb, pb);
 
 	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
 	return 0;
+
+err:
+	if (vb->in_fence) {
+		dma_fence_put(vb->in_fence);
+		vb->in_fence = NULL;
+	}
+
+	return ret;
+
 }
 EXPORT_SYMBOL_GPL(vb2_core_qbuf);
 
@@ -1632,6 +1787,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
 static void __vb2_queue_cancel(struct vb2_queue *q)
 {
 	unsigned int i;
+	struct vb2_buffer *vb;
+	unsigned long flags;
 
 	/*
 	 * Tell driver to stop all transactions and release all queued
@@ -1659,6 +1816,21 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
 	q->queued_count = 0;
 	q->error = 0;
 
+	list_for_each_entry(vb, &q->queued_list, queued_entry) {
+		spin_lock_irqsave(&vb->fence_cb_lock, flags);
+		if (vb->in_fence) {
+			dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
+			dma_fence_put(vb->in_fence);
+			vb->in_fence = NULL;
+		}
+		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
+	}
+
+	if (q->last_fence) {
+		dma_fence_put(q->last_fence);
+		q->last_fence = NULL;
+	}
+
 	/*
 	 * Remove all buffers from videobuf's list...
 	 */
@@ -1720,7 +1892,7 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
 	 * Tell driver to start streaming provided sufficient buffers
 	 * are available.
 	 */
-	if (q->queued_count >= q->min_buffers_needed) {
+	if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
 		ret = v4l_vb2q_enable_media_source(q);
 		if (ret)
 			return ret;
@@ -2240,7 +2412,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 		 * Queue all buffers.
 		 */
 		for (i = 0; i < q->num_buffers; i++) {
-			ret = vb2_core_qbuf(q, i, NULL);
+			ret = vb2_core_qbuf(q, i, NULL, NULL);
 			if (ret)
 				goto err_reqbufs;
 			fileio->bufs[i].queued = 1;
@@ -2419,7 +2591,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
 
 		if (copy_timestamp)
 			b->timestamp = ktime_get_ns();
-		ret = vb2_core_qbuf(q, index, NULL);
+		ret = vb2_core_qbuf(q, index, NULL, NULL);
 		dprintk(5, "vb2_dbuf result: %d\n", ret);
 		if (ret)
 			return ret;
@@ -2522,7 +2694,7 @@ static int vb2_thread(void *data)
 		if (copy_timestamp)
 			vb->timestamp = ktime_get_ns();;
 		if (!threadio->stop)
-			ret = vb2_core_qbuf(q, vb->index, NULL);
+			ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
 		call_void_qop(q, wait_prepare, q);
 		if (ret || threadio->stop)
 			break;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 110fb45fef6f..4c09ea007d90 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -23,6 +23,7 @@
 #include <linux/sched.h>
 #include <linux/freezer.h>
 #include <linux/kthread.h>
+#include <linux/sync_file.h>
 
 #include <media/v4l2-dev.h>
 #include <media/v4l2-fh.h>
@@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
 		return -EINVAL;
 	}
 
+	if ((b->fence_fd != 0 && b->fence_fd != -1) &&
+	    !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
+		dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
+		return -EINVAL;
+	}
+
 	return __verify_planes_array(q->bufs[b->index], b);
 }
 
@@ -203,9 +210,14 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
 	b->timestamp = ns_to_timeval(vb->timestamp);
 	b->timecode = vbuf->timecode;
 	b->sequence = vbuf->sequence;
-	b->fence_fd = -1;
 	b->reserved = 0;
 
+	b->fence_fd = -1;
+	if (vb->in_fence)
+		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
+	else
+		b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
+
 	if (q->is_multiplanar) {
 		/*
 		 * Fill in plane-related data if userspace provided an array
@@ -560,6 +572,7 @@ EXPORT_SYMBOL_GPL(vb2_create_bufs);
 
 int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
 {
+	struct dma_fence *fence = NULL;
 	int ret;
 
 	if (vb2_fileio_is_active(q)) {
@@ -568,7 +581,19 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
 	}
 
 	ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
-	return ret ? ret : vb2_core_qbuf(q, b->index, b);
+	if (ret)
+		return ret;
+
+	if (b->flags & V4L2_BUF_FLAG_IN_FENCE) {
+		fence = sync_file_get_fence(b->fence_fd);
+		if (!fence) {
+			dprintk(1, "failed to get in-fence from fd %d\n",
+				b->fence_fd);
+			return -EINVAL;
+		}
+	}
+
+	return vb2_core_qbuf(q, b->index, b, fence);
 }
 EXPORT_SYMBOL_GPL(vb2_qbuf);
 
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 38b9c8dd42c6..5f48c7be7770 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -16,6 +16,7 @@
 #include <linux/mutex.h>
 #include <linux/poll.h>
 #include <linux/dma-buf.h>
+#include <linux/dma-fence.h>
 
 #define VB2_MAX_FRAME	(32)
 #define VB2_MAX_PLANES	(8)
@@ -254,11 +255,20 @@ struct vb2_buffer {
 	 *			all buffers queued from userspace
 	 * done_entry:		entry on the list that stores all buffers ready
 	 *			to be dequeued to userspace
+	 * in_fence:		fence received from vb2 client to wait on before
+	 *			using the buffer (queueing to the driver)
+	 * fence_cb:		fence callback information
+	 * fence_cb_lock:	protect callback signal/remove
 	 */
 	enum vb2_buffer_state	state;
 
 	struct list_head	queued_entry;
 	struct list_head	done_entry;
+
+	struct dma_fence	*in_fence;
+	struct dma_fence_cb	fence_cb;
+	spinlock_t              fence_cb_lock;
+
 #ifdef CONFIG_VIDEO_ADV_DEBUG
 	/*
 	 * Counters for how often these buffer-related ops are
@@ -504,6 +514,7 @@ struct vb2_buf_ops {
  * @last_buffer_dequeued: used in poll() and DQBUF to immediately return if the
  *		last decoded buffer was already dequeued. Set for capture queues
  *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
+ * @last_fence:	last in-fence received. Used to keep ordering.
  * @fileio:	file io emulator internal data, used only if emulator is active
  * @threadio:	thread io internal data, used only if thread is active
  */
@@ -558,6 +569,8 @@ struct vb2_queue {
 	unsigned int			copy_timestamp:1;
 	unsigned int			last_buffer_dequeued:1;
 
+	struct dma_fence		*last_fence;
+
 	struct vb2_fileio_data		*fileio;
 	struct vb2_threadio_data	*threadio;
 
@@ -733,6 +746,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
  * @index:	id number of the buffer
  * @pb:		buffer structure passed from userspace to vidioc_qbuf handler
  *		in driver
+ * @fence:	in-fence to wait on before queueing the buffer
  *
  * Should be called from vidioc_qbuf ioctl handler of a driver.
  * The passed buffer should have been verified.
@@ -747,7 +761,8 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
  * The return values from this function are intended to be directly returned
  * from vidioc_qbuf handler in driver.
  */
-int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
+int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
+		  struct dma_fence *fence);
 
 /**
  * vb2_core_dqbuf() - Dequeue a buffer to the userspace
-- 
2.13.6
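
A note on the same-context branch of __set_in_fence() in the patch above:
the test "fence->seqno - q->last_fence->seqno <= INT_MAX" reads as a
wrap-safe "is later or equal" comparison on unsigned sequence numbers.
Below is a minimal userspace sketch of that comparison, assuming 32-bit
unsigned seqnos. seqno_is_later_or_equal() is a hypothetical name used
only for illustration, not a kernel helper.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when seqno a is the same as, or
 * comes after, seqno b on one timeline, even if the 32-bit counter
 * wrapped around between b and a.
 */
static bool seqno_is_later_or_equal(uint32_t a, uint32_t b)
{
	/* Unsigned subtraction wraps modulo 2^32, so the distance is exact. */
	uint32_t distance = a - b;

	/* Distances in the lower half of the range count as "later". */
	return distance <= (uint32_t)INT_MAX;
}
```

With this rule a seqno of 2 is treated as later than 0xfffffffe, since
the counter is assumed to have wrapped rather than gone backwards.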

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (6 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17  7:02   ` Alexandre Courbot
  2017-11-15 17:10 ` [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences Gustavo Padovan
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Javier Martinez Canillas

From: Javier Martinez Canillas <javier@osg.samsung.com>

Add a videobuf2-fence.h header file that contains different helpers
for DMA buffer sharing explicit fence support in videobuf2.

v2:	- use fence context provided by the caller in vb2_fence_alloc()

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 include/media/videobuf2-fence.h | 48 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 include/media/videobuf2-fence.h

diff --git a/include/media/videobuf2-fence.h b/include/media/videobuf2-fence.h
new file mode 100644
index 000000000000..b49cc1bf6bb4
--- /dev/null
+++ b/include/media/videobuf2-fence.h
@@ -0,0 +1,48 @@
+/*
+ * videobuf2-fence.h - DMA buffer sharing fence helpers for videobuf 2
+ *
+ * Copyright (C) 2016 Samsung Electronics
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/dma-fence.h>
+#include <linux/slab.h>
+
+static DEFINE_SPINLOCK(vb2_fence_lock);
+
+static inline const char *vb2_fence_get_driver_name(struct dma_fence *fence)
+{
+	return "vb2_fence";
+}
+
+static inline const char *vb2_fence_get_timeline_name(struct dma_fence *fence)
+{
+	return "vb2_fence_timeline";
+}
+
+static inline bool vb2_fence_enable_signaling(struct dma_fence *fence)
+{
+	return true;
+}
+
+static const struct dma_fence_ops vb2_fence_ops = {
+	.get_driver_name = vb2_fence_get_driver_name,
+	.get_timeline_name = vb2_fence_get_timeline_name,
+	.enable_signaling = vb2_fence_enable_signaling,
+	.wait = dma_fence_default_wait,
+};
+
+static inline struct dma_fence *vb2_fence_alloc(u64 context)
+{
+	struct dma_fence *vb2_fence = kzalloc(sizeof(*vb2_fence), GFP_KERNEL);
+
+	if (!vb2_fence)
+		return NULL;
+
+	dma_fence_init(vb2_fence, &vb2_fence_ops, &vb2_fence_lock, context, 1);
+
+	return vb2_fence;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (7 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17  7:19   ` Alexandre Courbot
  2017-11-15 17:10 ` [RFC v5 10/11] [media] vb2: add out-fence support to QBUF Gustavo Padovan
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

Add vb2_setup_out_fence() and the needed members to struct vb2_buffer.

v3:
	- Do not hold yet another ref to the out_fence (Brian Starkey)

v2:	- change it to reflect fd_install at DQEVENT
	- add fence context for out-fences

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 28 ++++++++++++++++++++++++++++
 include/media/videobuf2-core.h           | 20 ++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 26de4c80717d..8b4f0e9bcb36 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -24,8 +24,10 @@
 #include <linux/freezer.h>
 #include <linux/kthread.h>
 #include <linux/dma-fence-array.h>
+#include <linux/sync_file.h>
 
 #include <media/videobuf2-core.h>
+#include <media/videobuf2-fence.h>
 #include <media/v4l2-mc.h>
 
 #include <trace/events/vb2.h>
@@ -1320,6 +1322,32 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
 }
 EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
 
+int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
+{
+	struct vb2_buffer *vb;
+
+	vb = q->bufs[index];
+
+	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
+
+	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
+	if (!vb->out_fence) {
+		put_unused_fd(vb->out_fence_fd);
+		return -ENOMEM;
+	}
+
+	vb->sync_file = sync_file_create(vb->out_fence);
+	if (!vb->sync_file) {
+		put_unused_fd(vb->out_fence_fd);
+		dma_fence_put(vb->out_fence);
+		vb->out_fence = NULL;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_setup_out_fence);
+
 /**
  * vb2_start_streaming() - Attempt to start streaming.
  * @q:		videobuf2 queue
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 5f48c7be7770..a9b0697bd782 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -259,6 +259,10 @@ struct vb2_buffer {
 	 *			using the buffer (queueing to the driver)
 	 * fence_cb:		fence callback information
 	 * fence_cb_lock:	protect callback signal/remove
+	 * out_fence_fd:	the out_fence_fd to be shared with userspace.
+	 * out_fence:		the out-fence associated with the buffer once
+	 *			it is queued to the driver.
+	 * sync_file:		the sync file to wrap the out fence
 	 */
 	enum vb2_buffer_state	state;
 
@@ -269,6 +273,10 @@ struct vb2_buffer {
 	struct dma_fence_cb	fence_cb;
 	spinlock_t              fence_cb_lock;
 
+	int			out_fence_fd;
+	struct dma_fence	*out_fence;
+	struct sync_file	*sync_file;
+
 #ifdef CONFIG_VIDEO_ADV_DEBUG
 	/*
 	 * Counters for how often these buffer-related ops are
@@ -514,6 +522,7 @@ struct vb2_buf_ops {
  * @last_buffer_dequeued: used in poll() and DQBUF to immediately return if the
  *		last decoded buffer was already dequeued. Set for capture queues
  *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
+ * @out_fence_context: the fence context for the out fences
  * @last_fence:	last in-fence received. Used to keep ordering.
  * @fileio:	file io emulator internal data, used only if emulator is active
  * @threadio:	thread io internal data, used only if thread is active
@@ -569,6 +578,7 @@ struct vb2_queue {
 	unsigned int			copy_timestamp:1;
 	unsigned int			last_buffer_dequeued:1;
 
+	u64				out_fence_context;
 	struct dma_fence		*last_fence;
 
 	struct vb2_fileio_data		*fileio;
@@ -740,6 +750,16 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
 
 /**
+ * vb2_setup_out_fence() - setup new out-fence
+ * @q:		The vb2_queue where to setup it
+ * @index:	index of the buffer
+ *
+ * Set up the file descriptor, the fence and the sync_file for the buffer
+ * at @index and store them in the corresponding struct vb2_buffer.
+ */
+int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index);
+
+/**
  * vb2_core_qbuf() - Queue a buffer from userspace
  *
  * @q:		videobuf2 queue
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v5 10/11] [media] vb2: add out-fence support to QBUF
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (8 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-17  7:38   ` Alexandre Courbot
  2017-11-17 13:34   ` Hans Verkuil
  2017-11-15 17:10 ` [RFC v5 11/11] [media] v4l: Document explicit synchronization behavior Gustavo Padovan
  2017-11-20 10:19 ` [RFC v5 00/11] V4L2 Explicit Synchronization Smitha T Murthy
  11 siblings, 2 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

If the V4L2_BUF_FLAG_OUT_FENCE flag is present on the QBUF call we create
an out-fence and send its fd to userspace in the fence_fd field as a
return arg for the QBUF call.

The fence is signaled on buffer_done(), when the job on the buffer is
finished.

With out-fences we do not allow drivers to requeue buffers through vb2.
Instead we flag an error on the buffer, signal its fence with an error and
return it to userspace.

v6:
	- get rid of the V4L2_EVENT_OUT_FENCE event. We always keep the
	ordering in vb2 for queueing in the driver, so the event is not
	necessary anymore and the out_fence_fd is sent back to userspace
	on QBUF call return arg
	- do not allow requeueing with out-fences, instead mark the buffer
	with an error and wake up userspace.
	- send the out_fence_fd back to userspace on the fence_fd field

v5:
	- delay fd_install to DQ_EVENT (Hans)
	- if queue is fully ordered send OUT_FENCE event right away
	(Brian)
	- rename 'q->ordered' to 'q->ordered_in_driver'
	- merge change to implement OUT_FENCE event here

v4:
	- return the out_fence_fd in the BUF_QUEUED event(Hans)

v3:	- add WARN_ON_ONCE(q->ordered) on requeueing (Hans)
	- set the OUT_FENCE flag if there is a fence pending (Hans)
	- call fd_install() after vb2_core_qbuf() (Hans)
	- clean up fence if vb2_core_qbuf() fails (Hans)
	- add list to store sync_file and fence for the next queued buffer

v2: check if the queue is ordered.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 drivers/media/v4l2-core/videobuf2-core.c | 42 +++++++++++++++++++++++++++++---
 drivers/media/v4l2-core/videobuf2-v4l2.c | 21 +++++++++++++++-
 2 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
index 8b4f0e9bcb36..2eb5ffa8e028 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -354,6 +354,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
 			vb->planes[plane].length = plane_sizes[plane];
 			vb->planes[plane].min_length = plane_sizes[plane];
 		}
+		vb->out_fence_fd = -1;
 		q->bufs[vb->index] = vb;
 
 		/* Allocate video buffer memory for the MMAP type */
@@ -934,10 +935,26 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
 	case VB2_BUF_STATE_QUEUED:
 		return;
 	case VB2_BUF_STATE_REQUEUEING:
-		if (q->start_streaming_called)
-			__enqueue_in_driver(vb);
-		return;
+
+		if (!vb->out_fence) {
+			if (q->start_streaming_called)
+				__enqueue_in_driver(vb);
+			return;
+		}
+
+		/* Do not allow requeuing with explicit synchronization,
+		 * report it as an error to userspace */
+		state = VB2_BUF_STATE_ERROR;
+
+		/* fall through */
 	default:
+		if (state == VB2_BUF_STATE_ERROR)
+			dma_fence_set_error(vb->out_fence, -EFAULT);
+		dma_fence_signal(vb->out_fence);
+		dma_fence_put(vb->out_fence);
+		vb->out_fence = NULL;
+		vb->out_fence_fd = -1;
+
 		/* Inform any processes that may be waiting for buffers */
 		wake_up(&q->done_wq);
 		break;
@@ -1325,12 +1342,18 @@ EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
 int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
 {
 	struct vb2_buffer *vb;
+	u64 context;
 
 	vb = q->bufs[index];
 
 	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
 
-	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
+	if (q->ordered_in_driver)
+		context = q->out_fence_context;
+	else
+		context = dma_fence_context_alloc(1);
+
+	vb->out_fence = vb2_fence_alloc(context);
 	if (!vb->out_fence) {
 		put_unused_fd(vb->out_fence_fd);
 		return -ENOMEM;
@@ -1594,6 +1617,9 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
 	if (pb)
 		call_void_bufop(q, fill_user_buffer, vb, pb);
 
+	fd_install(vb->out_fence_fd, vb->sync_file->file);
+	vb->sync_file = NULL;
+
 	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
 	return 0;
 
@@ -1860,6 +1886,12 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
 	}
 
 	/*
+	 * Renew out-fence context.
+	 */
+	if (q->ordered_in_driver)
+		q->out_fence_context = dma_fence_context_alloc(1);
+
+	/*
 	 * Remove all buffers from videobuf's list...
 	 */
 	INIT_LIST_HEAD(&q->queued_list);
@@ -2191,6 +2223,8 @@ int vb2_core_queue_init(struct vb2_queue *q)
 	spin_lock_init(&q->done_lock);
 	mutex_init(&q->mmap_lock);
 	init_waitqueue_head(&q->done_wq);
+	if (q->ordered_in_driver)
+		q->out_fence_context = dma_fence_context_alloc(1);
 
 	if (q->buf_struct_size == 0)
 		q->buf_struct_size = sizeof(struct vb2_buffer);
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index 4c09ea007d90..f2e60d2908ae 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -212,7 +212,13 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
 	b->sequence = vbuf->sequence;
 	b->reserved = 0;
 
-	b->fence_fd = -1;
+	if (vb->out_fence) {
+		b->flags |= V4L2_BUF_FLAG_OUT_FENCE;
+		b->fence_fd = vb->out_fence_fd;
+	} else {
+		b->fence_fd = -1;
+	}
+
 	if (vb->in_fence)
 		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
 	else
@@ -489,6 +495,10 @@ int vb2_querybuf(struct vb2_queue *q, struct v4l2_buffer *b)
 	ret = __verify_planes_array(vb, b);
 	if (!ret)
 		vb2_core_querybuf(q, b->index, b);
+
+	/* Do not return the out-fence fd on querybuf */
+	if (vb->out_fence)
+		b->fence_fd = -1;
 	return ret;
 }
 EXPORT_SYMBOL(vb2_querybuf);
@@ -593,6 +603,15 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
 		}
 	}
 
+	if (b->flags & V4L2_BUF_FLAG_OUT_FENCE) {
+		ret = vb2_setup_out_fence(q, b->index);
+		if (ret) {
+			dprintk(1, "failed to set up out-fence\n");
+			dma_fence_put(fence);
+			return ret;
+		}
+	}
+
 	return vb2_core_qbuf(q, b->index, b, fence);
 }
 EXPORT_SYMBOL_GPL(vb2_qbuf);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v5 11/11] [media] v4l: Document explicit synchronization behavior
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (9 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 10/11] [media] vb2: add out-fence support to QBUF Gustavo Padovan
@ 2017-11-15 17:10 ` Gustavo Padovan
  2017-11-20 10:19 ` [RFC v5 00/11] V4L2 Explicit Synchronization Smitha T Murthy
  11 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-15 17:10 UTC (permalink / raw)
  To: linux-media
  Cc: Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

From: Gustavo Padovan <gustavo.padovan@collabora.com>

Add a section about explicit synchronization to the VIDIOC_QBUF documentation.

v4:
	- Document ordering behavior for in-fences
	- Document V4L2_CAP_ORDERED capability
	- Remove doc about OUT_FENCE event
	- Document immediate return of out-fence in QBUF

v3:
	- make the out_fence refer to the current buffer (Hans)
	- Note what happens when the IN_FENCE is not set (Hans)

v2:
	- mention that fences are files (Hans)
	- rework for the new API

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
---
 Documentation/media/uapi/v4l/vidioc-qbuf.rst     | 42 +++++++++++++++++++++++-
 Documentation/media/uapi/v4l/vidioc-querybuf.rst |  9 ++++-
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-qbuf.rst b/Documentation/media/uapi/v4l/vidioc-qbuf.rst
index 9e448a4aa3aa..040709256ab6 100644
--- a/Documentation/media/uapi/v4l/vidioc-qbuf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-qbuf.rst
@@ -54,7 +54,7 @@ When the buffer is intended for output (``type`` is
 or ``V4L2_BUF_TYPE_VBI_OUTPUT``) applications must also initialize the
 ``bytesused``, ``field`` and ``timestamp`` fields, see :ref:`buffer`
 for details. Applications must also set ``flags`` to 0. The
-``reserved2`` and ``reserved`` fields must be set to 0. When using the
+``reserved`` field must be set to 0. When using the
 :ref:`multi-planar API <planar-apis>`, the ``m.planes`` field must
 contain a userspace pointer to a filled-in array of struct
 :c:type:`v4l2_plane` and the ``length`` field must be set
@@ -118,6 +118,46 @@ immediately with an ``EAGAIN`` error code when no buffer is available.
 The struct :c:type:`v4l2_buffer` structure is specified in
 :ref:`buffer`.
 
+Explicit Synchronization
+------------------------
+
+Explicit Synchronization allows us to control the synchronization of
+shared buffers from userspace by passing fences to the kernel and/or
+receiving them from it. Fences passed to the kernel are named in-fences and
+the kernel should wait on them to signal before using the buffer, i.e., queueing
+it to the driver. On the other side, the kernel can create out-fences for the
+buffers it queues to the drivers. Out-fences signal when the driver is
+finished with the buffer, i.e., the buffer is ready. Fences are represented
+as files and passed to userspace as file descriptors.
+
+The in-fences are communicated to the kernel at the ``VIDIOC_QBUF`` ioctl
+using the ``V4L2_BUF_FLAG_IN_FENCE`` buffer flag and the ``fence_fd`` field. If
+an in-fence needs to be passed to the kernel, ``fence_fd`` should be set to the
+fence file descriptor number and the ``V4L2_BUF_FLAG_IN_FENCE`` should be set
+as well. Setting one but not the other will cause ``VIDIOC_QBUF`` to return
+with an error. When ``V4L2_BUF_FLAG_IN_FENCE`` is not set, ``fence_fd``
+should be left at 0 or -1.
+
+videobuf2-core guarantees that all buffers queued with an in-fence are queued
+to the driver in the same order as the ``VIDIOC_QBUF`` calls. Fences may signal
+out of order, so videobuf2 needs this guarantee to preserve the ordering.
+
+To get an out-fence back from V4L2 the ``V4L2_BUF_FLAG_OUT_FENCE`` flag should
+be set to ask for a fence to be attached to the buffer. The out-fence fd is
+sent to userspace as a ``VIDIOC_QBUF`` return argument in the ``fence_fd`` field.
+
+Note that the same ``fence_fd`` field is used both for sending the in-fence as
+an input argument and for receiving the out-fence as a return argument.
+
+At streamoff the out-fences will either signal normally if the driver waits
+for the operations on the buffers to finish or signal with an error if the
+driver cancels the pending operations. Buffers with in-fences won't be queued
+to the driver even if their fences signal; they will be cleaned up instead.
+
+A capability flag, ``V4L2_CAP_ORDERED``, tells userspace that the current
+buffer queue is able to keep the ordering of buffers, i.e., the dequeuing of
+buffers will happen in the same order we queue them, with no reordering by
+the driver.
 
 Return Value
 ============
diff --git a/Documentation/media/uapi/v4l/vidioc-querybuf.rst b/Documentation/media/uapi/v4l/vidioc-querybuf.rst
index dd54747fabc9..df964c4d916b 100644
--- a/Documentation/media/uapi/v4l/vidioc-querybuf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-querybuf.rst
@@ -44,7 +44,7 @@ and the ``index`` field. Valid index numbers range from zero to the
 number of buffers allocated with
 :ref:`VIDIOC_REQBUFS` (struct
 :c:type:`v4l2_requestbuffers` ``count``) minus
-one. The ``reserved`` and ``reserved2`` fields must be set to 0. When
+one. The ``reserved`` field must be set to 0. When
 using the :ref:`multi-planar API <planar-apis>`, the ``m.planes``
 field must contain a userspace pointer to an array of struct
 :c:type:`v4l2_plane` and the ``length`` field has to be set
@@ -64,6 +64,13 @@ elements will be used instead and the ``length`` field of struct
 array elements. The driver may or may not set the remaining fields and
 flags, they are meaningless in this context.
 
+When using in-fences, the ``V4L2_BUF_FLAG_IN_FENCE`` flag will be set if the
+in-fence has not signaled at the time of the
+:ref:`VIDIOC_QUERYBUF`. The ``V4L2_BUF_FLAG_OUT_FENCE`` flag will be set if
+the user asked for an out-fence for the buffer, but the ``fence_fd``
+field will be set to -1. In case ``V4L2_BUF_FLAG_OUT_FENCE`` is not set,
+``fence_fd`` will be set to 0 for backward compatibility.
+
 The struct :c:type:`v4l2_buffer` structure is specified in
 :ref:`buffer`.
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues
  2017-11-15 17:10 ` [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues Gustavo Padovan
@ 2017-11-17  5:56   ` Alexandre Courbot
  2017-11-17 11:23     ` Gustavo Padovan
  2017-11-17 12:15   ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  5:56 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On Thursday, November 16, 2017 2:10:49 AM JST, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>
> We use ordered_in_driver property to optimize for the case where
> the driver can deliver the buffers in an ordered fashion. When it
> is ordered we can use the same fence context for all fences, but
> when it is not we need to a new context for each out-fence.

"we need to a new context" looks like it is missing a word.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
@ 2017-11-17  6:49   ` Alexandre Courbot
  2017-11-17 13:00     ` Mauro Carvalho Chehab
  2017-11-17 13:01     ` Gustavo Padovan
  2017-11-17 12:53   ` Mauro Carvalho Chehab
  2017-11-17 14:15   ` Hans Verkuil
  2 siblings, 2 replies; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  6:49 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Hi Gustavo,

I am coming a bit late in this series' review, so apologies if some of my
comments have already been discussed in an earlier revision.

On Thursday, November 16, 2017 2:10:53 AM JST, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>
> Receive in-fences from userspace and add support for waiting on them
> before queueing the buffer to the driver. A buffer can't be queued to the
> driver before its fence signals, and buffers can't be queued to the driver
> out of the order they were queued from userspace. That means that even if
> its fence signals, a buffer must wait for all other buffers ahead of it in
> the queue to signal first.
>
> To make that possible we use fence_array to keep that ordering. Basically
> we create a fence_array that contains both the current fence and the fence
> from the previous buffer (which might be a fence array as well). The base
> fence class for the fence_array becomes the new buffer fence, waiting on
> that one guarantees that it won't be queued out of order.
>
> v6:
> 	- With fences always keep the order userspace queues the buffers.
> 	- Protect in_fence manipulation with a lock (Brian Starkey)
> 	- check if fences have the same context before adding a fence array
> 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
>
> v5:	- use fence_array to keep buffers ordered in vb2 core when
> 	needed (Brian Starkey)
> 	- keep backward compat on the reserved2 field (Brian Starkey)
> 	- protect fence callback removal with lock (Brian Starkey)
>
> v4:
> 	- Add a comment about dma_fence_add_callback() not returning a
> 	error (Hans)
> 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> 	vb2_start_streaming() (Hans)
> 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> 	- Queue buffers to the driver as soon as they are ready (Hans)
> 	- call fill_user_buffer() after queuing the buffer (Hans)
> 	- add err: label to clean up fence
> 	- add dma_fence_wait() before calling vb2_start_streaming()
>
> v3:	- document fence parameter
> 	- remove ternary if at vb2_qbuf() return (Mauro)
> 	- do not change if conditions behaviour (Mauro)
>
> v2:
> 	- fix vb2_queue_or_prepare_buf() ret check
> 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> 	- check num of ready buffers to start streaming
> 	- when queueing, start from the first ready buffer
> 	- handle queue cancel
>
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/Kconfig          |   1 +
>  drivers/media/v4l2-core/videobuf2-core.c | 202 
> ++++++++++++++++++++++++++++---
>  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
>  include/media/videobuf2-core.h           |  17 ++-
>  4 files changed, 231 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/Kconfig 
> b/drivers/media/v4l2-core/Kconfig
> index a35c33686abf..3f988c407c80 100644
> --- a/drivers/media/v4l2-core/Kconfig
> +++ b/drivers/media/v4l2-core/Kconfig
> @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
>  # Used by drivers that need Videobuf2 modules
>  config VIDEOBUF2_CORE
>  	select DMA_SHARED_BUFFER
> +	select SYNC_FILE
>  	tristate
>  
>  config VIDEOBUF2_MEMOPS
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
> b/drivers/media/v4l2-core/videobuf2-core.c
> index 60f8b582396a..26de4c80717d 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/dma-fence-array.h>
>  
>  #include <media/videobuf2-core.h>
>  #include <media/v4l2-mc.h>
> @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct 
> vb2_queue *q, enum vb2_memory memory,
>  		vb->index = q->num_buffers + buffer;
>  		vb->type = q->type;
>  		vb->memory = memory;
> +		spin_lock_init(&vb->fence_cb_lock);
>  		for (plane = 0; plane < num_planes; ++plane) {
>  			vb->planes[plane].length = plane_sizes[plane];
>  			vb->planes[plane].min_length = plane_sizes[plane];
> @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>  {
>  	struct vb2_queue *q = vb->vb2_queue;
>  
> +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> +		return;
> +
>  	vb->state = VB2_BUF_STATE_ACTIVE;
>  	atomic_inc(&q->owned_by_drv_count);
>  
> @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct 
> vb2_buffer *vb, const void *pb)
>  	return 0;
>  }
>  
> +static int __get_num_ready_buffers(struct vb2_queue *q)
> +{
> +	struct vb2_buffer *vb;
> +	int ready_count = 0;
> +	unsigned long flags;
> +
> +	/* count num of buffers ready in front of the queued_list */
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> +			ready_count++;
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	return ready_count;
> +}
> +
>  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>  {
>  	struct vb2_buffer *vb;
> @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
>  	return ret;
>  }
>  
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> +					struct vb2_buffer *vb,
> +					struct dma_fence *fence)
> +{
> +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
> +	/*
> +	 * We always guarantee the ordering of buffers queued from
> +	 * userspace to be the same it is queued to the driver. For that
> +	 * we create a fence array with the fence from the last queued
> +	 * buffer and this one, that way the fence for this buffer can't
> +	 * signal before the last one.
> +	 */
> +	if (fence && q->last_fence) {
> +		struct dma_fence **fences;
> +		struct dma_fence_array *arr;
> +
> +		if (fence->context == q->last_fence->context) {
> +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> +				dma_fence_put(q->last_fence);
> +				q->last_fence = dma_fence_get(fence);
> +			} else {
> +				dma_fence_put(fence);
> +				fence = dma_fence_get(q->last_fence);
> +			}
> +			return fence;
> +		}
> +
> +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> +		if (!fences)
> +			return ERR_PTR(-ENOMEM);
> +
> +		fences[0] = fence;
> +		fences[1] = q->last_fence;
> +
> +		arr = dma_fence_array_create(2, fences,
> +					     dma_fence_context_alloc(1),
> +					     1, false);
> +		if (!arr) {
> +			kfree(fences);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +
> +		fence = &arr->base;
> +
> +		q->last_fence = dma_fence_get(fence);
> +	} else if (!fence && q->last_fence) {
> +		fence = dma_fence_get(q->last_fence);
> +	}
> +
> +	return fence;
> +}
> +
> +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> +	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
> +	struct vb2_queue *q = vb->vb2_queue;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (!vb->in_fence) {
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +		return;
> +	}

Is this block necessary? IIUC the callback will never be set on buffers
without an input fence, so (!vb->in_fence) should never be satisfied.

> +
> +	dma_fence_put(vb->in_fence);
> +	vb->in_fence = NULL;
> +
> +	if (q->start_streaming_called)
> +		__enqueue_in_driver(vb);
> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +}
> +
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence)
>  {
>  	struct vb2_buffer *vb;
> +	unsigned long flags;
>  	int ret;
>  
>  	vb = q->bufs[index];
> @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, 
> unsigned int index, void *pb)
>  	case VB2_BUF_STATE_DEQUEUED:
>  		ret = __buf_prepare(vb, pb);
>  		if (ret)
> -			return ret;
> +			goto err;
>  		break;
>  	case VB2_BUF_STATE_PREPARED:
>  		break;
>  	case VB2_BUF_STATE_PREPARING:
>  		dprintk(1, "buffer still being prepared\n");
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	default:
>  		dprintk(1, "invalid buffer state %d\n", vb->state);
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	}
>  
>  	/*
> @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, 
> unsigned int index, void *pb)
>  
>  	trace_vb2_qbuf(q, vb);
>  
> +	vb->in_fence = __set_in_fence(q, vb, fence);
> +	if (IS_ERR(vb->in_fence)) {
> +		dma_fence_put(fence);
> +		ret = PTR_ERR(vb->in_fence);
> +		goto err;
> +	}
> +	fence = NULL;
> +
> +	/*
> +	 * If it is time to call vb2_start_streaming() wait for the fence
> +	 * to signal first. Of course, this happens only once per streaming.
> +	 * We want to run any step that might fail before we set the callback
> +	 * to queue the fence when it signals.
> +	 */
> +	if (vb->in_fence && !q->start_streaming_called &&
> +	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
> +		dma_fence_wait(vb->in_fence, true);

Mmm, that's a tough call. Userspace may unexpectedly block due to this
(which fences are supposed to prevent), not sure how much of a problem this
may be.

> +
>  	/*
>  	 * If streamon has been called, and we haven't yet called
>  	 * start_streaming() since not enough buffers were queued, and
>  	 * we now have reached the minimum number of queued buffers,
>  	 * then we can finally call start_streaming().
> -	 *
> -	 * If already streaming, give the buffer to driver for processing.
> -	 * If not, the buffer will be given to driver on next streamon.
>  	 */
>  	if (q->streaming && !q->start_streaming_called &&
> -	    q->queued_count >= q->min_buffers_needed) {
> +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {
>  		ret = vb2_start_streaming(q);
>  		if (ret)
> -			return ret;
> -	} else if (q->start_streaming_called) {
> -		__enqueue_in_driver(vb);
> +			goto err;
>  	}
>  
> +	/*
> +	 * For explicit synchronization: If the fence didn't signal
> +	 * yet we setup a callback to queue the buffer once the fence
> +	 * signals, and then, return successfully. But if the fence
> +	 * already signaled we lose the reference we held and queue the
> +	 * buffer to the driver.
> +	 */
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (vb->in_fence) {
> +		ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
> +					     vb2_qbuf_fence_cb);
> +		if (ret == -EINVAL) {
> +			spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +			goto err;
> +		} else if (!ret) {
> +			goto fill;
> +		}
> +
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;

I suppose the last two lines are supposed to be called when
dma_fence_add_callback() returns -ENOENT, because the fence has already been
signaled? In that case I think the code would be more readable if you made
it more explicit, something like:

// fence already signaled?
if (ret == -ENOENT) {
	dma_fence_put(vb->in_fence);
	vb->in_fence = NULL;
} else if (ret != 0) {
	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
	goto err;
}

and that way you can get rid of the short-ranged fill: label.

> +	}
> +
> +fill:
> +	/*
> +	 * If already streaming and there is no fence to wait on
> +	 * give the buffer to driver for processing.
> +	 */
> +	if (q->start_streaming_called && !vb->in_fence)
> +		__enqueue_in_driver(vb);

Since that code was previously called when vb2_start_streaming() wasn't, we
had a guarantee that the buffer was only queued once. Now I wonder if we
could not run into a situation where __enqueue_in_driver() is called twice
for the last queued buffer, once by vb2_start_streaming() and once here, now
that vb2_start_streaming() has set q->start_streaming_called to 1?
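
To make the concern concrete, here is a minimal userspace sketch of one
possible guard, assuming a buffer that is already owned by the driver can be
recognized by its state. The demo_* names are hypothetical stand-ins, not the
vb2 types, and this only models the idea; it is not what the patch does:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vb2 buffer states; not the kernel enum. */
enum demo_buf_state {
	DEMO_BUF_STATE_QUEUED,
	DEMO_BUF_STATE_ACTIVE,
};

struct demo_buffer {
	enum demo_buf_state state;
	int times_given_to_driver;	/* instrumentation for this sketch only */
};

/*
 * Model of an enqueue helper with an extra guard: a buffer that is
 * already ACTIVE (owned by the driver) is not handed over a second time.
 * Returns true if this call handed the buffer to the driver.
 */
static bool demo_enqueue_in_driver(struct demo_buffer *vb)
{
	assert(vb);

	if (vb->state == DEMO_BUF_STATE_ACTIVE)
		return false;

	vb->state = DEMO_BUF_STATE_ACTIVE;
	vb->times_given_to_driver++;
	return true;
}
```

With a guard like this, a second call for the same buffer would become a
no-op rather than handing the buffer to the driver twice.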

> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +
>  	/* Fill buffer information for the userspace */
>  	if (pb)
>  		call_void_bufop(q, fill_user_buffer, vb, pb);
>  
>  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>  	return 0;
> +
> +err:
> +	if (vb->in_fence) {
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;
> +	}
> +
> +	return ret;
> +
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_qbuf);
>  
> @@ -1632,6 +1787,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
>  static void __vb2_queue_cancel(struct vb2_queue *q)
>  {
>  	unsigned int i;
> +	struct vb2_buffer *vb;
> +	unsigned long flags;
>  
>  	/*
>  	 * Tell driver to stop all transactions and release all queued
> @@ -1659,6 +1816,21 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>  	q->queued_count = 0;
>  	q->error = 0;
>  
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (vb->in_fence) {
> +			dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
> +			dma_fence_put(vb->in_fence);
> +			vb->in_fence = NULL;
> +		}
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	if (q->last_fence) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
>  	/*
>  	 * Remove all buffers from videobuf's list...
>  	 */
> @@ -1720,7 +1892,7 @@ int vb2_core_streamon(struct vb2_queue 
> *q, unsigned int type)
>  	 * Tell driver to start streaming provided sufficient buffers
>  	 * are available.
>  	 */
> -	if (q->queued_count >= q->min_buffers_needed) {
> +	if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
>  		ret = v4l_vb2q_enable_media_source(q);
>  		if (ret)
>  			return ret;
> @@ -2240,7 +2412,7 @@ static int __vb2_init_fileio(struct 
> vb2_queue *q, int read)
>  		 * Queue all buffers.
>  		 */
>  		for (i = 0; i < q->num_buffers; i++) {
> -			ret = vb2_core_qbuf(q, i, NULL);
> +			ret = vb2_core_qbuf(q, i, NULL, NULL);
>  			if (ret)
>  				goto err_reqbufs;
>  			fileio->bufs[i].queued = 1;
> @@ -2419,7 +2591,7 @@ static size_t __vb2_perform_fileio(struct 
> vb2_queue *q, char __user *data, size_
>  
>  		if (copy_timestamp)
>  			b->timestamp = ktime_get_ns();
> -		ret = vb2_core_qbuf(q, index, NULL);
> +		ret = vb2_core_qbuf(q, index, NULL, NULL);
>  		dprintk(5, "vb2_dbuf result: %d\n", ret);
>  		if (ret)
>  			return ret;
> @@ -2522,7 +2694,7 @@ static int vb2_thread(void *data)
>  		if (copy_timestamp)
>  			vb->timestamp = ktime_get_ns();;
>  		if (!threadio->stop)
> -			ret = vb2_core_qbuf(q, vb->index, NULL);
> +			ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
>  		call_void_qop(q, wait_prepare, q);
>  		if (ret || threadio->stop)
>  			break;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c 
> b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 110fb45fef6f..4c09ea007d90 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/sync_file.h>
>  
>  #include <media/v4l2-dev.h>
>  #include <media/v4l2-fh.h>
> @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct 
> vb2_queue *q, struct v4l2_buffer *b,
>  		return -EINVAL;
>  	}
>  
> +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&

Why do we need to consider both values invalid? Can 0 ever be a valid fence
fd?

For completeness you may also want to check the case where
(b->fence_fd == -1 && (b->flags & V4L2_BUF_FLAG_IN_FENCE)) and return an
error ahead of time, but I suppose the call to sync_file_get_fence() later
on will catch this as well.
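
For reference, the combined check could be sketched in plain C as below. This
is only a model: DEMO_BUF_FLAG_IN_FENCE and demo_check_in_fence_args() are
hypothetical names, and the flag value is an assumption rather than the one
from the uapi patch. Accepting 0 as "unset" presumably follows from the v5
changelog note about keeping backward compatibility on the reserved2 field.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value, for illustration only; the real one is in the uapi patch. */
#define DEMO_BUF_FLAG_IN_FENCE 0x00200000u

/* Returns 0 if the flag/fd combination is acceptable, -EINVAL otherwise. */
static int demo_check_in_fence_args(uint32_t flags, int32_t fence_fd)
{
	int has_flag = (flags & DEMO_BUF_FLAG_IN_FENCE) != 0;

	/*
	 * Check from the patch: 0 and -1 both mean "no fd given"; any
	 * other value without the IN_FENCE flag is rejected.
	 */
	if (!has_flag && fence_fd != 0 && fence_fd != -1)
		return -EINVAL;

	/* Extra early check: flag set but the fd is explicitly -1. */
	if (has_flag && fence_fd == -1)
		return -EINVAL;

	return 0;
}
```

Whether the second check is worth having depends on whether the later
sync_file_get_fence() failure path is considered sufficient.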

> +	    !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
> +		dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
> +		return -EINVAL;
> +	}
> +
>  	return __verify_planes_array(q->bufs[b->index], b);
>  }
>  
> @@ -203,9 +210,14 @@ static void __fill_v4l2_buffer(struct 
> vb2_buffer *vb, void *pb)
>  	b->timestamp = ns_to_timeval(vb->timestamp);
>  	b->timecode = vbuf->timecode;
>  	b->sequence = vbuf->sequence;
> -	b->fence_fd = -1;
>  	b->reserved = 0;
>  
> +	b->fence_fd = -1;
> +	if (vb->in_fence)
> +		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
> +	else
> +		b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
> +
>  	if (q->is_multiplanar) {
>  		/*
>  		 * Fill in plane-related data if userspace provided an array
> @@ -560,6 +572,7 @@ EXPORT_SYMBOL_GPL(vb2_create_bufs);
>  
>  int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  {
> +	struct dma_fence *fence = NULL;
>  	int ret;
>  
>  	if (vb2_fileio_is_active(q)) {
> @@ -568,7 +581,19 @@ int vb2_qbuf(struct vb2_queue *q, struct 
> v4l2_buffer *b)
>  	}
>  
>  	ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> -	return ret ? ret : vb2_core_qbuf(q, b->index, b);
> +	if (ret)
> +		return ret;
> +
> +	if (b->flags & V4L2_BUF_FLAG_IN_FENCE) {
> +		fence = sync_file_get_fence(b->fence_fd);
> +		if (!fence) {
> +			dprintk(1, "failed to get in-fence from fd %d\n",
> +				b->fence_fd);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return vb2_core_qbuf(q, b->index, b, fence);
>  }
>  EXPORT_SYMBOL_GPL(vb2_qbuf);
>  
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 38b9c8dd42c6..5f48c7be7770 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -16,6 +16,7 @@
>  #include <linux/mutex.h>
>  #include <linux/poll.h>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-fence.h>
>  
>  #define VB2_MAX_FRAME	(32)
>  #define VB2_MAX_PLANES	(8)
> @@ -254,11 +255,20 @@ struct vb2_buffer {
>  	 *			all buffers queued from userspace
>  	 * done_entry:		entry on the list that stores all buffers ready
>  	 *			to be dequeued to userspace
> +	 * in_fence:		fence receive from vb2 client to wait on before
> +	 *			using the buffer (queueing to the driver)
> +	 * fence_cb:		fence callback information
> +	 * fence_cb_lock:	protect callback signal/remove
>  	 */
>  	enum vb2_buffer_state	state;
>  
>  	struct list_head	queued_entry;
>  	struct list_head	done_entry;
> +
> +	struct dma_fence	*in_fence;
> +	struct dma_fence_cb	fence_cb;
> +	spinlock_t              fence_cb_lock;
> +
>  #ifdef CONFIG_VIDEO_ADV_DEBUG
>  	/*
>  	 * Counters for how often these buffer-related ops are
> @@ -504,6 +514,7 @@ struct vb2_buf_ops {
>   * @last_buffer_dequeued: used in poll() and DQBUF to 
> immediately return if the
>   *		last decoded buffer was already dequeued. Set for capture queues
>   *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
> + * @last_fence:	last in-fence received. Used to keep ordering.
>   * @fileio:	file io emulator internal data, used only if emulator is active
>   * @threadio:	thread io internal data, used only if thread is active
>   */
> @@ -558,6 +569,8 @@ struct vb2_queue {
>  	unsigned int			copy_timestamp:1;
>  	unsigned int			last_buffer_dequeued:1;
>  
> +	struct dma_fence		*last_fence;
> +
>  	struct vb2_fileio_data		*fileio;
>  	struct vb2_threadio_data	*threadio;
>  
> @@ -733,6 +746,7 @@ int vb2_core_prepare_buf(struct vb2_queue 
> *q, unsigned int index, void *pb);
>   * @index:	id number of the buffer
>   * @pb:		buffer structure passed from userspace to vidioc_qbuf handler
>   *		in driver
> + * @fence:	in-fence to wait on before queueing the buffer
>   *
>   * Should be called from vidioc_qbuf ioctl handler of a driver.
>   * The passed buffer should have been verified.
> @@ -747,7 +761,8 @@ int vb2_core_prepare_buf(struct vb2_queue 
> *q, unsigned int index, void *pb);
>   * The return values from this function are intended to be 
> directly returned
>   * from vidioc_qbuf handler in driver.
>   */
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence);
>  
>  /**
>   * vb2_core_dqbuf() - Dequeue a buffer to the userspace

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers
  2017-11-15 17:10 ` [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers Gustavo Padovan
@ 2017-11-17  7:02   ` Alexandre Courbot
  2017-11-17  7:11     ` Alexandre Courbot
  0 siblings, 1 reply; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  7:02 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Javier Martinez Canillas

On Thursday, November 16, 2017 2:10:54 AM JST, Gustavo Padovan wrote:
> From: Javier Martinez Canillas <javier@osg.samsung.com>
>
> Add a videobuf2-fence.h header file that contains different helpers
> for DMA buffer sharing explicit fence support in videobuf2.
>
> v2:	- use fence context provided by the caller in vb2_fence_alloc()
>
> Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  include/media/videobuf2-fence.h | 48 
> +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
>  create mode 100644 include/media/videobuf2-fence.h
>
> diff --git a/include/media/videobuf2-fence.h 
> b/include/media/videobuf2-fence.h
> new file mode 100644
> index 000000000000..b49cc1bf6bb4
> --- /dev/null
> +++ b/include/media/videobuf2-fence.h
> @@ -0,0 +1,48 @@
> +/*
> + * videobuf2-fence.h - DMA buffer sharing fence helpers for videobuf 2
> + *
> + * Copyright (C) 2016 Samsung Electronics
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation.
> + */
> +
> +#include <linux/dma-fence.h>
> +#include <linux/slab.h>
> +
> +static DEFINE_SPINLOCK(vb2_fence_lock);
> +
> +static inline const char *vb2_fence_get_driver_name(struct 
> dma_fence *fence)
> +{
> +	return "vb2_fence";
> +}
> +
> +static inline const char *vb2_fence_get_timeline_name(struct 
> dma_fence *fence)
> +{
> +	return "vb2_fence_timeline";
> +}
> +
> +static inline bool vb2_fence_enable_signaling(struct dma_fence *fence)
> +{
> +	return true;
> +}
> +
> +static const struct dma_fence_ops vb2_fence_ops = {
> +	.get_driver_name = vb2_fence_get_driver_name,
> +	.get_timeline_name = vb2_fence_get_timeline_name,
> +	.enable_signaling = vb2_fence_enable_signaling,
> +	.wait = dma_fence_default_wait,
> +};

It is probably not a good idea to define that struct here since it will be
duplicated in every source file that includes it.

Maybe change it to a simple declaration, and move the definition to
videobuf2-core.c or a dedicated videobuf2-fence.c file?
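
A rough single-file sketch of that split, assuming a simplified ops struct
(the demo_* names are hypothetical and only mimic the shape of
vb2_fence_ops): the header would carry the extern declaration, and exactly
one .c file would carry the definition.

```c
#include <assert.h>
#include <string.h>

/* ---- would live in the header (e.g. videobuf2-fence.h) ---- */
struct demo_fence_ops {
	const char *(*get_driver_name)(void);
	const char *(*get_timeline_name)(void);
};

/* Declaration only: including files share one object, with no per-file copy. */
extern const struct demo_fence_ops demo_fence_ops;

/* ---- would live in exactly one .c file ---- */
static const char *demo_get_driver_name(void)
{
	return "vb2_fence";
}

static const char *demo_get_timeline_name(void)
{
	return "vb2_fence_timeline";
}

/* The single definition matching the extern declaration above. */
const struct demo_fence_ops demo_fence_ops = {
	.get_driver_name = demo_get_driver_name,
	.get_timeline_name = demo_get_timeline_name,
};

/* Small accessor so callers go through the shared ops table. */
static const char *demo_fence_driver_name(const struct demo_fence_ops *ops)
{
	assert(ops && ops->get_driver_name);
	return ops->get_driver_name();
}
```

The same would apply to the spinlock defined with DEFINE_SPINLOCK() in the
header, since a static definition there also ends up once per including file.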

> +
> +static inline struct dma_fence *vb2_fence_alloc(u64 context)
> +{
> +	struct dma_fence *vb2_fence = kzalloc(sizeof(*vb2_fence), GFP_KERNEL);
> +
> +	if (!vb2_fence)
> +		return NULL;
> +
> +	dma_fence_init(vb2_fence, &vb2_fence_ops, &vb2_fence_lock, context, 1);
> +
> +	return vb2_fence;
> +}

Not sure we gain a lot by having this function static inline, but your call.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers
  2017-11-17  7:02   ` Alexandre Courbot
@ 2017-11-17  7:11     ` Alexandre Courbot
  2017-11-17 11:27       ` Gustavo Padovan
  0 siblings, 1 reply; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  7:11 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Javier Martinez Canillas

On Friday, November 17, 2017 4:02:56 PM JST, Alexandre Courbot wrote:
> On Thursday, November 16, 2017 2:10:54 AM JST, Gustavo Padovan wrote:
>> From: Javier Martinez Canillas <javier@osg.samsung.com>
>> 
>> Add a videobuf2-fence.h header file that contains different helpers
>> for DMA buffer sharing explicit fence support in videobuf2.
>> 
>> v2:	- use fence context provided by the caller in vb2_fence_alloc()
>> 
>> Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> ...
>
> It is probably not a good idea to define that struct here since it will be
> duplicated in every source file that includes it.
>
> Maybe change it to a simple declaration, and move the definition to
> videobuf2-core.c or a dedicated videobuf2-fence.c file?
>
>> +
>> +static inline struct dma_fence *vb2_fence_alloc(u64 context)
>> +{
>> +	struct dma_fence *vb2_fence = kzalloc(sizeof(*vb2_fence), GFP_KERNEL);
>> + ...
>
> Not sure we gain a lot by having this function static inline, but your call.
>

Looking at the following patch, since it seems that this function is only to
be called from vb2_setup_out_fence() anyway, you may as well make it static
in videobuf2-core.c or even inline it in vb2_setup_out_fence() - it would
only add a few extra lines to this function.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences
  2017-11-15 17:10 ` [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences Gustavo Padovan
@ 2017-11-17  7:19   ` Alexandre Courbot
  2017-11-17  7:29     ` Alexandre Courbot
  2017-11-17 11:30     ` Gustavo Padovan
  0 siblings, 2 replies; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  7:19 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On Thursday, November 16, 2017 2:10:55 AM JST, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>
> Add vb2_setup_out_fence() and the needed members to struct vb2_buffer.
>
> v3:
> 	- Do not hold yet another ref to the out_fence (Brian Starkey)
>
> v2:	- change it to reflect fd_install at DQEVENT
> 	- add fence context for out-fences
>
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/videobuf2-core.c | 28 ++++++++++++++++++++++++++++
>  include/media/videobuf2-core.h           | 20 ++++++++++++++++++++
>  2 files changed, 48 insertions(+)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
> b/drivers/media/v4l2-core/videobuf2-core.c
> index 26de4c80717d..8b4f0e9bcb36 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -24,8 +24,10 @@
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
>  #include <linux/dma-fence-array.h>
> +#include <linux/sync_file.h>
>  
>  #include <media/videobuf2-core.h>
> +#include <media/videobuf2-fence.h>
>  #include <media/v4l2-mc.h>
>  
>  #include <trace/events/vb2.h>
> @@ -1320,6 +1322,32 @@ int vb2_core_prepare_buf(struct 
> vb2_queue *q, unsigned int index, void *pb)
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
>  
> +int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
> +{
> +	struct vb2_buffer *vb;
> +
> +	vb = q->bufs[index];
> +
> +	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);

out_fence_fd is allocated in this patch but not used anywhere for the moment.
For consistency, maybe move its allocation to the next patch, or move the
call to fd_install() here if that is possible? In both cases, the call to
get_unused_fd_flags() can be moved right before fd_install() so you don't
need to call put_unused_fd() in the error paths below.

... same thing for sync_file too. Maybe this patch can just be merged into
the next one? The current patch just creates an incomplete version of
vb2_setup_out_fence() for which no user exists yet.

> +
> +	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
> +	if (!vb->out_fence) {
> +		put_unused_fd(vb->out_fence_fd);
> +		return -ENOMEM;
> +	}
> +
> +	vb->sync_file = sync_file_create(vb->out_fence);
> +	if (!vb->sync_file) {
> +		put_unused_fd(vb->out_fence_fd);
> +		dma_fence_put(vb->out_fence);
> +		vb->out_fence = NULL;
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vb2_setup_out_fence);
> +
>  /**
>   * vb2_start_streaming() - Attempt to start streaming.
>   * @q:		videobuf2 queue
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 5f48c7be7770..a9b0697bd782 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -259,6 +259,10 @@ struct vb2_buffer {
>  	 *			using the buffer (queueing to the driver)
>  	 * fence_cb:		fence callback information
>  	 * fence_cb_lock:	protect callback signal/remove
> +	 * out_fence_fd:	the out_fence_fd to be shared with userspace.
> +	 * out_fence:		the out-fence associated with the buffer once
> +	 *			it is queued to the driver.
> +	 * sync_file:		the sync file to wrap the out fence
>  	 */
>  	enum vb2_buffer_state	state;
>  
> @@ -269,6 +273,10 @@ struct vb2_buffer {
>  	struct dma_fence_cb	fence_cb;
>  	spinlock_t              fence_cb_lock;
>  
> +	int			out_fence_fd;
> +	struct dma_fence	*out_fence;
> +	struct sync_file	*sync_file;
> +
>  #ifdef CONFIG_VIDEO_ADV_DEBUG
>  	/*
>  	 * Counters for how often these buffer-related ops are
> @@ -514,6 +522,7 @@ struct vb2_buf_ops {
>   * @last_buffer_dequeued: used in poll() and DQBUF to 
> immediately return if the
>   *		last decoded buffer was already dequeued. Set for capture queues
>   *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
> + * @out_fence_context: the fence context for the out fences
>   * @last_fence:	last in-fence received. Used to keep ordering.
>   * @fileio:	file io emulator internal data, used only if emulator is active
>   * @threadio:	thread io internal data, used only if thread is active
> @@ -569,6 +578,7 @@ struct vb2_queue {
>  	unsigned int			copy_timestamp:1;
>  	unsigned int			last_buffer_dequeued:1;
>  
> +	u64				out_fence_context;
>  	struct dma_fence		*last_fence;
>  
>  	struct vb2_fileio_data		*fileio;
> @@ -740,6 +750,16 @@ int vb2_core_create_bufs(struct vb2_queue 
> *q, enum vb2_memory memory,
>  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int 
> index, void *pb);
>  
>  /**
> + * vb2_setup_out_fence() - setup new out-fence
> + * @q:		The vb2_queue where to setup it
> + * @index:	index of the buffer
> + *
> + * Setup the file descriptor, the fence and the sync_file for the next
> + * buffer to be queued and add everything to the tail of the 
> q->out_fence_list.
> + */
> +int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index);
> +
> +/**
>   * vb2_core_qbuf() - Queue a buffer from userspace
>   *
>   * @q:		videobuf2 queue


* Re: [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences
  2017-11-17  7:19   ` Alexandre Courbot
@ 2017-11-17  7:29     ` Alexandre Courbot
  2017-11-17 11:30     ` Gustavo Padovan
  1 sibling, 0 replies; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  7:29 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On Friday, November 17, 2017 4:19:00 PM JST, Alexandre Courbot wrote:
> On Thursday, November 16, 2017 2:10:55 AM JST, Gustavo Padovan wrote:
>> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>> 
>> Add vb2_setup_out_fence() and the needed members to struct vb2_buffer.
>> 
>> v3:
>> 	- Do not hold yet another ref to the out_fence (Brian Starkey)
>> 
>> v2:	- change it to reflect fd_install at DQEVENT ...
>
> out_fence_fd is allocated in this patch but not used anywhere 
> for the moment.
> For consistency, maybe move its allocation to the next patch, 
> or move the call
> to fd_install() here if that is possible? In both cases, the 
> call to get_unused_fd() can be moved right before fd_install() 
> so you don't need to
> call put_unused_fd() in the error paths below.

Aha, just realized that fd_install() was called in qbuf() :) Other comments 
probably still hold though.

>
> ... same thing for sync_file too. Maybe this patch can just be merged into
> the next one? The current patch just creates an incomplete 
> version of vb2_setup_out_fence() for which no user exist yet.
>
>> +
>> +	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
>> +	if (!vb->out_fence) {
>> +		put_unused_fd(vb->out_fence_fd);
>> +		return -ENOMEM; ...
>
>


* Re: [RFC v5 10/11] [media] vb2: add out-fence support to QBUF
  2017-11-15 17:10 ` [RFC v5 10/11] [media] vb2: add out-fence support to QBUF Gustavo Padovan
@ 2017-11-17  7:38   ` Alexandre Courbot
  2017-11-17 11:48     ` Gustavo Padovan
  2017-11-17 13:34   ` Hans Verkuil
  1 sibling, 1 reply; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-17  7:38 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On Thursday, November 16, 2017 2:10:56 AM JST, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>
> If V4L2_BUF_FLAG_OUT_FENCE flag is present on the QBUF call we create
> an out_fence and send its fd to userspace on the fence_fd field as a
> return arg for the QBUF call.
>
> The fence is signaled on buffer_done(), when the job on the buffer is
> finished.
>
> With out-fences we do not allow drivers to requeue buffers through vb2,
> instead we flag an error on the buffer, signals it fence with error and

s/it/its

> return it to userspace.
>
> v6
> 	- get rid of the V4L2_EVENT_OUT_FENCE event. We always keep the
> 	ordering in vb2 for queueing in the driver, so the event is not
> 	necessary anymore and the out_fence_fd is sent back to userspace
> 	on QBUF call return arg
> 	- do not allow requeueing with out-fences, instead mark the buffer
> 	with an error and wake up to userspace.
> 	- send the out_fence_fd back to userspace on the fence_fd field
>
> v5:
> 	- delay fd_install to DQ_EVENT (Hans)
> 	- if queue is fully ordered send OUT_FENCE event right away
> 	(Brian)
> 	- rename 'q->ordered' to 'q->ordered_in_driver'
> 	- merge change to implement OUT_FENCE event here
>
> v4:
> 	- return the out_fence_fd in the BUF_QUEUED event(Hans)
>
> v3:	- add WARN_ON_ONCE(q->ordered) on requeueing (Hans)
> 	- set the OUT_FENCE flag if there is a fence pending (Hans)
> 	- call fd_install() after vb2_core_qbuf() (Hans)
> 	- clean up fence if vb2_core_qbuf() fails (Hans)
> 	- add list to store sync_file and fence for the next queued buffer
>
> v2: check if the queue is ordered.
>
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/videobuf2-core.c | 42 
> +++++++++++++++++++++++++++++---
>  drivers/media/v4l2-core/videobuf2-v4l2.c | 21 +++++++++++++++-
>  2 files changed, 58 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
> b/drivers/media/v4l2-core/videobuf2-core.c
> index 8b4f0e9bcb36..2eb5ffa8e028 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -354,6 +354,7 @@ static int __vb2_queue_alloc(struct 
> vb2_queue *q, enum vb2_memory memory,
>  			vb->planes[plane].length = plane_sizes[plane];
>  			vb->planes[plane].min_length = plane_sizes[plane];
>  		}
> +		vb->out_fence_fd = -1;
>  		q->bufs[vb->index] = vb;
>  
>  		/* Allocate video buffer memory for the MMAP type */
> @@ -934,10 +935,26 @@ void vb2_buffer_done(struct vb2_buffer 
> *vb, enum vb2_buffer_state state)
>  	case VB2_BUF_STATE_QUEUED:
>  		return;
>  	case VB2_BUF_STATE_REQUEUEING:
> -		if (q->start_streaming_called)
> -			__enqueue_in_driver(vb);
> -		return;
> +
> +		if (!vb->out_fence) {
> +			if (q->start_streaming_called)
> +				__enqueue_in_driver(vb);
> +			return;
> +		}
> +
> +		/* Do not allow requeuing with explicit synchronization,
> +		 * report it as an error to userspace */
> +		state = VB2_BUF_STATE_ERROR;
> +
> +		/* fall through */
>  	default:
> +		if (state == VB2_BUF_STATE_ERROR)
> +			dma_fence_set_error(vb->out_fence, -EFAULT);
> +		dma_fence_signal(vb->out_fence);
> +		dma_fence_put(vb->out_fence);
> +		vb->out_fence = NULL;
> +		vb->out_fence_fd = -1;
> +
>  		/* Inform any processes that may be waiting for buffers */
>  		wake_up(&q->done_wq);
>  		break;
> @@ -1325,12 +1342,18 @@ EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
>  int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
>  {
>  	struct vb2_buffer *vb;
> +	u64 context;
>  
>  	vb = q->bufs[index];
>  
>  	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
>  
> -	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
> +	if (q->ordered_in_driver)
> +		context = q->out_fence_context;
> +	else
> +		context = dma_fence_context_alloc(1);
> +
> +	vb->out_fence = vb2_fence_alloc(context);
>  	if (!vb->out_fence) {
>  		put_unused_fd(vb->out_fence_fd);
>  		return -ENOMEM;
> @@ -1594,6 +1617,9 @@ int vb2_core_qbuf(struct vb2_queue *q, 
> unsigned int index, void *pb,
>  	if (pb)
>  		call_void_bufop(q, fill_user_buffer, vb, pb);
>  
> +	fd_install(vb->out_fence_fd, vb->sync_file->file);

What happens if the buffer does not have an output fence? Shouldn't this be 
protected by a check on whether the buffer has been queued with 
V4L2_BUF_FLAG_OUT_FENCE?

(... or maybe move this to vb2_setup_out_fence() after all, where we know 
for sure an output fence exists).
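
A minimal userspace sketch of the kind of guard I have in mind. It is not
kernel code: the mock_* names are invented for illustration and
mock_fd_install() only records which descriptor was published. The check
on sync_file assumes that pointer is only set when an out-fence was
requested for the buffer.

```c
#include <stddef.h>

/* Mock types: only the fields needed to show the guard. */
struct mock_sync_file { int file; };
struct mock_buffer {
	int out_fence_fd;
	struct mock_sync_file *sync_file;
};

int installed_fd = -1;	/* last descriptor passed to mock_fd_install() */

/* Stand-in for fd_install(): just remembers the descriptor. */
void mock_fd_install(int fd, int file)
{
	(void)file;
	installed_fd = fd;
}

/* Publish the descriptor only when an out-fence was set up.
 * Returns 1 if a descriptor was installed, 0 otherwise. */
int mock_publish_out_fence(struct mock_buffer *vb)
{
	if (!vb->sync_file)
		return 0;	/* queued without an out-fence: nothing to do */

	mock_fd_install(vb->out_fence_fd, vb->sync_file->file);
	vb->sync_file = NULL;	/* the descriptor now refers to the file */
	return 1;
}
```

Without the check, a buffer queued without the flag would dereference a
NULL sync_file pointer.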

> +	vb->sync_file = NULL;
> +
>  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>  	return 0;
>  
> @@ -1860,6 +1886,12 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>  	}
>  
>  	/*
> +	 * Renew out-fence context.
> +	 */
> +	if (q->ordered_in_driver)
> +		q->out_fence_context = dma_fence_context_alloc(1);
> +
> +	/*
>  	 * Remove all buffers from videobuf's list...
>  	 */
>  	INIT_LIST_HEAD(&q->queued_list);
> @@ -2191,6 +2223,8 @@ int vb2_core_queue_init(struct vb2_queue *q)
>  	spin_lock_init(&q->done_lock);
>  	mutex_init(&q->mmap_lock);
>  	init_waitqueue_head(&q->done_wq);
> +	if (q->ordered_in_driver)
> +		q->out_fence_context = dma_fence_context_alloc(1);
>  
>  	if (q->buf_struct_size == 0)
>  		q->buf_struct_size = sizeof(struct vb2_buffer);
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c 
> b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 4c09ea007d90..f2e60d2908ae 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -212,7 +212,13 @@ static void __fill_v4l2_buffer(struct 
> vb2_buffer *vb, void *pb)
>  	b->sequence = vbuf->sequence;
>  	b->reserved = 0;
>  
> -	b->fence_fd = -1;
> +	if (vb->out_fence) {
> +		b->flags |= V4L2_BUF_FLAG_OUT_FENCE;
> +		b->fence_fd = vb->out_fence_fd;
> +	} else {
> +		b->fence_fd = -1;
> +	}
> +
>  	if (vb->in_fence)
>  		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
>  	else
> @@ -489,6 +495,10 @@ int vb2_querybuf(struct vb2_queue *q, 
> struct v4l2_buffer *b)
>  	ret = __verify_planes_array(vb, b);
>  	if (!ret)
>  		vb2_core_querybuf(q, b->index, b);
> +
> +	/* Do not return the out-fence fd on querybuf */
> +	if (vb->out_fence)
> +		b->fence_fd = -1;
>  	return ret;
>  }
>  EXPORT_SYMBOL(vb2_querybuf);
> @@ -593,6 +603,15 @@ int vb2_qbuf(struct vb2_queue *q, struct 
> v4l2_buffer *b)
>  		}
>  	}
>  
> +	if (b->flags & V4L2_BUF_FLAG_OUT_FENCE) {
> +		ret = vb2_setup_out_fence(q, b->index);
> +		if (ret) {
> +			dprintk(1, "failed to set up out-fence\n");
> +			dma_fence_put(fence);
> +			return ret;
> +		}
> +	}
> +
>  	return vb2_core_qbuf(q, b->index, b, fence);
>  }
>  EXPORT_SYMBOL_GPL(vb2_qbuf);


* Re: [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues
  2017-11-17  5:56   ` Alexandre Courbot
@ 2017-11-17 11:23     ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 11:23 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Alexandre Courbot <acourbot@chromium.org>:

> On Thursday, November 16, 2017 2:10:49 AM JST, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > We use ordered_in_driver property to optimize for the case where
> > the driver can deliver the buffers in an ordered fashion. When it
> > is ordered we can use the same fence context for all fences, but
> > when it is not we need to a new context for each out-fence.
> 
> "we need to a new context" looks like it is missing a word.

oh

"we need to create a new context" 

Thanks for reviewing the patches!

Gustavo


* Re: [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers
  2017-11-17  7:11     ` Alexandre Courbot
@ 2017-11-17 11:27       ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 11:27 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Javier Martinez Canillas

2017-11-17 Alexandre Courbot <acourbot@chromium.org>:

> On Friday, November 17, 2017 4:02:56 PM JST, Alexandre Courbot wrote:
> > On Thursday, November 16, 2017 2:10:54 AM JST, Gustavo Padovan wrote:
> > > From: Javier Martinez Canillas <javier@osg.samsung.com>
> > > 
> > > Add a videobuf2-fence.h header file that contains different helpers
> > > for DMA buffer sharing explicit fence support in videobuf2.
> > > 
> > > v2:	- use fence context provided by the caller in vb2_fence_alloc()
> > > 
> > > Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> ...
> > 
> > It is probably not a good idea to define that struct here since it will be
> > deduplicated for every source file that includes it.
> > 
> > Maybe change it to a simple declaration, and move the definition to
> > videobuf2-core.c or a dedicated videobuf2-fence.c file?
> > 
> > > +
> > > +static inline struct dma_fence *vb2_fence_alloc(u64 context)
> > > +{
> > > +	struct dma_fence *vb2_fence = kzalloc(sizeof(*vb2_fence), GFP_KERNEL);
> > > + ...
> > 
> > Not sure we gain a lot by having this function static inline, but your call.
> > 
> 
> Looking at the following patch, since it seems that this function is only to
> be
> called from vb2_setup_out_fence() anyway, you may as well make it static in
> videobuf2-core.c or even inline it in vb2_setup_out_fence() - it would only
> add a few extra lines to this function.

Okay, that makes sense. I'll move everything to videobuf2-core.c.


* Re: [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences
  2017-11-17  7:19   ` Alexandre Courbot
  2017-11-17  7:29     ` Alexandre Courbot
@ 2017-11-17 11:30     ` Gustavo Padovan
  1 sibling, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 11:30 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Alexandre Courbot <acourbot@chromium.org>:

> On Thursday, November 16, 2017 2:10:55 AM JST, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > Add vb2_setup_out_fence() and the needed members to struct vb2_buffer.
> > 
> > v3:
> > 	- Do not hold yet another ref to the out_fence (Brian Starkey)
> > 
> > v2:	- change it to reflect fd_install at DQEVENT
> > 	- add fence context for out-fences
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  drivers/media/v4l2-core/videobuf2-core.c | 28 ++++++++++++++++++++++++++++
> >  include/media/videobuf2-core.h           | 20 ++++++++++++++++++++
> >  2 files changed, 48 insertions(+)
> > 
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c
> > b/drivers/media/v4l2-core/videobuf2-core.c
> > index 26de4c80717d..8b4f0e9bcb36 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -24,8 +24,10 @@
> >  #include <linux/freezer.h>
> >  #include <linux/kthread.h>
> >  #include <linux/dma-fence-array.h>
> > +#include <linux/sync_file.h>
> >  #include <media/videobuf2-core.h>
> > +#include <media/videobuf2-fence.h>
> >  #include <media/v4l2-mc.h>
> >  #include <trace/events/vb2.h>
> > @@ -1320,6 +1322,32 @@ int vb2_core_prepare_buf(struct vb2_queue *q,
> > unsigned int index, void *pb)
> >  }
> >  EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
> > +int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
> > +{
> > +	struct vb2_buffer *vb;
> > +
> > +	vb = q->bufs[index];
> > +
> > +	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
> 
> out_fence_fd is allocated in this patch but not used anywhere for the
> moment.
> For consistency, maybe move its allocation to the next patch, or move the
> call
> to fd_install() here if that is possible? In both cases, the call to
> get_unused_fd() can be moved right before fd_install() so you don't need to
> call put_unused_fd() in the error paths below.
> 
> ... same thing for sync_file too. Maybe this patch can just be merged into
> the next one? The current patch just creates an incomplete version of
> vb2_setup_out_fence() for which no user exist yet.

It turned out that the out-fences patch is not big at all, so I can merge
them both. I think it will be cleaner, thanks for the suggestion.

Gustavo


* Re: [RFC v5 10/11] [media] vb2: add out-fence support to QBUF
  2017-11-17  7:38   ` Alexandre Courbot
@ 2017-11-17 11:48     ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 11:48 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Alexandre Courbot <acourbot@chromium.org>:

> On Thursday, November 16, 2017 2:10:56 AM JST, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > If V4L2_BUF_FLAG_OUT_FENCE flag is present on the QBUF call we create
> > an out_fence and send its fd to userspace on the fence_fd field as a
> > return arg for the QBUF call.
> > 
> > The fence is signaled on buffer_done(), when the job on the buffer is
> > finished.
> > 
> > With out-fences we do not allow drivers to requeue buffers through vb2,
> > instead we flag an error on the buffer, signals it fence with error and
> 
> s/it/its
> 
> > return it to userspace.
> > 
> > v6
> > 	- get rid of the V4L2_EVENT_OUT_FENCE event. We always keep the
> > 	ordering in vb2 for queueing in the driver, so the event is not
> > 	necessary anymore and the out_fence_fd is sent back to userspace
> > 	on QBUF call return arg
> > 	- do not allow requeueing with out-fences, instead mark the buffer
> > 	with an error and wake up to userspace.
> > 	- send the out_fence_fd back to userspace on the fence_fd field
> > 
> > v5:
> > 	- delay fd_install to DQ_EVENT (Hans)
> > 	- if queue is fully ordered send OUT_FENCE event right away
> > 	(Brian)
> > 	- rename 'q->ordered' to 'q->ordered_in_driver'
> > 	- merge change to implement OUT_FENCE event here
> > 
> > v4:
> > 	- return the out_fence_fd in the BUF_QUEUED event(Hans)
> > 
> > v3:	- add WARN_ON_ONCE(q->ordered) on requeueing (Hans)
> > 	- set the OUT_FENCE flag if there is a fence pending (Hans)
> > 	- call fd_install() after vb2_core_qbuf() (Hans)
> > 	- clean up fence if vb2_core_qbuf() fails (Hans)
> > 	- add list to store sync_file and fence for the next queued buffer
> > 
> > v2: check if the queue is ordered.
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  drivers/media/v4l2-core/videobuf2-core.c | 42
> > +++++++++++++++++++++++++++++---
> >  drivers/media/v4l2-core/videobuf2-v4l2.c | 21 +++++++++++++++-
> >  2 files changed, 58 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c
> > b/drivers/media/v4l2-core/videobuf2-core.c
> > index 8b4f0e9bcb36..2eb5ffa8e028 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -354,6 +354,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q,
> > enum vb2_memory memory,
> >  			vb->planes[plane].length = plane_sizes[plane];
> >  			vb->planes[plane].min_length = plane_sizes[plane];
> >  		}
> > +		vb->out_fence_fd = -1;
> >  		q->bufs[vb->index] = vb;
> >  		/* Allocate video buffer memory for the MMAP type */
> > @@ -934,10 +935,26 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum
> > vb2_buffer_state state)
> >  	case VB2_BUF_STATE_QUEUED:
> >  		return;
> >  	case VB2_BUF_STATE_REQUEUEING:
> > -		if (q->start_streaming_called)
> > -			__enqueue_in_driver(vb);
> > -		return;
> > +
> > +		if (!vb->out_fence) {
> > +			if (q->start_streaming_called)
> > +				__enqueue_in_driver(vb);
> > +			return;
> > +		}
> > +
> > +		/* Do not allow requeuing with explicit synchronization,
> > +		 * report it as an error to userspace */
> > +		state = VB2_BUF_STATE_ERROR;
> > +
> > +		/* fall through */
> >  	default:
> > +		if (state == VB2_BUF_STATE_ERROR)
> > +			dma_fence_set_error(vb->out_fence, -EFAULT);
> > +		dma_fence_signal(vb->out_fence);
> > +		dma_fence_put(vb->out_fence);
> > +		vb->out_fence = NULL;
> > +		vb->out_fence_fd = -1;
> > +
> >  		/* Inform any processes that may be waiting for buffers */
> >  		wake_up(&q->done_wq);
> >  		break;
> > @@ -1325,12 +1342,18 @@ EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
> >  int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
> >  {
> >  	struct vb2_buffer *vb;
> > +	u64 context;
> >  	vb = q->bufs[index];
> >  	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
> > -	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
> > +	if (q->ordered_in_driver)
> > +		context = q->out_fence_context;
> > +	else
> > +		context = dma_fence_context_alloc(1);
> > +
> > +	vb->out_fence = vb2_fence_alloc(context);
> >  	if (!vb->out_fence) {
> >  		put_unused_fd(vb->out_fence_fd);
> >  		return -ENOMEM;
> > @@ -1594,6 +1617,9 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned
> > int index, void *pb,
> >  	if (pb)
> >  		call_void_bufop(q, fill_user_buffer, vb, pb);
> > +	fd_install(vb->out_fence_fd, vb->sync_file->file);
> 
> What happens if the buffer does not have an output fence? Shouldn't this be
> protected by a check on whether the buffer has been queued with
> V4L2_BUF_FLAG_OUT_FENCE?

Ouch, I think we would have a nice crash. We should protect it, yes.


Gustavo


* Re: [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi
  2017-11-15 17:10 ` [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi Gustavo Padovan
@ 2017-11-17 11:57   ` Mauro Carvalho Chehab
  2017-11-17 12:23     ` Gustavo Padovan
  0 siblings, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 11:57 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Wed, 15 Nov 2017 15:10:47 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> When using explicit synchronization userspace needs to know if
> the queue can deliver everything back in the same order, so we added
> a new capability that drivers can use to report that they are capable
> of keeping ordering.
> 
> In videobuf2 core when using fences we also make sure to keep the ordering
> of buffers, so if the driver guarantees it too the whole pipeline inside
> V4L2 will be ordered and the V4L2_CAP_ORDERED should be used.
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  Documentation/media/uapi/v4l/vidioc-querycap.rst | 3 +++
>  include/uapi/linux/videodev2.h                   | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/vidioc-querycap.rst b/Documentation/media/uapi/v4l/vidioc-querycap.rst
> index 66fb1b3d6e6e..ed3daa814da9 100644
> --- a/Documentation/media/uapi/v4l/vidioc-querycap.rst
> +++ b/Documentation/media/uapi/v4l/vidioc-querycap.rst
> @@ -254,6 +254,9 @@ specification the ioctl returns an ``EINVAL`` error code.
>      * - ``V4L2_CAP_TOUCH``
>        - 0x10000000
>        - This is a touch device.
> +    * - ``V4L2_CAP_ORDERED``
> +      - 0x20000000
> +      - The device queue is ordered.
>      * - ``V4L2_CAP_DEVICE_CAPS``
>        - 0x80000000
>        - The driver fills the ``device_caps`` field. This capability can
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 185d6a0acc06..cd6fc1387f47 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -459,6 +459,7 @@ struct v4l2_capability {
>  #define V4L2_CAP_STREAMING              0x04000000  /* streaming I/O ioctls */
>  
>  #define V4L2_CAP_TOUCH                  0x10000000  /* Is a touch device */
> +#define V4L2_CAP_ORDERED                0x20000000  /* Is the device queue ordered */

I guess we discussed that at the Linux Media summit.
The problem with making it a global flag is that drivers may guarantee
ordering for only some of their formats. E.g., a driver that
delivers both MPEG and RGB output formats may deliver ordered
buffers for RGB, and unordered ones for MPEG.

So, instead of adding a global flag to v4l2_capability, it is probably
better to use the flags field of struct v4l2_fmtdesc.

That would allow userspace to know in advance which formats support
it, by calling VIDIOC_ENUM_FMT.

Regards,
Mauro


* Re: [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues
  2017-11-15 17:10 ` [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues Gustavo Padovan
  2017-11-17  5:56   ` Alexandre Courbot
@ 2017-11-17 12:15   ` Mauro Carvalho Chehab
  2017-11-17 12:27     ` Gustavo Padovan
  1 sibling, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 12:15 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Wed, 15 Nov 2017 15:10:49 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> We use ordered_in_driver property to optimize for the case where
> the driver can deliver the buffers in an ordered fashion. When it
> is ordered we can use the same fence context for all fences, but
> when it is not we need to a new context for each out-fence.
> 
> So the ordered_in_driver flag will help us with identifying the queues
> that can be optimized and use the same fence context.
> 
> v4: make the property a vector for optimization and not a mandatory thing
> that drivers need to set if they want to use explicit synchronization.
> 
> v3: improve doc (Hans Verkuil)
> 
> v2: rename property to 'ordered_in_driver' to avoid confusion
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  include/media/videobuf2-core.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index ef9b64398c8c..38b9c8dd42c6 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -440,6 +440,12 @@ struct vb2_buf_ops {
>   * @fileio_read_once:		report EOF after reading the first buffer
>   * @fileio_write_immediately:	queue buffer after each write() call
>   * @allow_zero_bytesused:	allow bytesused == 0 to be passed to the driver
> + * @ordered_in_driver: if the driver can guarantee that the queue will be
> + *		ordered or not, i.e., the buffers are dequeued from the driver
> + *		in the same order they are queued to the driver. The default
> + *		is not ordered unless the driver sets this flag. Setting it
> + *		when ordering can be guaranted helps to optimize explicit
> + *		fences.
>   * @quirk_poll_must_check_waiting_for_buffers: Return POLLERR at poll when QBUF
>   *              has not been called. This is a vb1 idiom that has been adopted
>   *              also by vb2.
> @@ -510,6 +516,7 @@ struct vb2_queue {
>  	unsigned			fileio_read_once:1;
>  	unsigned			fileio_write_immediately:1;
>  	unsigned			allow_zero_bytesused:1;
> +	unsigned			ordered_in_driver:1;

As this may depend on the format, it is probably a good idea to set
this flag either via a function argument or via a dedicated function
that updates it when the video format changes.
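
Something along these lines, as a userspace mock rather than real vb2
code. All mock_* names are invented for illustration;
mock_context_alloc() stands in for dma_fence_context_alloc(1), and the
setter is a hypothetical helper that a driver would call from its
format-setting path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock queue: only the fields relevant to fence-context selection. */
struct mock_queue {
	bool ordered_in_driver;
	uint64_t out_fence_context;
};

uint64_t next_context = 1;

/* Stand-in for dma_fence_context_alloc(1): returns a fresh context id. */
uint64_t mock_context_alloc(void)
{
	return next_context++;
}

/* Hypothetical helper a driver could call whenever the format changes. */
void mock_queue_set_ordered(struct mock_queue *q, bool ordered)
{
	q->ordered_in_driver = ordered;
	if (ordered)
		q->out_fence_context = mock_context_alloc();
}

/* Context for the next out-fence: shared if ordered, fresh otherwise. */
uint64_t mock_out_fence_context(struct mock_queue *q)
{
	return q->ordered_in_driver ? q->out_fence_context
				    : mock_context_alloc();
}
```

With a setter like this the property follows the currently selected
format instead of being fixed once at queue initialization.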

>  	unsigned		   quirk_poll_must_check_waiting_for_buffers:1;
>  
>  	struct mutex			*lock;


-- 
Thanks,
Mauro


* Re: [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi
  2017-11-17 11:57   ` Mauro Carvalho Chehab
@ 2017-11-17 12:23     ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 12:23 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:

> Em Wed, 15 Nov 2017 15:10:47 -0200
> Gustavo Padovan <gustavo@padovan.org> escreveu:
> 
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > When using explicit synchronization userspace needs to know if
> > the queue can deliver everything back in the same order, so we added
> > a new capability that drivers can use to report that they are capable
> > of keeping ordering.
> > 
> > In videobuf2 core when using fences we also make sure to keep the ordering
> > of buffers, so if the driver guarantees it too the whole pipeline inside
> > V4L2 will be ordered and the V4L2_CAP_ORDERED should be used.
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  Documentation/media/uapi/v4l/vidioc-querycap.rst | 3 +++
> >  include/uapi/linux/videodev2.h                   | 1 +
> >  2 files changed, 4 insertions(+)
> > 
> > diff --git a/Documentation/media/uapi/v4l/vidioc-querycap.rst b/Documentation/media/uapi/v4l/vidioc-querycap.rst
> > index 66fb1b3d6e6e..ed3daa814da9 100644
> > --- a/Documentation/media/uapi/v4l/vidioc-querycap.rst
> > +++ b/Documentation/media/uapi/v4l/vidioc-querycap.rst
> > @@ -254,6 +254,9 @@ specification the ioctl returns an ``EINVAL`` error code.
> >      * - ``V4L2_CAP_TOUCH``
> >        - 0x10000000
> >        - This is a touch device.
> > +    * - ``V4L2_CAP_ORDERED``
> > +      - 0x20000000
> > +      - The device queue is ordered.
> >      * - ``V4L2_CAP_DEVICE_CAPS``
> >        - 0x80000000
> >        - The driver fills the ``device_caps`` field. This capability can
> > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > index 185d6a0acc06..cd6fc1387f47 100644
> > --- a/include/uapi/linux/videodev2.h
> > +++ b/include/uapi/linux/videodev2.h
> > @@ -459,6 +459,7 @@ struct v4l2_capability {
> >  #define V4L2_CAP_STREAMING              0x04000000  /* streaming I/O ioctls */
> >  
> >  #define V4L2_CAP_TOUCH                  0x10000000  /* Is a touch device */
> > +#define V4L2_CAP_ORDERED                0x20000000  /* Is the device queue ordered */
> 
> I guess we discussed that at the Linux Media summit.
> The problem of making it a global flag is that drivers may support
> ordering only for some of the formats. E.g., a driver that
> delivers both MPEG and RGB output formats may deliver ordered
> buffers for RGB, and unordered ones for MPEG.
> 
> So, instead of adding a global flag to v4l2_capability, it is probably
> better to use the flags field in struct v4l2_fmtdesc.
> 
> That would allow userspace to know in advance what formats support
> it, by calling VIDIOC_ENUM_FMT.

Thanks for clarifying, Mauro. I'll work on making it part of
v4l2_fmtdesc.
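
For reference, a minimal sketch of how userspace could consume such a
per-format flag. This is only an illustration under assumptions: the
flag name SKETCH_FMT_FLAG_ORDERED and its value are placeholders (no
such bit exists in videodev2.h), and a stand-in struct is used instead
of the real struct v4l2_fmtdesc so the snippet builds without a device.
The array stands for what a VIDIOC_ENUM_FMT loop would have collected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: name and value are placeholders, not uAPI. */
#define SKETCH_FMT_FLAG_ORDERED 0x0004

/* Stand-in for the two fields of struct v4l2_fmtdesc used here. */
struct sketch_fmtdesc {
	uint32_t pixelformat;
	uint32_t flags;
};

/*
 * Walk a list of enumerated formats and report whether @pixelformat
 * is delivered in order. Returns false when the format is not listed.
 */
bool sketch_format_is_ordered(const struct sketch_fmtdesc *fmts,
			      size_t count, uint32_t pixelformat)
{
	size_t i;

	assert(fmts || count == 0);

	for (i = 0; i < count; i++) {
		if (fmts[i].pixelformat == pixelformat)
			return (fmts[i].flags & SKETCH_FMT_FLAG_ORDERED) != 0;
	}
	return false;
}
```

With something like this, an application would know before the first
QBUF whether it can rely on ordering for the format it selected.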

Gustavo


* Re: [RFC v5 06/11] [media] vb2: add explicit fence user API
  2017-11-15 17:10 ` [RFC v5 06/11] [media] vb2: add explicit fence user API Gustavo Padovan
@ 2017-11-17 12:25   ` Mauro Carvalho Chehab
  2017-11-17 13:29   ` Hans Verkuil
  1 sibling, 0 replies; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 12:25 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Wed, 15 Nov 2017 15:10:52 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> Turn the reserved2 field into fence_fd that we will use to send
> an in-fence to the kernel and return an out-fence from the kernel to
> userspace.
> 
> Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
> when sending a fence to the kernel to be waited on, and
> V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
> 
> v4:
> 	- make it a union with reserved2 and fence_fd (Hans Verkuil)
> 
> v3:
> 	- make the out_fence refer to the current buffer (Hans Verkuil)
> 
> v2: add documentation
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  Documentation/media/uapi/v4l/buffer.rst       | 15 +++++++++++++++
>  drivers/media/usb/cpia2/cpia2_v4l.c           |  2 +-
>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  4 ++--
>  drivers/media/v4l2-core/videobuf2-v4l2.c      |  2 +-
>  include/uapi/linux/videodev2.h                |  7 ++++++-
>  5 files changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
> index ae6ee73f151c..eeefbd2547e7 100644
> --- a/Documentation/media/uapi/v4l/buffer.rst
> +++ b/Documentation/media/uapi/v4l/buffer.rst
> @@ -648,6 +648,21 @@ Buffer Flags
>        - Start Of Exposure. The buffer timestamp has been taken when the
>  	exposure of the frame has begun. This is only valid for the
>  	``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
> +    * .. _`V4L2-BUF-FLAG-IN-FENCE`:
> +
> +      - ``V4L2_BUF_FLAG_IN_FENCE``
> +      - 0x00200000
> +      - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
> +	won't be queued to the driver until the fence signals.
> +
> +    * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
> +
> +      - ``V4L2_BUF_FLAG_OUT_FENCE``
> +      - 0x00400000
> +      - Request a fence to be attached to the buffer. The ``fence_fd``
> +	field on
> +	:ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
> +	fd to userspace.
>  
>  
>  
> diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
> index 3dedd83f0b19..6cde686bf44c 100644
> --- a/drivers/media/usb/cpia2/cpia2_v4l.c
> +++ b/drivers/media/usb/cpia2/cpia2_v4l.c
> @@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
>  	buf->sequence = cam->buffers[buf->index].seq;
>  	buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
>  	buf->length = cam->frame_size;
> -	buf->reserved2 = 0;
> +	buf->fence_fd = -1;
>  	buf->reserved = 0;
>  	memset(&buf->timecode, 0, sizeof(buf->timecode));
>  
> diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> index 821f2aa299ae..3a31d318df2a 100644
> --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> @@ -370,7 +370,7 @@ struct v4l2_buffer32 {
>  		__s32		fd;
>  	} m;
>  	__u32			length;
> -	__u32			reserved2;
> +	__s32			fence_fd;
>  	__u32			reserved;
>  };
>  
> @@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
>  		put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
>  		copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
>  		put_user(kp->sequence, &up->sequence) ||
> -		put_user(kp->reserved2, &up->reserved2) ||
> +		put_user(kp->fence_fd, &up->fence_fd) ||
>  		put_user(kp->reserved, &up->reserved) ||
>  		put_user(kp->length, &up->length))
>  			return -EFAULT;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 0c0669976bdc..110fb45fef6f 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>  	b->timestamp = ns_to_timeval(vb->timestamp);
>  	b->timecode = vbuf->timecode;
>  	b->sequence = vbuf->sequence;
> -	b->reserved2 = 0;
> +	b->fence_fd = -1;
>  	b->reserved = 0;
>  
>  	if (q->is_multiplanar) {
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index cd6fc1387f47..311c3a331eda 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -925,7 +925,10 @@ struct v4l2_buffer {
>  		__s32		fd;
>  	} m;
>  	__u32			length;
> -	__u32			reserved2;
> +	union {
> +		__s32		fence_fd;

You forgot to document fence_fd in the uAPI book. In particular,
it should be noted there that a non-positive value means no fence.

(as userspace should be setting reserved2 to zero, I'm assuming
that we'll likely not accept 0 here).

You should likely also remove the description of reserved2 from
Documentation/media/uapi/v4l/buffer.rst.
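
To illustrate why 0 has to be read as "no fence": with the union, a
legacy application that zeroes reserved2 cannot be told apart from one
passing fd 0. A minimal sketch, using a stand-in struct rather than the
real struct v4l2_buffer; the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the tail of struct v4l2_buffer after this patch. */
struct sketch_buffer_tail {
	union {
		int32_t  fence_fd;
		uint32_t reserved2;
	};
	uint32_t reserved;
};

/*
 * Hypothetical helper: only strictly positive values name a fence.
 * 0 is what legacy userspace writes into reserved2 and -1 is what
 * the kernel fills in on return, so neither can mean "wait on this".
 */
bool sketch_has_fence_fd(const struct sketch_buffer_tail *b)
{
	assert(b);
	return b->fence_fd > 0;
}
```

The struct size does not change, since both union members are 32 bits
wide.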


> +		__u32		reserved2;
> +	};
>  	__u32			reserved;
>  };
>  
> @@ -962,6 +965,8 @@ struct v4l2_buffer {
>  #define V4L2_BUF_FLAG_TSTAMP_SRC_SOE		0x00010000
>  /* mem2mem encoder/decoder */
>  #define V4L2_BUF_FLAG_LAST			0x00100000
> +#define V4L2_BUF_FLAG_IN_FENCE			0x00200000
> +#define V4L2_BUF_FLAG_OUT_FENCE			0x00400000
>  
>  /**
>   * struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor


-- 
Thanks,
Mauro


* Re: [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues
  2017-11-17 12:15   ` Mauro Carvalho Chehab
@ 2017-11-17 12:27     ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 12:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Hi Mauro,

2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:

> Em Wed, 15 Nov 2017 15:10:49 -0200
> Gustavo Padovan <gustavo@padovan.org> escreveu:
> 
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > We use ordered_in_driver property to optimize for the case where
> > the driver can deliver the buffers in an ordered fashion. When it
> > is ordered we can use the same fence context for all fences, but
> > when it is not we need a new context for each out-fence.
> > 
> > So the ordered_in_driver flag will help us with identifying the queues
> > that can be optimized and use the same fence context.
> > 
> > v4: make the property a vector for optimization and not a mandatory thing
> > that drivers need to set if they want to use explicit synchronization.
> > 
> > v3: improve doc (Hans Verkuil)
> > 
> > v2: rename property to 'ordered_in_driver' to avoid confusion
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  include/media/videobuf2-core.h | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> > index ef9b64398c8c..38b9c8dd42c6 100644
> > --- a/include/media/videobuf2-core.h
> > +++ b/include/media/videobuf2-core.h
> > @@ -440,6 +440,12 @@ struct vb2_buf_ops {
> >   * @fileio_read_once:		report EOF after reading the first buffer
> >   * @fileio_write_immediately:	queue buffer after each write() call
> >   * @allow_zero_bytesused:	allow bytesused == 0 to be passed to the driver
> > + * @ordered_in_driver: if the driver can guarantee that the queue will be
> > + *		ordered or not, i.e., the buffers are dequeued from the driver
> > + *		in the same order they are queued to the driver. The default
> > + *		is not ordered unless the driver sets this flag. Setting it
> > + *		when ordering can be guaranted helps to optimize explicit
> > + *		fences.
> >   * @quirk_poll_must_check_waiting_for_buffers: Return POLLERR at poll when QBUF
> >   *              has not been called. This is a vb1 idiom that has been adopted
> >   *              also by vb2.
> > @@ -510,6 +516,7 @@ struct vb2_queue {
> >  	unsigned			fileio_read_once:1;
> >  	unsigned			fileio_write_immediately:1;
> >  	unsigned			allow_zero_bytesused:1;
> > +	unsigned			ordered_in_driver:1;
> 
> As this may depend on the format, it is probably a good idea to set
> this flag either via a function argument or by a function that
> would be meant to update it, as video format changes.

Right, and maybe I can find a way to store this in only one place,
instead of what I did here (having to explicitly set both
ordered_in_driver and the flag separately).
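
Something along these lines, perhaps. This is only a sketch under
assumptions: both helper names are hypothetical, the flag name and
value are placeholders, and the struct is a stand-in for vb2_queue.
The driver updates one bit when the format changes and the
userspace-visible flag is derived from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-format flag; placeholder name and value. */
#define SKETCH_FMT_FLAG_ORDERED 0x0004

/* Stand-in for struct vb2_queue; only the bit discussed here. */
struct sketch_queue {
	unsigned int ordered_in_driver:1;
};

/*
 * Hypothetical helper a driver would call from its set-format path,
 * so the ordering property follows the currently selected format.
 */
void sketch_queue_set_ordered(struct sketch_queue *q, bool ordered)
{
	assert(q);
	q->ordered_in_driver = ordered;
}

/*
 * Derive the userspace-visible flag from the queue state instead of
 * having the driver set it a second time.
 */
uint32_t sketch_fmt_flags(const struct sketch_queue *q, uint32_t flags)
{
	assert(q);
	if (q->ordered_in_driver)
		flags |= SKETCH_FMT_FLAG_ORDERED;
	else
		flags &= ~(uint32_t)SKETCH_FMT_FLAG_ORDERED;
	return flags;
}
```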

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
  2017-11-17  6:49   ` Alexandre Courbot
@ 2017-11-17 12:53   ` Mauro Carvalho Chehab
  2017-11-17 13:12     ` Gustavo Padovan
  2017-11-17 14:15   ` Hans Verkuil
  2 siblings, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 12:53 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Wed, 15 Nov 2017 15:10:53 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> Receive in-fences from userspace and add support for waiting on them
> before queueing the buffer to the driver. A buffer can't be queued to the
> driver before its fence signals, and it can't be queued to the driver
> out of the order in which it was queued from userspace. That means that
> even if its fence signals, it must wait for all other buffers ahead of it
> in the queue to signal first.
> 
> To make that possible we use fence_array to keep that ordering. Basically
> we create a fence_array that contains both the current fence and the fence
> from the previous buffer (which might be a fence array as well). The base
> fence class for the fence_array becomes the new buffer fence, waiting on
> that one guarantees that it won't be queued out of order.
> 
> v6:
> 	- With fences always keep the order userspace queues the buffers.
> 	- Protect in_fence manipulation with a lock (Brian Starkey)
> 	- check if fences have the same context before adding a fence array
> 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> 
> v5:	- use fence_array to keep buffers ordered in vb2 core when
> 	needed (Brian Starkey)
> 	- keep backward compat on the reserved2 field (Brian Starkey)
> 	- protect fence callback removal with lock (Brian Starkey)
> 
> v4:
> 	- Add a comment about dma_fence_add_callback() not returning a
> 	error (Hans)
> 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> 	vb2_start_streaming() (Hans)
> 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> 	- Queue buffers to the driver as soon as they are ready (Hans)
> 	- call fill_user_buffer() after queuing the buffer (Hans)
> 	- add err: label to clean up fence
> 	- add dma_fence_wait() before calling vb2_start_streaming()
> 
> v3:	- document fence parameter
> 	- remove ternary if at vb2_qbuf() return (Mauro)
> 	- do not change if conditions behaviour (Mauro)
> 
> v2:
> 	- fix vb2_queue_or_prepare_buf() ret check
> 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> 	- check num of ready buffers to start streaming
> 	- when queueing, start from the first ready buffer
> 	- handle queue cancel
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/Kconfig          |   1 +
>  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
>  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
>  include/media/videobuf2-core.h           |  17 ++-
>  4 files changed, 231 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> index a35c33686abf..3f988c407c80 100644
> --- a/drivers/media/v4l2-core/Kconfig
> +++ b/drivers/media/v4l2-core/Kconfig
> @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
>  # Used by drivers that need Videobuf2 modules
>  config VIDEOBUF2_CORE
>  	select DMA_SHARED_BUFFER
> +	select SYNC_FILE
>  	tristate
>  
>  config VIDEOBUF2_MEMOPS
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index 60f8b582396a..26de4c80717d 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/dma-fence-array.h>
>  
>  #include <media/videobuf2-core.h>
>  #include <media/v4l2-mc.h>
> @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  		vb->index = q->num_buffers + buffer;
>  		vb->type = q->type;
>  		vb->memory = memory;
> +		spin_lock_init(&vb->fence_cb_lock);
>  		for (plane = 0; plane < num_planes; ++plane) {
>  			vb->planes[plane].length = plane_sizes[plane];
>  			vb->planes[plane].min_length = plane_sizes[plane];
> @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>  {
>  	struct vb2_queue *q = vb->vb2_queue;
>  
> +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> +		return;
> +
>  	vb->state = VB2_BUF_STATE_ACTIVE;
>  	atomic_inc(&q->owned_by_drv_count);
>  
> @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
>  	return 0;
>  }
>  
> +static int __get_num_ready_buffers(struct vb2_queue *q)
> +{
> +	struct vb2_buffer *vb;
> +	int ready_count = 0;
> +	unsigned long flags;
> +
> +	/* count num of buffers ready in front of the queued_list */
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> +			ready_count++;
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	return ready_count;
> +}
> +
>  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>  {
>  	struct vb2_buffer *vb;
> @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
>  	return ret;
>  }
>  
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> +					struct vb2_buffer *vb,
> +					struct dma_fence *fence)
> +{
> +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
> +	/*
> +	 * We always guarantee the ordering of buffers queued from
> +	 * userspace to be the same it is queued to the driver. For that
> +	 * we create a fence array with the fence from the last queued
> +	 * buffer and this one, that way the fence for this buffer can't
> +	 * signal before the last one.
> +	 */
> +	if (fence && q->last_fence) {
> +		struct dma_fence **fences;
> +		struct dma_fence_array *arr;
> +
> +		if (fence->context == q->last_fence->context) {
> +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> +				dma_fence_put(q->last_fence);
> +				q->last_fence = dma_fence_get(fence);
> +			} else {
> +				dma_fence_put(fence);
> +				fence = dma_fence_get(q->last_fence);
> +			}
> +			return fence;
> +		}
> +
> +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> +		if (!fences)
> +			return ERR_PTR(-ENOMEM);
> +
> +		fences[0] = fence;
> +		fences[1] = q->last_fence;
> +
> +		arr = dma_fence_array_create(2, fences,
> +					     dma_fence_context_alloc(1),
> +					     1, false);
> +		if (!arr) {
> +			kfree(fences);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +
> +		fence = &arr->base;
> +
> +		q->last_fence = dma_fence_get(fence);
> +	} else if (!fence && q->last_fence) {
> +		fence = dma_fence_get(q->last_fence);
> +	}
> +
> +	return fence;
> +}
> +
> +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> +	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
> +	struct vb2_queue *q = vb->vb2_queue;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (!vb->in_fence) {
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +		return;
> +	}
> +
> +	dma_fence_put(vb->in_fence);
> +	vb->in_fence = NULL;
> +
> +	if (q->start_streaming_called)
> +		__enqueue_in_driver(vb);
> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +}
> +
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence)
>  {
>  	struct vb2_buffer *vb;
> +	unsigned long flags;
>  	int ret;
>  
>  	vb = q->bufs[index];
> @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>  	case VB2_BUF_STATE_DEQUEUED:
>  		ret = __buf_prepare(vb, pb);
>  		if (ret)
> -			return ret;
> +			goto err;
>  		break;
>  	case VB2_BUF_STATE_PREPARED:
>  		break;
>  	case VB2_BUF_STATE_PREPARING:
>  		dprintk(1, "buffer still being prepared\n");
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	default:
>  		dprintk(1, "invalid buffer state %d\n", vb->state);
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	}
>  
>  	/*
> @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>  
>  	trace_vb2_qbuf(q, vb);
>  
> +	vb->in_fence = __set_in_fence(q, vb, fence);
> +	if (IS_ERR(vb->in_fence)) {
> +		dma_fence_put(fence);
> +		ret = PTR_ERR(vb->in_fence);
> +		goto err;
> +	}
> +	fence = NULL;
> +
> +	/*
> +	 * If it is time to call vb2_start_streaming() wait for the fence
> +	 * to signal first. Of course, this happens only once per streaming.
> +	 * We want to run any step that might fail before we set the callback
> +	 * to queue the fence when it signals.
> +	 */
> +	if (vb->in_fence && !q->start_streaming_called &&
> +	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
> +		dma_fence_wait(vb->in_fence, true);
> +

The above code sounds weird: it is called even if the user is not
using fences.

You should probably be doing something like:

	if (fence || q->last_fence) {
		vb->in_fence = __set_in_fence(q, vb, fence);
		...
	}

>  	/*
>  	 * If streamon has been called, and we haven't yet called
>  	 * start_streaming() since not enough buffers were queued, and
>  	 * we now have reached the minimum number of queued buffers,
>  	 * then we can finally call start_streaming().
> -	 *
> -	 * If already streaming, give the buffer to driver for processing.
> -	 * If not, the buffer will be given to driver on next streamon.
>  	 */
>  	if (q->streaming && !q->start_streaming_called &&
> -	    q->queued_count >= q->min_buffers_needed) {
> +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {

I guess the case where fences are not used is not covered here.

You should probably add a check to __get_num_ready_buffers(q)
as well, making it just return q->queued_count if fences aren't
used.
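
Roughly like the sketch below. It is not the vb2 code: the structs are
stand-ins, the fences_in_use field is a hypothetical way of tracking
whether any fence was ever queued, and the per-buffer locking done in
the patch is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the vb2 structures; only what the sketch needs. */
struct sketch_buf {
	bool has_in_fence;	/* models vb->in_fence != NULL */
	bool fence_signaled;	/* models dma_fence_is_signaled() */
};

struct sketch_queue {
	const struct sketch_buf *queued;	/* models q->queued_list */
	unsigned int queued_count;		/* models q->queued_count */
	bool fences_in_use;			/* hypothetical field */
};

unsigned int sketch_num_ready_buffers(const struct sketch_queue *q)
{
	unsigned int i, ready = 0;

	assert(q);

	/* No fences on this queue: every queued buffer is ready. */
	if (!q->fences_in_use)
		return q->queued_count;

	/* Otherwise count buffers with no fence or a signaled fence. */
	for (i = 0; i < q->queued_count; i++) {
		const struct sketch_buf *vb = &q->queued[i];

		if (!vb->has_in_fence || vb->fence_signaled)
			ready++;
	}
	return ready;
}
```

That keeps the non-fence path identical to the old queued_count
comparison.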


>  		ret = vb2_start_streaming(q);
>  		if (ret)
> -			return ret;
> -	} else if (q->start_streaming_called) {
> -		__enqueue_in_driver(vb);
> +			goto err;
>  	}
>  
> +	/*
> +	 * For explicit synchronization: If the fence didn't signal
> +	 * yet we setup a callback to queue the buffer once the fence
> +	 * signals, and then, return successfully. But if the fence
> +	 * already signaled we lose the reference we held and queue the
> +	 * buffer to the driver.
> +	 */
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (vb->in_fence) {
> +		ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
> +					     vb2_qbuf_fence_cb);
> +		if (ret == -EINVAL) {
> +			spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +			goto err;
> +		} else if (!ret) {
> +			goto fill;
> +		}
> +
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;
> +	}
> +
> +fill:
> +	/*
> +	 * If already streaming and there is no fence to wait on
> +	 * give the buffer to driver for processing.
> +	 */
> +	if (q->start_streaming_called && !vb->in_fence)
> +		__enqueue_in_driver(vb);
> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +
>  	/* Fill buffer information for the userspace */
>  	if (pb)
>  		call_void_bufop(q, fill_user_buffer, vb, pb);
>  
>  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>  	return 0;
> +
> +err:
> +	if (vb->in_fence) {
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;
> +	}
> +
> +	return ret;
> +
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_qbuf);
>  
> @@ -1632,6 +1787,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
>  static void __vb2_queue_cancel(struct vb2_queue *q)
>  {
>  	unsigned int i;
> +	struct vb2_buffer *vb;
> +	unsigned long flags;
>  
>  	/*
>  	 * Tell driver to stop all transactions and release all queued
> @@ -1659,6 +1816,21 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>  	q->queued_count = 0;
>  	q->error = 0;
>  
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (vb->in_fence) {
> +			dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
> +			dma_fence_put(vb->in_fence);
> +			vb->in_fence = NULL;
> +		}
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	if (q->last_fence) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
>  	/*
>  	 * Remove all buffers from videobuf's list...
>  	 */
> @@ -1720,7 +1892,7 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
>  	 * Tell driver to start streaming provided sufficient buffers
>  	 * are available.
>  	 */
> -	if (q->queued_count >= q->min_buffers_needed) {
> +	if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
>  		ret = v4l_vb2q_enable_media_source(q);
>  		if (ret)
>  			return ret;
> @@ -2240,7 +2412,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>  		 * Queue all buffers.
>  		 */
>  		for (i = 0; i < q->num_buffers; i++) {
> -			ret = vb2_core_qbuf(q, i, NULL);
> +			ret = vb2_core_qbuf(q, i, NULL, NULL);
>  			if (ret)
>  				goto err_reqbufs;
>  			fileio->bufs[i].queued = 1;
> @@ -2419,7 +2591,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
>  
>  		if (copy_timestamp)
>  			b->timestamp = ktime_get_ns();
> -		ret = vb2_core_qbuf(q, index, NULL);
> +		ret = vb2_core_qbuf(q, index, NULL, NULL);
>  		dprintk(5, "vb2_dbuf result: %d\n", ret);
>  		if (ret)
>  			return ret;
> @@ -2522,7 +2694,7 @@ static int vb2_thread(void *data)
>  		if (copy_timestamp)
>  			vb->timestamp = ktime_get_ns();;
>  		if (!threadio->stop)
> -			ret = vb2_core_qbuf(q, vb->index, NULL);
> +			ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
>  		call_void_qop(q, wait_prepare, q);
>  		if (ret || threadio->stop)
>  			break;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 110fb45fef6f..4c09ea007d90 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/sync_file.h>
>  
>  #include <media/v4l2-dev.h>
>  #include <media/v4l2-fh.h>
> @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
>  		return -EINVAL;
>  	}
>  
> +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&
> +	    !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
> +		dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
> +		return -EINVAL;
> +	}
> +
>  	return __verify_planes_array(q->bufs[b->index], b);
>  }
>  
> @@ -203,9 +210,14 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>  	b->timestamp = ns_to_timeval(vb->timestamp);
>  	b->timecode = vbuf->timecode;
>  	b->sequence = vbuf->sequence;
> -	b->fence_fd = -1;
>  	b->reserved = 0;
>  
> +	b->fence_fd = -1;
> +	if (vb->in_fence)
> +		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
> +	else
> +		b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
> +
>  	if (q->is_multiplanar) {
>  		/*
>  		 * Fill in plane-related data if userspace provided an array
> @@ -560,6 +572,7 @@ EXPORT_SYMBOL_GPL(vb2_create_bufs);
>  
>  int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  {
> +	struct dma_fence *fence = NULL;
>  	int ret;
>  
>  	if (vb2_fileio_is_active(q)) {
> @@ -568,7 +581,19 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  	}
>  
>  	ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> -	return ret ? ret : vb2_core_qbuf(q, b->index, b);
> +	if (ret)
> +		return ret;
> +
> +	if (b->flags & V4L2_BUF_FLAG_IN_FENCE) {
> +		fence = sync_file_get_fence(b->fence_fd);
> +		if (!fence) {
> +			dprintk(1, "failed to get in-fence from fd %d\n",
> +				b->fence_fd);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return vb2_core_qbuf(q, b->index, b, fence);
>  }
>  EXPORT_SYMBOL_GPL(vb2_qbuf);
>  
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 38b9c8dd42c6..5f48c7be7770 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -16,6 +16,7 @@
>  #include <linux/mutex.h>
>  #include <linux/poll.h>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-fence.h>
>  
>  #define VB2_MAX_FRAME	(32)
>  #define VB2_MAX_PLANES	(8)
> @@ -254,11 +255,20 @@ struct vb2_buffer {
>  	 *			all buffers queued from userspace
>  	 * done_entry:		entry on the list that stores all buffers ready
>  	 *			to be dequeued to userspace
> +	 * in_fence:		fence receive from vb2 client to wait on before
> +	 *			using the buffer (queueing to the driver)
> +	 * fence_cb:		fence callback information
> +	 * fence_cb_lock:	protect callback signal/remove
>  	 */
>  	enum vb2_buffer_state	state;
>  
>  	struct list_head	queued_entry;
>  	struct list_head	done_entry;
> +
> +	struct dma_fence	*in_fence;
> +	struct dma_fence_cb	fence_cb;
> +	spinlock_t              fence_cb_lock;
> +
>  #ifdef CONFIG_VIDEO_ADV_DEBUG
>  	/*
>  	 * Counters for how often these buffer-related ops are
> @@ -504,6 +514,7 @@ struct vb2_buf_ops {
>   * @last_buffer_dequeued: used in poll() and DQBUF to immediately return if the
>   *		last decoded buffer was already dequeued. Set for capture queues
>   *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
> + * @last_fence:	last in-fence received. Used to keep ordering.
>   * @fileio:	file io emulator internal data, used only if emulator is active
>   * @threadio:	thread io internal data, used only if thread is active
>   */
> @@ -558,6 +569,8 @@ struct vb2_queue {
>  	unsigned int			copy_timestamp:1;
>  	unsigned int			last_buffer_dequeued:1;
>  
> +	struct dma_fence		*last_fence;
> +
>  	struct vb2_fileio_data		*fileio;
>  	struct vb2_threadio_data	*threadio;
>  
> @@ -733,6 +746,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
>   * @index:	id number of the buffer
>   * @pb:		buffer structure passed from userspace to vidioc_qbuf handler
>   *		in driver
> + * @fence:	in-fence to wait on before queueing the buffer
>   *
>   * Should be called from vidioc_qbuf ioctl handler of a driver.
>   * The passed buffer should have been verified.
> @@ -747,7 +761,8 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
>   * The return values from this function are intended to be directly returned
>   * from vidioc_qbuf handler in driver.
>   */
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence);
>  
>  /**
>   * vb2_core_dqbuf() - Dequeue a buffer to the userspace


-- 
Thanks,
Mauro


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17  6:49   ` Alexandre Courbot
@ 2017-11-17 13:00     ` Mauro Carvalho Chehab
  2017-11-17 13:08       ` Gustavo Padovan
  2017-11-17 13:01     ` Gustavo Padovan
  1 sibling, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 13:00 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Gustavo Padovan, linux-media, Hans Verkuil, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Fri, 17 Nov 2017 15:49:23 +0900
Alexandre Courbot <acourbot@chromium.org> escreveu:

> > @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct 
> > vb2_queue *q, struct v4l2_buffer *b,
> >  		return -EINVAL;
> >  	}
> >  
> > +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&  
> 
> Why do we need to consider both values invalid? Can 0 ever be a valid fence 
> fd?

Programs that don't use fences will initialize the reserved2/fence_fd
field to zero in the uAPI call.

So, I guess using fd=0 here could be a problem. Anyway, I would, instead,
do:

	if ((b->fence_fd < 1) &&
		...

as other negative values are likely invalid as well.
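
Put together, the check could look like the sketch below. It is one
possible reading, not the patch: the struct and function names are
made up, and rejecting IN_FENCE combined with an fd below 1 is an
assumption (in the patch that case would only fail later, when the
fence lookup fails). The flag value is the one from patch 06/11.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BUF_FLAG_IN_FENCE 0x00200000 /* value from patch 06/11 */

/* Stand-in for the two struct v4l2_buffer fields involved. */
struct sketch_qbuf_args {
	int32_t  fence_fd;
	uint32_t flags;
};

/* Returns 0 when fence_fd and the IN_FENCE flag agree, -EINVAL if not. */
int sketch_check_fence_fd(const struct sketch_qbuf_args *b)
{
	bool in_fence;

	assert(b);
	in_fence = (b->flags & SKETCH_BUF_FLAG_IN_FENCE) != 0;

	/* Anything below 1 is treated as "no fence supplied". */
	if (b->fence_fd < 1)
		return in_fence ? -EINVAL : 0;

	/* A real fd only makes sense together with the IN_FENCE flag. */
	return in_fence ? 0 : -EINVAL;
}
```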

-- 
Thanks,
Mauro


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17  6:49   ` Alexandre Courbot
  2017-11-17 13:00     ` Mauro Carvalho Chehab
@ 2017-11-17 13:01     ` Gustavo Padovan
  2017-11-20  2:53       ` Alexandre Courbot
  1 sibling, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 13:01 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Alexandre Courbot <acourbot@chromium.org>:

> Hi Gustavo,
> 
> I am coming a bit late in this series' review, so apologies if some of my
> comments have already been discussed in an earlier revision.
> 
> On Thursday, November 16, 2017 2:10:53 AM JST, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > Receive in-fences from userspace and add support for waiting on them
> > before queueing the buffer to the driver. A buffer can't be queued to the
> > driver before its fence signals, and it can't be queued to the driver
> > out of the order in which it was queued from userspace. That means that
> > even if its fence signals, it must wait for all other buffers ahead of it
> > in the queue to signal first.
> > 
> > To make that possible we use fence_array to keep that ordering. Basically
> > we create a fence_array that contains both the current fence and the fence
> > from the previous buffer (which might be a fence array as well). The base
> > fence class for the fence_array becomes the new buffer fence, waiting on
> > that one guarantees that it won't be queued out of order.
> > 
> > v6:
> > 	- With fences always keep the order userspace queues the buffers.
> > 	- Protect in_fence manipulation with a lock (Brian Starkey)
> > 	- check if fences have the same context before adding a fence array
> > 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> > 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> > 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> > 
> > v5:	- use fence_array to keep buffers ordered in vb2 core when
> > 	needed (Brian Starkey)
> > 	- keep backward compat on the reserved2 field (Brian Starkey)
> > 	- protect fence callback removal with lock (Brian Starkey)
> > 
> > v4:
> > 	- Add a comment about dma_fence_add_callback() not returning a
> > 	error (Hans)
> > 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> > 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> > 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> > 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> > 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> > 	vb2_start_streaming() (Hans)
> > 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> > 	- Queue buffers to the driver as soon as they are ready (Hans)
> > 	- call fill_user_buffer() after queuing the buffer (Hans)
> > 	- add err: label to clean up fence
> > 	- add dma_fence_wait() before calling vb2_start_streaming()
> > 
> > v3:	- document fence parameter
> > 	- remove ternary if at vb2_qbuf() return (Mauro)
> > 	- do not change if conditions behaviour (Mauro)
> > 
> > v2:
> > 	- fix vb2_queue_or_prepare_buf() ret check
> > 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> > 	- check num of ready buffers to start streaming
> > 	- when queueing, start from the first ready buffer
> > 	- handle queue cancel
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  drivers/media/v4l2-core/Kconfig          |   1 +
> >  drivers/media/v4l2-core/videobuf2-core.c | 202
> > ++++++++++++++++++++++++++++---
> >  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
> >  include/media/videobuf2-core.h           |  17 ++-
> >  4 files changed, 231 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/media/v4l2-core/Kconfig
> > b/drivers/media/v4l2-core/Kconfig
> > index a35c33686abf..3f988c407c80 100644
> > --- a/drivers/media/v4l2-core/Kconfig
> > +++ b/drivers/media/v4l2-core/Kconfig
> > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> >  # Used by drivers that need Videobuf2 modules
> >  config VIDEOBUF2_CORE
> >  	select DMA_SHARED_BUFFER
> > +	select SYNC_FILE
> >  	tristate
> >  config VIDEOBUF2_MEMOPS
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c
> > b/drivers/media/v4l2-core/videobuf2-core.c
> > index 60f8b582396a..26de4c80717d 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/sched.h>
> >  #include <linux/freezer.h>
> >  #include <linux/kthread.h>
> > +#include <linux/dma-fence-array.h>
> >  #include <media/videobuf2-core.h>
> >  #include <media/v4l2-mc.h>
> > @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q,
> > enum vb2_memory memory,
> >  		vb->index = q->num_buffers + buffer;
> >  		vb->type = q->type;
> >  		vb->memory = memory;
> > +		spin_lock_init(&vb->fence_cb_lock);
> >  		for (plane = 0; plane < num_planes; ++plane) {
> >  			vb->planes[plane].length = plane_sizes[plane];
> >  			vb->planes[plane].min_length = plane_sizes[plane];
> > @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
> >  {
> >  	struct vb2_queue *q = vb->vb2_queue;
> > +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> > +		return;
> > +
> >  	vb->state = VB2_BUF_STATE_ACTIVE;
> >  	atomic_inc(&q->owned_by_drv_count);
> >   @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb,
> > const void *pb)
> >  	return 0;
> >  }
> > +static int __get_num_ready_buffers(struct vb2_queue *q)
> > +{
> > +	struct vb2_buffer *vb;
> > +	int ready_count = 0;
> > +	unsigned long flags;
> > +
> > +	/* count num of buffers ready in front of the queued_list */
> > +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> > +			ready_count++;
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +	}
> > +
> > +	return ready_count;
> > +}
> > +
> >  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
> >  {
> >  	struct vb2_buffer *vb;
> > @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
> >  	return ret;
> >  }
> > -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> > +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> > +					struct vb2_buffer *vb,
> > +					struct dma_fence *fence)
> > +{
> > +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> > +		dma_fence_put(q->last_fence);
> > +		q->last_fence = NULL;
> > +	}
> > +
> > +	/*
> > +	 * We always guarantee the ordering of buffers queued from
> > +	 * userspace to be the same it is queued to the driver. For that
> > +	 * we create a fence array with the fence from the last queued
> > +	 * buffer and this one, that way the fence for this buffer can't
> > +	 * signal before the last one.
> > +	 */
> > +	if (fence && q->last_fence) {
> > +		struct dma_fence **fences;
> > +		struct dma_fence_array *arr;
> > +
> > +		if (fence->context == q->last_fence->context) {
> > +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> > +				dma_fence_put(q->last_fence);
> > +				q->last_fence = dma_fence_get(fence);
> > +			} else {
> > +				dma_fence_put(fence);
> > +				fence = dma_fence_get(q->last_fence);
> > +			}
> > +			return fence;
> > +		}
> > +
> > +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> > +		if (!fences)
> > +			return ERR_PTR(-ENOMEM);
> > +
> > +		fences[0] = fence;
> > +		fences[1] = q->last_fence;
> > +
> > +		arr = dma_fence_array_create(2, fences,
> > +					     dma_fence_context_alloc(1),
> > +					     1, false);
> > +		if (!arr) {
> > +			kfree(fences);
> > +			return ERR_PTR(-ENOMEM);
> > +		}
> > +
> > +		fence = &arr->base;
> > +
> > +		q->last_fence = dma_fence_get(fence);
> > +	} else if (!fence && q->last_fence) {
> > +		fence = dma_fence_get(q->last_fence);
> > +	}
> > +
> > +	return fence;
> > +}
> > +
> > +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> > +{
> > +	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
> > +	struct vb2_queue *q = vb->vb2_queue;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +	if (!vb->in_fence) {
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +		return;
> > +	}
> 
> Is this block necessary? IIUC the callback will never be set on buffers
> without
> an input fence, so (!vb->in_fence) should never be satisfied.

Not anymore! I added it when trying to fix the potential race condition
in the wrong way, but the newly added spinlock fixes that.

> 
> > +
> > +	dma_fence_put(vb->in_fence);
> > +	vb->in_fence = NULL;
> > +
> > +	if (q->start_streaming_called)
> > +		__enqueue_in_driver(vb);
> > +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +}
> > +
> > +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> > +		  struct dma_fence *fence)
> >  {
> >  	struct vb2_buffer *vb;
> > +	unsigned long flags;
> >  	int ret;
> >  	vb = q->bufs[index];
> > @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned
> > int index, void *pb)
> >  	case VB2_BUF_STATE_DEQUEUED:
> >  		ret = __buf_prepare(vb, pb);
> >  		if (ret)
> > -			return ret;
> > +			goto err;
> >  		break;
> >  	case VB2_BUF_STATE_PREPARED:
> >  		break;
> >  	case VB2_BUF_STATE_PREPARING:
> >  		dprintk(1, "buffer still being prepared\n");
> > -		return -EINVAL;
> > +		ret = -EINVAL;
> > +		goto err;
> >  	default:
> >  		dprintk(1, "invalid buffer state %d\n", vb->state);
> > -		return -EINVAL;
> > +		ret = -EINVAL;
> > +		goto err;
> >  	}
> >  	/*
> > @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned
> > int index, void *pb)
> >  	trace_vb2_qbuf(q, vb);
> > +	vb->in_fence = __set_in_fence(q, vb, fence);
> > +	if (IS_ERR(vb->in_fence)) {
> > +		dma_fence_put(fence);
> > +		ret = PTR_ERR(vb->in_fence);
> > +		goto err;
> > +	}
> > +	fence = NULL;
> > +
> > +	/*
> > +	 * If it is time to call vb2_start_streaming() wait for the fence
> > +	 * to signal first. Of course, this happens only once per streaming.
> > +	 * We want to run any step that might fail before we set the callback
> > +	 * to queue the fence when it signals.
> > +	 */
> > +	if (vb->in_fence && !q->start_streaming_called &&
> > +	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
> > +		dma_fence_wait(vb->in_fence, true);
> 
> Mmm, that's a tough call. Userspace may unexpectingly block due to this
> (which
> fences are supposed to prevent), not sure how much of a problem this may be.

Yes, but it happens just once per stream, when we are about to start it.
Alternatively, we can wait later, and if the fence wait returns an error
we can carry it until we are able to send the buffer back with an error
on DQBUF.

> 
> > +
> >  	/*
> >  	 * If streamon has been called, and we haven't yet called
> >  	 * start_streaming() since not enough buffers were queued, and
> >  	 * we now have reached the minimum number of queued buffers,
> >  	 * then we can finally call start_streaming().
> > -	 *
> > -	 * If already streaming, give the buffer to driver for processing.
> > -	 * If not, the buffer will be given to driver on next streamon.
> >  	 */
> >  	if (q->streaming && !q->start_streaming_called &&
> > -	    q->queued_count >= q->min_buffers_needed) {
> > +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {
> >  		ret = vb2_start_streaming(q);
> >  		if (ret)
> > -			return ret;
> > -	} else if (q->start_streaming_called) {
> > -		__enqueue_in_driver(vb);
> > +			goto err;
> >  	}
> > +	/*
> > +	 * For explicit synchronization: If the fence didn't signal
> > +	 * yet we setup a callback to queue the buffer once the fence
> > +	 * signals, and then, return successfully. But if the fence
> > +	 * already signaled we lose the reference we held and queue the
> > +	 * buffer to the driver.
> > +	 */
> > +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +	if (vb->in_fence) {
> > +		ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
> > +					     vb2_qbuf_fence_cb);
> > +		if (ret == -EINVAL) {
> > +			spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +			goto err;
> > +		} else if (!ret) {
> > +			goto fill;
> > +		}
> > +
> > +		dma_fence_put(vb->in_fence);
> > +		vb->in_fence = NULL;
> 
> I suppose the last two lines are supposed to be called when
> dma_fence_add_callback() returns -ENOENT, because the fence has already been
> signaled? In that case I think the code would be more readable if you made
> it
> more explicit, something like:
> 
> // fence already signaled?
> if (ret == -ENOENT) {
> 	dma_fence_put(vb->in_fence);
> 	vb->in_fence = NULL;
> } else if (ret != 0) {
> 	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> 	goto err;
> }
> 
> and that way you can get rid of the short-ranged fill: label.

Yes, that is a better way to write it!
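
To make the suggested structure concrete, here is a minimal userspace sketch of how the return value of dma_fence_add_callback() could be sorted into the three cases discussed above. The enum and the function name are made up for illustration and are not part of the patch; the only thing taken from the kernel API is that 0 means the callback was installed and -ENOENT means the fence had already signaled.

```c
#include <errno.h>

/* Hypothetical outcome names, used only for this illustration. */
enum cb_outcome {
	CB_PENDING,		/* callback installed; buffer waits for the fence */
	CB_ALREADY_SIGNALED,	/* fence done; drop the reference and enqueue now */
	CB_ERROR,		/* any other failure; unwind through the err path */
};

/*
 * Map the return value of dma_fence_add_callback() onto what QBUF
 * should do next. -ENOENT is the "fence already signaled" case.
 */
static enum cb_outcome classify_add_callback(int ret)
{
	if (ret == 0)
		return CB_PENDING;
	if (ret == -ENOENT)
		return CB_ALREADY_SIGNALED;
	return CB_ERROR;
}
```

Written this way, only the CB_ERROR branch needs to unlock and jump to err, which is what lets the fill: label go away.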

> 
> > +	}
> > +
> > +fill:
> > +	/*
> > +	 * If already streaming and there is no fence to wait on
> > +	 * give the buffer to driver for processing.
> > +	 */
> > +	if (q->start_streaming_called && !vb->in_fence)
> > +		__enqueue_in_driver(vb);
> 
> Since that code was previously called when vb2_start_streaming() wasn't, we
> had a guarantee that the buffer was only queued once. Now I wonder if we
> could
> not run into a situation where __enqueue_in_driver is called twice for the
> last
> queued buffer, once by vb2_start_streaming() and once here, now that
> vb2_start_streaming() has set q->start_streaming_called to 1?

Maybe checking for buffer state == QUEUED will suffice; I'll investigate
it further.
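
A rough userspace model of that idea, assuming the state check is enough to make the enqueue idempotent. The struct and function names below are simplified stand-ins, not the vb2 ones, and the real fix may well look different once investigated.

```c
#include <stdbool.h>

/* Simplified stand-ins for the vb2 buffer states; names are illustrative. */
enum buf_state { BUF_STATE_QUEUED, BUF_STATE_ACTIVE };

struct mock_buf {
	enum buf_state state;
};

struct mock_queue {
	int owned_by_drv_count;	/* mirrors q->owned_by_drv_count */
};

/* Hand the buffer to the driver only if it has not been handed over yet. */
static bool enqueue_in_driver_once(struct mock_queue *q, struct mock_buf *vb)
{
	if (vb->state != BUF_STATE_QUEUED)
		return false;	/* already given to the driver: do nothing */

	vb->state = BUF_STATE_ACTIVE;
	q->owned_by_drv_count++;
	return true;
}
```

With such a guard, a second call for the same buffer (once from the start-streaming path and once from QBUF) would be a no-op instead of a double enqueue.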

> 
> > +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +
> >  	/* Fill buffer information for the userspace */
> >  	if (pb)
> >  		call_void_bufop(q, fill_user_buffer, vb, pb);
> >  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
> >  	return 0;
> > +
> > +err:
> > +	if (vb->in_fence) {
> > +		dma_fence_put(vb->in_fence);
> > +		vb->in_fence = NULL;
> > +	}
> > +
> > +	return ret;
> > +
> >  }
> >  EXPORT_SYMBOL_GPL(vb2_core_qbuf);
> > @@ -1632,6 +1787,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
> >  static void __vb2_queue_cancel(struct vb2_queue *q)
> >  {
> >  	unsigned int i;
> > +	struct vb2_buffer *vb;
> > +	unsigned long flags;
> >  	/*
> >  	 * Tell driver to stop all transactions and release all queued
> > @@ -1659,6 +1816,21 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
> >  	q->queued_count = 0;
> >  	q->error = 0;
> > +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +		if (vb->in_fence) {
> > +			dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
> > +			dma_fence_put(vb->in_fence);
> > +			vb->in_fence = NULL;
> > +		}
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +	}
> > +
> > +	if (q->last_fence) {
> > +		dma_fence_put(q->last_fence);
> > +		q->last_fence = NULL;
> > +	}
> > +
> >  	/*
> >  	 * Remove all buffers from videobuf's list...
> >  	 */
> > @@ -1720,7 +1892,7 @@ int vb2_core_streamon(struct vb2_queue *q,
> > unsigned int type)
> >  	 * Tell driver to start streaming provided sufficient buffers
> >  	 * are available.
> >  	 */
> > -	if (q->queued_count >= q->min_buffers_needed) {
> > +	if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
> >  		ret = v4l_vb2q_enable_media_source(q);
> >  		if (ret)
> >  			return ret;
> > @@ -2240,7 +2412,7 @@ static int __vb2_init_fileio(struct vb2_queue *q,
> > int read)
> >  		 * Queue all buffers.
> >  		 */
> >  		for (i = 0; i < q->num_buffers; i++) {
> > -			ret = vb2_core_qbuf(q, i, NULL);
> > +			ret = vb2_core_qbuf(q, i, NULL, NULL);
> >  			if (ret)
> >  				goto err_reqbufs;
> >  			fileio->bufs[i].queued = 1;
> > @@ -2419,7 +2591,7 @@ static size_t __vb2_perform_fileio(struct
> > vb2_queue *q, char __user *data, size_
> >  		if (copy_timestamp)
> >  			b->timestamp = ktime_get_ns();
> > -		ret = vb2_core_qbuf(q, index, NULL);
> > +		ret = vb2_core_qbuf(q, index, NULL, NULL);
> >  		dprintk(5, "vb2_dbuf result: %d\n", ret);
> >  		if (ret)
> >  			return ret;
> > @@ -2522,7 +2694,7 @@ static int vb2_thread(void *data)
> >  		if (copy_timestamp)
> >  			vb->timestamp = ktime_get_ns();;
> >  		if (!threadio->stop)
> > -			ret = vb2_core_qbuf(q, vb->index, NULL);
> > +			ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
> >  		call_void_qop(q, wait_prepare, q);
> >  		if (ret || threadio->stop)
> >  			break;
> > diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c
> > b/drivers/media/v4l2-core/videobuf2-v4l2.c
> > index 110fb45fef6f..4c09ea007d90 100644
> > --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> > +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/sched.h>
> >  #include <linux/freezer.h>
> >  #include <linux/kthread.h>
> > +#include <linux/sync_file.h>
> >  #include <media/v4l2-dev.h>
> >  #include <media/v4l2-fh.h>
> > @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct
> > vb2_queue *q, struct v4l2_buffer *b,
> >  		return -EINVAL;
> >  	}
> > +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&
> 
> Why do we need to consider both values invalid? Can 0 ever be a valid fence
> fd?

For backward compatibility we need to exclude 0.

> 
> For completeness you may also want to check the case where (b->fence_fd ==
> -1 &&
> (b->flags & V4L2_BUF_FLAG_IN_FENCE)) and return an error ahead of time, but
> I
> suppose the call to sync_file_get_fence() later on will catch this as well.

Yes, I think we can add this check too.

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:00     ` Mauro Carvalho Chehab
@ 2017-11-17 13:08       ` Gustavo Padovan
  2017-11-17 13:19         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 13:08 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Alexandre Courbot, linux-media, Hans Verkuil, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:

> Em Fri, 17 Nov 2017 15:49:23 +0900
> Alexandre Courbot <acourbot@chromium.org> escreveu:
> 
> > > @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct 
> > > vb2_queue *q, struct v4l2_buffer *b,
> > >  		return -EINVAL;
> > >  	}
> > >  
> > > +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&  
> > 
> > Why do we need to consider both values invalid? Can 0 ever be a valid fence 
> > fd?
> 
> Programs that don't use fences will initialize reserved2/fence_fd field
> at the uAPI call to zero.
> 
> So, I guess using fd=0 here could be a problem. Anyway, I would, instead,
> do:
> 
> 	if ((b->fence_fd < 1) &&
> 		...
> 
> as other negative values are likely invalid as well.

We are checking for the case where fence_fd is set but the flag isn't.
Checking for < 1 is exactly the opposite, so we either keep it as is or
check for fence_fd > 0.
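
For reference, the fence_fd > 0 variant could look roughly like the standalone sketch below. The function name is invented for illustration; the flag value is the one from the documentation hunk in patch 06/11, and the -EINVAL return is an assumption based on the surrounding checks in vb2_queue_or_prepare_buf().

```c
#include <errno.h>
#include <stdint.h>

/* Value taken from the documentation hunk in patch 06/11. */
#define V4L2_BUF_FLAG_IN_FENCE 0x00200000u

/*
 * Reject a buffer that carries a fence fd without announcing it.
 * 0 is treated as "unset" because applications that predate fences
 * zero the reserved2 field; negative values never trigger the check.
 */
static int check_in_fence_args(int32_t fence_fd, uint32_t flags)
{
	if (fence_fd > 0 && !(flags & V4L2_BUF_FLAG_IN_FENCE))
		return -EINVAL;
	return 0;
}
```

Note that this only covers the "fd set, flag missing" direction; the reverse case (flag set with fence_fd == -1) would need its own check.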

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 12:53   ` Mauro Carvalho Chehab
@ 2017-11-17 13:12     ` Gustavo Padovan
  2017-11-17 13:47       ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 13:12 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:

> Em Wed, 15 Nov 2017 15:10:53 -0200
> Gustavo Padovan <gustavo@padovan.org> escreveu:
> 
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > Receive in-fence from userspace and add support for waiting on them
> > before queueing the buffer to the driver. Buffers can't be queued to the
> > driver before its fences signal. And a buffer can't be queue to the driver
> > out of the order they were queued from userspace. That means that even if
> > it fence signal it must wait all other buffers, ahead of it in the queue,
> > to signal first.
> > 
> > To make that possible we use fence_array to keep that ordering. Basically
> > we create a fence_array that contains both the current fence and the fence
> > from the previous buffer (which might be a fence array as well). The base
> > fence class for the fence_array becomes the new buffer fence, waiting on
> > that one guarantees that it won't be queued out of order.
> > 
> > v6:
> > 	- With fences always keep the order userspace queues the buffers.
> > 	- Protect in_fence manipulation with a lock (Brian Starkey)
> > 	- check if fences have the same context before adding a fence array
> > 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> > 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> > 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> > 
> > v5:	- use fence_array to keep buffers ordered in vb2 core when
> > 	needed (Brian Starkey)
> > 	- keep backward compat on the reserved2 field (Brian Starkey)
> > 	- protect fence callback removal with lock (Brian Starkey)
> > 
> > v4:
> > 	- Add a comment about dma_fence_add_callback() not returning a
> > 	error (Hans)
> > 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> > 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> > 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> > 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> > 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> > 	vb2_start_streaming() (Hans)
> > 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> > 	- Queue buffers to the driver as soon as they are ready (Hans)
> > 	- call fill_user_buffer() after queuing the buffer (Hans)
> > 	- add err: label to clean up fence
> > 	- add dma_fence_wait() before calling vb2_start_streaming()
> > 
> > v3:	- document fence parameter
> > 	- remove ternary if at vb2_qbuf() return (Mauro)
> > 	- do not change if conditions behaviour (Mauro)
> > 
> > v2:
> > 	- fix vb2_queue_or_prepare_buf() ret check
> > 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> > 	- check num of ready buffers to start streaming
> > 	- when queueing, start from the first ready buffer
> > 	- handle queue cancel
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  drivers/media/v4l2-core/Kconfig          |   1 +
> >  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
> >  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
> >  include/media/videobuf2-core.h           |  17 ++-
> >  4 files changed, 231 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> > index a35c33686abf..3f988c407c80 100644
> > --- a/drivers/media/v4l2-core/Kconfig
> > +++ b/drivers/media/v4l2-core/Kconfig
> > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> >  # Used by drivers that need Videobuf2 modules
> >  config VIDEOBUF2_CORE
> >  	select DMA_SHARED_BUFFER
> > +	select SYNC_FILE
> >  	tristate
> >  
> >  config VIDEOBUF2_MEMOPS
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> > index 60f8b582396a..26de4c80717d 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/sched.h>
> >  #include <linux/freezer.h>
> >  #include <linux/kthread.h>
> > +#include <linux/dma-fence-array.h>
> >  
> >  #include <media/videobuf2-core.h>
> >  #include <media/v4l2-mc.h>
> > @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
> >  		vb->index = q->num_buffers + buffer;
> >  		vb->type = q->type;
> >  		vb->memory = memory;
> > +		spin_lock_init(&vb->fence_cb_lock);
> >  		for (plane = 0; plane < num_planes; ++plane) {
> >  			vb->planes[plane].length = plane_sizes[plane];
> >  			vb->planes[plane].min_length = plane_sizes[plane];
> > @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
> >  {
> >  	struct vb2_queue *q = vb->vb2_queue;
> >  
> > +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> > +		return;
> > +
> >  	vb->state = VB2_BUF_STATE_ACTIVE;
> >  	atomic_inc(&q->owned_by_drv_count);
> >  
> > @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
> >  	return 0;
> >  }
> >  
> > +static int __get_num_ready_buffers(struct vb2_queue *q)
> > +{
> > +	struct vb2_buffer *vb;
> > +	int ready_count = 0;
> > +	unsigned long flags;
> > +
> > +	/* count num of buffers ready in front of the queued_list */
> > +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> > +			ready_count++;
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +	}
> > +
> > +	return ready_count;
> > +}
> > +
> >  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
> >  {
> >  	struct vb2_buffer *vb;
> > @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
> >  	return ret;
> >  }
> >  
> > -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> > +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> > +					struct vb2_buffer *vb,
> > +					struct dma_fence *fence)
> > +{
> > +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> > +		dma_fence_put(q->last_fence);
> > +		q->last_fence = NULL;
> > +	}
> > +
> > +	/*
> > +	 * We always guarantee the ordering of buffers queued from
> > +	 * userspace to be the same it is queued to the driver. For that
> > +	 * we create a fence array with the fence from the last queued
> > +	 * buffer and this one, that way the fence for this buffer can't
> > +	 * signal before the last one.
> > +	 */
> > +	if (fence && q->last_fence) {
> > +		struct dma_fence **fences;
> > +		struct dma_fence_array *arr;
> > +
> > +		if (fence->context == q->last_fence->context) {
> > +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> > +				dma_fence_put(q->last_fence);
> > +				q->last_fence = dma_fence_get(fence);
> > +			} else {
> > +				dma_fence_put(fence);
> > +				fence = dma_fence_get(q->last_fence);
> > +			}
> > +			return fence;
> > +		}
> > +
> > +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> > +		if (!fences)
> > +			return ERR_PTR(-ENOMEM);
> > +
> > +		fences[0] = fence;
> > +		fences[1] = q->last_fence;
> > +
> > +		arr = dma_fence_array_create(2, fences,
> > +					     dma_fence_context_alloc(1),
> > +					     1, false);
> > +		if (!arr) {
> > +			kfree(fences);
> > +			return ERR_PTR(-ENOMEM);
> > +		}
> > +
> > +		fence = &arr->base;
> > +
> > +		q->last_fence = dma_fence_get(fence);
> > +	} else if (!fence && q->last_fence) {
> > +		fence = dma_fence_get(q->last_fence);
> > +	}
> > +
> > +	return fence;
> > +}
> > +
> > +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> > +{
> > +	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
> > +	struct vb2_queue *q = vb->vb2_queue;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +	if (!vb->in_fence) {
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +		return;
> > +	}
> > +
> > +	dma_fence_put(vb->in_fence);
> > +	vb->in_fence = NULL;
> > +
> > +	if (q->start_streaming_called)
> > +		__enqueue_in_driver(vb);
> > +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +}
> > +
> > +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> > +		  struct dma_fence *fence)
> >  {
> >  	struct vb2_buffer *vb;
> > +	unsigned long flags;
> >  	int ret;
> >  
> >  	vb = q->bufs[index];
> > @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> >  	case VB2_BUF_STATE_DEQUEUED:
> >  		ret = __buf_prepare(vb, pb);
> >  		if (ret)
> > -			return ret;
> > +			goto err;
> >  		break;
> >  	case VB2_BUF_STATE_PREPARED:
> >  		break;
> >  	case VB2_BUF_STATE_PREPARING:
> >  		dprintk(1, "buffer still being prepared\n");
> > -		return -EINVAL;
> > +		ret = -EINVAL;
> > +		goto err;
> >  	default:
> >  		dprintk(1, "invalid buffer state %d\n", vb->state);
> > -		return -EINVAL;
> > +		ret = -EINVAL;
> > +		goto err;
> >  	}
> >  
> >  	/*
> > @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> >  
> >  	trace_vb2_qbuf(q, vb);
> >  
> > +	vb->in_fence = __set_in_fence(q, vb, fence);
> > +	if (IS_ERR(vb->in_fence)) {
> > +		dma_fence_put(fence);
> > +		ret = PTR_ERR(vb->in_fence);
> > +		goto err;
> > +	}
> > +	fence = NULL;
> > +
> > +	/*
> > +	 * If it is time to call vb2_start_streaming() wait for the fence
> > +	 * to signal first. Of course, this happens only once per streaming.
> > +	 * We want to run any step that might fail before we set the callback
> > +	 * to queue the fence when it signals.
> > +	 */
> > +	if (vb->in_fence && !q->start_streaming_called &&
> > +	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
> > +		dma_fence_wait(vb->in_fence, true);
> > +
> 
> The above code sounds weird: it is called even if the user is not
> using fences.
> 
> You should probably be doing something like:
> 
> 	if (fence || q->last_fence) {
> 		vb->in_fence = __set_in_fence(q, vb, fence);
> 		...
> 	}
> 

Sure.

> >  	/*
> >  	 * If streamon has been called, and we haven't yet called
> >  	 * start_streaming() since not enough buffers were queued, and
> >  	 * we now have reached the minimum number of queued buffers,
> >  	 * then we can finally call start_streaming().
> > -	 *
> > -	 * If already streaming, give the buffer to driver for processing.
> > -	 * If not, the buffer will be given to driver on next streamon.
> >  	 */
> >  	if (q->streaming && !q->start_streaming_called &&
> > -	    q->queued_count >= q->min_buffers_needed) {
> > +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {
> 
> I guess the case where fences is not used is not covered here.
> 
> You probably should add a check at __get_num_ready_buffers(q)
> as well, making it just return q->queued_count if fences isn't
> used.

We can't know that beforehand, as some buffer ahead in the queue may have
a fence. But there is already a !vb->in_fence check for each buffer, so if
none of them has a fence the return value will be equal to q->queued_count.
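
To illustrate the point, here is a small userspace model of the counting loop. It assumes the same readiness rule as __get_num_ready_buffers() (no fence, or fence already signaled) and uses a plain array with made-up field names instead of the queued_list and the per-buffer lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a queued buffer; field names are illustrative. */
struct ready_buf {
	bool has_in_fence;	/* models vb->in_fence != NULL */
	bool fence_signaled;	/* models dma_fence_is_signaled(vb->in_fence) */
};

/* Count buffers that have no fence or whose fence has signaled. */
static int num_ready_buffers(const struct ready_buf *bufs, size_t n)
{
	int ready = 0;

	for (size_t i = 0; i < n; i++) {
		if (!bufs[i].has_in_fence || bufs[i].fence_signaled)
			ready++;
	}
	return ready;
}
```

When no buffer in the array has a fence, the result is simply n, which is the q->queued_count behaviour of the non-fence case.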

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:08       ` Gustavo Padovan
@ 2017-11-17 13:19         ` Mauro Carvalho Chehab
  2017-11-20 11:41           ` Brian Starkey
  0 siblings, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 13:19 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: Alexandre Courbot, linux-media, Hans Verkuil, Shuah Khan,
	Pawel Osciak, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Fri, 17 Nov 2017 11:08:01 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> 2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
> 
> > Em Fri, 17 Nov 2017 15:49:23 +0900
> > Alexandre Courbot <acourbot@chromium.org> escreveu:
> >   
> > > > @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct 
> > > > vb2_queue *q, struct v4l2_buffer *b,
> > > >  		return -EINVAL;
> > > >  	}
> > > >  
> > > > +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&    
> > > 
> > > Why do we need to consider both values invalid? Can 0 ever be a valid fence 
> > > fd?  
> > 
> > Programs that don't use fences will initialize reserved2/fence_fd field
> > at the uAPI call to zero.
> > 
> > So, I guess using fd=0 here could be a problem. Anyway, I would, instead,
> > do:
> > 
> > 	if ((b->fence_fd < 1) &&
> > 		...
> > 
> > as other negative values are likely invalid as well.  
> 
> We are checking when the fence_fd is set but the flag wasn't. Checking
> for < 1 is exactly the opposite. so we keep as is or do it fence_fd > 0.

Ah, yes. Anyway, I would stick with:
	if ((b->fence_fd > 0) &&
		...

> 
> Gustavo


-- 
Thanks,
Mauro


* Re: [RFC v5 06/11] [media] vb2: add explicit fence user API
  2017-11-15 17:10 ` [RFC v5 06/11] [media] vb2: add explicit fence user API Gustavo Padovan
  2017-11-17 12:25   ` Mauro Carvalho Chehab
@ 2017-11-17 13:29   ` Hans Verkuil
  2017-11-17 13:53     ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 48+ messages in thread
From: Hans Verkuil @ 2017-11-17 13:29 UTC (permalink / raw)
  To: Gustavo Padovan, linux-media
  Cc: Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On 15/11/17 18:10, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> Turn the reserved2 field into fence_fd that we will use to send
> an in-fence to the kernel and return an out-fence from the kernel to
> userspace.
> 
> Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
> when sending a fence to the kernel to be waited on, and
> V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
> 
> v4:
> 	- make it a union with reserved2 and fence_fd (Hans Verkuil)
> 
> v3:
> 	- make the out_fence refer to the current buffer (Hans Verkuil)
> 
> v2: add documentation
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  Documentation/media/uapi/v4l/buffer.rst       | 15 +++++++++++++++
>  drivers/media/usb/cpia2/cpia2_v4l.c           |  2 +-
>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  4 ++--
>  drivers/media/v4l2-core/videobuf2-v4l2.c      |  2 +-
>  include/uapi/linux/videodev2.h                |  7 ++++++-
>  5 files changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
> index ae6ee73f151c..eeefbd2547e7 100644
> --- a/Documentation/media/uapi/v4l/buffer.rst
> +++ b/Documentation/media/uapi/v4l/buffer.rst
> @@ -648,6 +648,21 @@ Buffer Flags
>        - Start Of Exposure. The buffer timestamp has been taken when the
>  	exposure of the frame has begun. This is only valid for the
>  	``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
> +    * .. _`V4L2-BUF-FLAG-IN-FENCE`:
> +
> +      - ``V4L2_BUF_FLAG_IN_FENCE``
> +      - 0x00200000
> +      - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
> +	won't be queued to the driver until the fence signals.
> +
> +    * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
> +
> +      - ``V4L2_BUF_FLAG_OUT_FENCE``
> +      - 0x00400000
> +      - Request a fence to be attached to the buffer. The ``fence_fd``
> +	field on
> +	:ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
> +	fd to userspace.

How would userspace know if fences are not supported? E.g. any driver that does
not use vb2 will have no support for it.

While the driver could clear the flag on return, the problem is that it is a bit
late for applications to discover lack of fence support.

Perhaps we do need a capability flag for this? I wonder what others think.

Regards,

	Hans

>  
>  
>  
> diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
> index 3dedd83f0b19..6cde686bf44c 100644
> --- a/drivers/media/usb/cpia2/cpia2_v4l.c
> +++ b/drivers/media/usb/cpia2/cpia2_v4l.c
> @@ -948,7 +948,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
>  	buf->sequence = cam->buffers[buf->index].seq;
>  	buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
>  	buf->length = cam->frame_size;
> -	buf->reserved2 = 0;
> +	buf->fence_fd = -1;
>  	buf->reserved = 0;
>  	memset(&buf->timecode, 0, sizeof(buf->timecode));
>  
> diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> index 821f2aa299ae..3a31d318df2a 100644
> --- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
> @@ -370,7 +370,7 @@ struct v4l2_buffer32 {
>  		__s32		fd;
>  	} m;
>  	__u32			length;
> -	__u32			reserved2;
> +	__s32			fence_fd;
>  	__u32			reserved;
>  };
>  
> @@ -533,7 +533,7 @@ static int put_v4l2_buffer32(struct v4l2_buffer *kp, struct v4l2_buffer32 __user
>  		put_user(kp->timestamp.tv_usec, &up->timestamp.tv_usec) ||
>  		copy_to_user(&up->timecode, &kp->timecode, sizeof(struct v4l2_timecode)) ||
>  		put_user(kp->sequence, &up->sequence) ||
> -		put_user(kp->reserved2, &up->reserved2) ||
> +		put_user(kp->fence_fd, &up->fence_fd) ||
>  		put_user(kp->reserved, &up->reserved) ||
>  		put_user(kp->length, &up->length))
>  			return -EFAULT;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 0c0669976bdc..110fb45fef6f 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -203,7 +203,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>  	b->timestamp = ns_to_timeval(vb->timestamp);
>  	b->timecode = vbuf->timecode;
>  	b->sequence = vbuf->sequence;
> -	b->reserved2 = 0;
> +	b->fence_fd = -1;
>  	b->reserved = 0;
>  
>  	if (q->is_multiplanar) {
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index cd6fc1387f47..311c3a331eda 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -925,7 +925,10 @@ struct v4l2_buffer {
>  		__s32		fd;
>  	} m;
>  	__u32			length;
> -	__u32			reserved2;
> +	union {
> +		__s32		fence_fd;
> +		__u32		reserved2;
> +	};
>  	__u32			reserved;
>  };
>  
> @@ -962,6 +965,8 @@ struct v4l2_buffer {
>  #define V4L2_BUF_FLAG_TSTAMP_SRC_SOE		0x00010000
>  /* mem2mem encoder/decoder */
>  #define V4L2_BUF_FLAG_LAST			0x00100000
> +#define V4L2_BUF_FLAG_IN_FENCE			0x00200000
> +#define V4L2_BUF_FLAG_OUT_FENCE			0x00400000
>  
>  /**
>   * struct v4l2_exportbuffer - export of video buffer as DMABUF file descriptor
> 


* Re: [RFC v5 10/11] [media] vb2: add out-fence support to QBUF
  2017-11-15 17:10 ` [RFC v5 10/11] [media] vb2: add out-fence support to QBUF Gustavo Padovan
  2017-11-17  7:38   ` Alexandre Courbot
@ 2017-11-17 13:34   ` Hans Verkuil
  1 sibling, 0 replies; 48+ messages in thread
From: Hans Verkuil @ 2017-11-17 13:34 UTC (permalink / raw)
  To: Gustavo Padovan, linux-media
  Cc: Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On 15/11/17 18:10, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> If V4L2_BUF_FLAG_OUT_FENCE flag is present on the QBUF call we create
> an out_fence and send its fd to userspace on the fence_fd field as a
> return arg for the QBUF call.
> 
> The fence is signaled on buffer_done(), when the job on the buffer is
> finished.
> 
> With out-fences we do not allow drivers to requeue buffers through vb2;
> instead we flag an error on the buffer, signal its fence with an error and
> return it to userspace.
> 
> v6
> 	- get rid of the V4L2_EVENT_OUT_FENCE event. We always keep the
> 	ordering in vb2 for queueing in the driver, so the event is not
> 	necessary anymore and the out_fence_fd is sent back to userspace
> 	on QBUF call return arg
> 	- do not allow requeueing with out-fences, instead mark the buffer
> 	with an error and wake up to userspace.
> 	- send the out_fence_fd back to userspace on the fence_fd field
> 
> v5:
> 	- delay fd_install to DQ_EVENT (Hans)
> 	- if queue is fully ordered send OUT_FENCE event right away
> 	(Brian)
> 	- rename 'q->ordered' to 'q->ordered_in_driver'
> 	- merge change to implement OUT_FENCE event here
> 
> v4:
> 	- return the out_fence_fd in the BUF_QUEUED event(Hans)
> 
> v3:	- add WARN_ON_ONCE(q->ordered) on requeueing (Hans)
> 	- set the OUT_FENCE flag if there is a fence pending (Hans)
> 	- call fd_install() after vb2_core_qbuf() (Hans)
> 	- clean up fence if vb2_core_qbuf() fails (Hans)
> 	- add list to store sync_file and fence for the next queued buffer
> 
> v2: check if the queue is ordered.
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/videobuf2-core.c | 42 +++++++++++++++++++++++++++++---
>  drivers/media/v4l2-core/videobuf2-v4l2.c | 21 +++++++++++++++-
>  2 files changed, 58 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index 8b4f0e9bcb36..2eb5ffa8e028 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -354,6 +354,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  			vb->planes[plane].length = plane_sizes[plane];
>  			vb->planes[plane].min_length = plane_sizes[plane];
>  		}
> +		vb->out_fence_fd = -1;
>  		q->bufs[vb->index] = vb;
>  
>  		/* Allocate video buffer memory for the MMAP type */
> @@ -934,10 +935,26 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>  	case VB2_BUF_STATE_QUEUED:
>  		return;
>  	case VB2_BUF_STATE_REQUEUEING:
> -		if (q->start_streaming_called)
> -			__enqueue_in_driver(vb);
> -		return;
> +
> +		if (!vb->out_fence) {
> +			if (q->start_streaming_called)
> +				__enqueue_in_driver(vb);
> +			return;
> +		}
> +
> +		/* Do not allow requeuing with explicit synchronization,
> +		 * report it as an error to userspace */
> +		state = VB2_BUF_STATE_ERROR;

No, don't do this. Drivers that use requeueing simply cannot use fences.
I.e., it should be a WARN_ON_ONCE(), then continue with the original
code. The only mainline driver that requeues is the cobalt PCI driver
and it is OK to disable fences there. I maintain that one and I can take
a look later whether I can replace the requeuing with something better.

Regards,

	Hans

> +
> +		/* fall through */
>  	default:
> +		if (state == VB2_BUF_STATE_ERROR)
> +			dma_fence_set_error(vb->out_fence, -EFAULT);
> +		dma_fence_signal(vb->out_fence);
> +		dma_fence_put(vb->out_fence);
> +		vb->out_fence = NULL;
> +		vb->out_fence_fd = -1;
> +
>  		/* Inform any processes that may be waiting for buffers */
>  		wake_up(&q->done_wq);
>  		break;
> @@ -1325,12 +1342,18 @@ EXPORT_SYMBOL_GPL(vb2_core_prepare_buf);
>  int vb2_setup_out_fence(struct vb2_queue *q, unsigned int index)
>  {
>  	struct vb2_buffer *vb;
> +	u64 context;
>  
>  	vb = q->bufs[index];
>  
>  	vb->out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
>  
> -	vb->out_fence = vb2_fence_alloc(q->out_fence_context);
> +	if (q->ordered_in_driver)
> +		context = q->out_fence_context;
> +	else
> +		context = dma_fence_context_alloc(1);
> +
> +	vb->out_fence = vb2_fence_alloc(context);
>  	if (!vb->out_fence) {
>  		put_unused_fd(vb->out_fence_fd);
>  		return -ENOMEM;
> @@ -1594,6 +1617,9 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
>  	if (pb)
>  		call_void_bufop(q, fill_user_buffer, vb, pb);
>  
> +	fd_install(vb->out_fence_fd, vb->sync_file->file);
> +	vb->sync_file = NULL;
> +
>  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>  	return 0;
>  
> @@ -1860,6 +1886,12 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>  	}
>  
>  	/*
> +	 * Renew out-fence context.
> +	 */
> +	if (q->ordered_in_driver)
> +		q->out_fence_context = dma_fence_context_alloc(1);
> +
> +	/*
>  	 * Remove all buffers from videobuf's list...
>  	 */
>  	INIT_LIST_HEAD(&q->queued_list);
> @@ -2191,6 +2223,8 @@ int vb2_core_queue_init(struct vb2_queue *q)
>  	spin_lock_init(&q->done_lock);
>  	mutex_init(&q->mmap_lock);
>  	init_waitqueue_head(&q->done_wq);
> +	if (q->ordered_in_driver)
> +		q->out_fence_context = dma_fence_context_alloc(1);
>  
>  	if (q->buf_struct_size == 0)
>  		q->buf_struct_size = sizeof(struct vb2_buffer);
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 4c09ea007d90..f2e60d2908ae 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -212,7 +212,13 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>  	b->sequence = vbuf->sequence;
>  	b->reserved = 0;
>  
> -	b->fence_fd = -1;
> +	if (vb->out_fence) {
> +		b->flags |= V4L2_BUF_FLAG_OUT_FENCE;
> +		b->fence_fd = vb->out_fence_fd;
> +	} else {
> +		b->fence_fd = -1;
> +	}
> +
>  	if (vb->in_fence)
>  		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
>  	else
> @@ -489,6 +495,10 @@ int vb2_querybuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  	ret = __verify_planes_array(vb, b);
>  	if (!ret)
>  		vb2_core_querybuf(q, b->index, b);
> +
> +	/* Do not return the out-fence fd on querybuf */
> +	if (vb->out_fence)
> +		b->fence_fd = -1;
>  	return ret;
>  }
>  EXPORT_SYMBOL(vb2_querybuf);
> @@ -593,6 +603,15 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  		}
>  	}
>  
> +	if (b->flags & V4L2_BUF_FLAG_OUT_FENCE) {
> +		ret = vb2_setup_out_fence(q, b->index);
> +		if (ret) {
> +			dprintk(1, "failed to set up out-fence\n");
> +			dma_fence_put(fence);
> +			return ret;
> +		}
> +	}
> +
>  	return vb2_core_qbuf(q, b->index, b, fence);
>  }
>  EXPORT_SYMBOL_GPL(vb2_qbuf);
> 


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:12     ` Gustavo Padovan
@ 2017-11-17 13:47       ` Mauro Carvalho Chehab
  2017-11-17 17:20         ` Gustavo Padovan
  0 siblings, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 13:47 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Fri, 17 Nov 2017 11:12:48 -0200
Gustavo Padovan <gustavo@padovan.org> escreveu:

> > >  	/*
> > >  	 * If streamon has been called, and we haven't yet called
> > >  	 * start_streaming() since not enough buffers were queued, and
> > >  	 * we now have reached the minimum number of queued buffers,
> > >  	 * then we can finally call start_streaming().
> > > -	 *
> > > -	 * If already streaming, give the buffer to driver for processing.
> > > -	 * If not, the buffer will be given to driver on next streamon.
> > >  	 */
> > >  	if (q->streaming && !q->start_streaming_called &&
> > > -	    q->queued_count >= q->min_buffers_needed) {
> > > +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {  
> > 
> > I guess the case where fences is not used is not covered here.
> > 
> > You probably should add a check at __get_num_ready_buffers(q)
> > as well, making it just return q->queued_count if fences isn't
> > used.  
> 
> We can't know that beforehand, some buffer ahead may have a fence,
> but there is already a check for !fence for each buffer. If none of
> them have fences the return will be equal to q->queued_count.

Hmm... are we willing to support the case where just some
buffers have fences? Why?

In any case, we should add a note to the documentation describing
what happens if not all buffers have fences, whether fences
must be set up before streaming starts, and whether it is
possible to change the fence behavior dynamically while streaming.

-- 
Thanks,
Mauro


* Re: [RFC v5 06/11] [media] vb2: add explicit fence user API
  2017-11-17 13:29   ` Hans Verkuil
@ 2017-11-17 13:53     ` Mauro Carvalho Chehab
  2017-11-17 14:31       ` Hans Verkuil
  0 siblings, 1 reply; 48+ messages in thread
From: Mauro Carvalho Chehab @ 2017-11-17 13:53 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Gustavo Padovan, linux-media, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

Em Fri, 17 Nov 2017 14:29:23 +0100
Hans Verkuil <hverkuil@xs4all.nl> escreveu:

> On 15/11/17 18:10, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > Turn the reserved2 field into fence_fd that we will use to send
> > an in-fence to the kernel and return an out-fence from the kernel to
> > userspace.
> > 
> > Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
> > when sending a fence to the kernel to be waited on, and
> > V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
> > 
> > v4:
> > 	- make it a union with reserved2 and fence_fd (Hans Verkuil)
> > 
> > v3:
> > 	- make the out_fence refer to the current buffer (Hans Verkuil)
> > 
> > v2: add documentation
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  Documentation/media/uapi/v4l/buffer.rst       | 15 +++++++++++++++
> >  drivers/media/usb/cpia2/cpia2_v4l.c           |  2 +-
> >  drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  4 ++--
> >  drivers/media/v4l2-core/videobuf2-v4l2.c      |  2 +-
> >  include/uapi/linux/videodev2.h                |  7 ++++++-
> >  5 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
> > index ae6ee73f151c..eeefbd2547e7 100644
> > --- a/Documentation/media/uapi/v4l/buffer.rst
> > +++ b/Documentation/media/uapi/v4l/buffer.rst
> > @@ -648,6 +648,21 @@ Buffer Flags
> >        - Start Of Exposure. The buffer timestamp has been taken when the
> >  	exposure of the frame has begun. This is only valid for the
> >  	``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
> > +    * .. _`V4L2-BUF-FLAG-IN-FENCE`:
> > +
> > +      - ``V4L2_BUF_FLAG_IN_FENCE``
> > +      - 0x00200000
> > +      - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
> > +	won't be queued to the driver until the fence signals.
> > +
> > +    * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
> > +
> > +      - ``V4L2_BUF_FLAG_OUT_FENCE``
> > +      - 0x00400000
> > +      - Request a fence to be attached to the buffer. The ``fence_fd``
> > +	field on
> > +	:ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
> > +	fd to userspace.  
> 
> How would userspace know if fences are not supported? E.g. any driver that does
> not use vb2 will have no support for it.
> 
> While the driver could clear the flag on return, the problem is that it is a bit
> late for applications to discover lack of fence support.
> 
> Perhaps we do need a capability flag for this? I wonder what others think.

We're almost running out of flags at v4l2 caps (and at struct v4l2_buffer).

So, I would prefer not to add more flags to those structs if there is
another way.

As the out-of-order flags for fences should go to ENUM_FMT (and currently
there's just one flag defined there), I wonder if it would make sense
to also add CAN_IN_FENCES/CAN_OUT_FENCES flags there, as we may
want to enable/disable fences based on the format.



-- 
Thanks,
Mauro


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
  2017-11-17  6:49   ` Alexandre Courbot
  2017-11-17 12:53   ` Mauro Carvalho Chehab
@ 2017-11-17 14:15   ` Hans Verkuil
  2017-11-17 17:40     ` Gustavo Padovan
  2 siblings, 1 reply; 48+ messages in thread
From: Hans Verkuil @ 2017-11-17 14:15 UTC (permalink / raw)
  To: Gustavo Padovan, linux-media
  Cc: Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On 15/11/17 18:10, Gustavo Padovan wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
> 
> Receive in-fences from userspace and add support for waiting on them
> before queueing the buffer to the driver. A buffer can't be queued to the
> driver before its fence signals, and buffers can't be queued to the driver
> out of the order in which they were queued from userspace. That means that
> even if its fence signals, a buffer must wait for all other buffers ahead
> of it in the queue to signal first.
> 
> To make that possible we use fence_array to keep that ordering. Basically
> we create a fence_array that contains both the current fence and the fence
> from the previous buffer (which might be a fence array as well). The base
> fence class for the fence_array becomes the new buffer fence, waiting on
> that one guarantees that it won't be queued out of order.
> 
> v6:
> 	- With fences always keep the order userspace queues the buffers.
> 	- Protect in_fence manipulation with a lock (Brian Starkey)
> 	- check if fences have the same context before adding a fence array
> 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> 
> v5:	- use fence_array to keep buffers ordered in vb2 core when
> 	needed (Brian Starkey)
> 	- keep backward compat on the reserved2 field (Brian Starkey)
> 	- protect fence callback removal with lock (Brian Starkey)
> 
> v4:
> 	- Add a comment about dma_fence_add_callback() not returning a
> 	error (Hans)
> 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> 	vb2_start_streaming() (Hans)
> 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> 	- Queue buffers to the driver as soon as they are ready (Hans)
> 	- call fill_user_buffer() after queuing the buffer (Hans)
> 	- add err: label to clean up fence
> 	- add dma_fence_wait() before calling vb2_start_streaming()
> 
> v3:	- document fence parameter
> 	- remove ternary if at vb2_qbuf() return (Mauro)
> 	- do not change if conditions behaviour (Mauro)
> 
> v2:
> 	- fix vb2_queue_or_prepare_buf() ret check
> 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> 	- check num of ready buffers to start streaming
> 	- when queueing, start from the first ready buffer
> 	- handle queue cancel
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> ---
>  drivers/media/v4l2-core/Kconfig          |   1 +
>  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
>  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
>  include/media/videobuf2-core.h           |  17 ++-
>  4 files changed, 231 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> index a35c33686abf..3f988c407c80 100644
> --- a/drivers/media/v4l2-core/Kconfig
> +++ b/drivers/media/v4l2-core/Kconfig
> @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
>  # Used by drivers that need Videobuf2 modules
>  config VIDEOBUF2_CORE
>  	select DMA_SHARED_BUFFER
> +	select SYNC_FILE
>  	tristate
>  
>  config VIDEOBUF2_MEMOPS
> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> index 60f8b582396a..26de4c80717d 100644
> --- a/drivers/media/v4l2-core/videobuf2-core.c
> +++ b/drivers/media/v4l2-core/videobuf2-core.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/dma-fence-array.h>
>  
>  #include <media/videobuf2-core.h>
>  #include <media/v4l2-mc.h>
> @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  		vb->index = q->num_buffers + buffer;
>  		vb->type = q->type;
>  		vb->memory = memory;
> +		spin_lock_init(&vb->fence_cb_lock);
>  		for (plane = 0; plane < num_planes; ++plane) {
>  			vb->planes[plane].length = plane_sizes[plane];
>  			vb->planes[plane].min_length = plane_sizes[plane];
> @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>  {
>  	struct vb2_queue *q = vb->vb2_queue;
>  
> +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> +		return;
> +
>  	vb->state = VB2_BUF_STATE_ACTIVE;
>  	atomic_inc(&q->owned_by_drv_count);
>  
> @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
>  	return 0;
>  }
>  
> +static int __get_num_ready_buffers(struct vb2_queue *q)
> +{
> +	struct vb2_buffer *vb;
> +	int ready_count = 0;
> +	unsigned long flags;
> +
> +	/* count num of buffers ready in front of the queued_list */
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> +			ready_count++;

Shouldn't there be a:

		else
			break;

here? You're counting the number of available (i.e. no fence or signaled
fence) buffers at the start of the queued buffer list.
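
The counting rule suggested here can be modeled outside the kernel. This is a sketch under stated assumptions: `struct queued_buf` and `num_ready_at_head()` are hypothetical, and the spinlock and `dma_fence_is_signaled()` call from the patch are reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one entry on the queued list. */
struct queued_buf {
	bool has_in_fence;   /* buffer was queued with an in-fence */
	bool fence_signaled; /* that fence has already signaled */
};

/* Count the ready buffers at the head of the queue, stopping at the
 * first buffer that still waits on an unsignaled fence. */
static int num_ready_at_head(const struct queued_buf *q, size_t n)
{
	int ready = 0;

	assert(q != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (q[i].has_in_fence && !q[i].fence_signaled)
			break; /* everything behind this buffer must wait */
		ready++;
	}
	return ready;
}
```

Without the `break`, a ready buffer sitting behind a blocked one would be counted even though it cannot be handed to the driver yet.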

> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	return ready_count;
> +}
> +
>  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>  {
>  	struct vb2_buffer *vb;
> @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
>  	return ret;
>  }
>  
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> +					struct vb2_buffer *vb,
> +					struct dma_fence *fence)
> +{
> +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
> +	/*
> +	 * We always guarantee the ordering of buffers queued from
> +	 * userspace to be the same it is queued to the driver. For that
> +	 * we create a fence array with the fence from the last queued
> +	 * buffer and this one, that way the fence for this buffer can't
> +	 * signal before the last one.
> +	 */
> +	if (fence && q->last_fence) {
> +		struct dma_fence **fences;
> +		struct dma_fence_array *arr;
> +
> +		if (fence->context == q->last_fence->context) {
> +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> +				dma_fence_put(q->last_fence);
> +				q->last_fence = dma_fence_get(fence);
> +			} else {
> +				dma_fence_put(fence);
> +				fence = dma_fence_get(q->last_fence);
> +			}
> +			return fence;
> +		}
> +
> +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> +		if (!fences)
> +			return ERR_PTR(-ENOMEM);
> +
> +		fences[0] = fence;
> +		fences[1] = q->last_fence;
> +
> +		arr = dma_fence_array_create(2, fences,
> +					     dma_fence_context_alloc(1),
> +					     1, false);
> +		if (!arr) {
> +			kfree(fences);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +
> +		fence = &arr->base;
> +
> +		q->last_fence = dma_fence_get(fence);
> +	} else if (!fence && q->last_fence) {
> +		fence = dma_fence_get(q->last_fence);
> +	}

I have absolutely no idea what the purpose is of this code.

The application is queuing buffers with in-fences. Why would it matter in what
order the in-fences trigger? All vb2 needs is the minimum number of usable
(no fence or signaled fence) buffers counted from the start of the queued list
(i.e. a contiguous set of buffers with no unsignaled fences in between).

When that's reached it can start streaming.

When streaming you cannot enqueue buffers to the driver as long as the first
buffer in the queued list is still waiting on a signal. Once that arrives the
fence callback can enqueue all buffers from the start of the queued list until
it reaches the next unsignaled fence.

At least, that's the mechanism I have in my head and I think that's what users
expect.

It looks like you allow for buffers to be queued to the driver out-of-order,
but you shouldn't do this.

If userspace queues buffers A, B, C and D in that order, then that's also the
order they should be queued to the driver. Especially with the upcoming support
for the request API keeping the sequence intact is crucial.
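
The in-order mechanism described above could look roughly like the following userspace model. All names (`struct pending_buf`, `struct queue_model`, `enqueue_ready()`, `fence_signaled()`) are hypothetical, the queue is a fixed array rather than vb2's linked list, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFS 8

struct pending_buf {
	int  index;         /* buffer index as queued by userspace */
	bool fence_pending; /* in-fence has not signaled yet */
};

struct queue_model {
	struct pending_buf bufs[MAX_BUFS];
	size_t head;                   /* first buffer not yet given to the driver */
	size_t count;                  /* number of buffers queued by userspace */
	int    driver_order[MAX_BUFS]; /* order in which the driver received them */
	size_t driver_count;
};

/* Hand buffers to the driver from the head of the queue until the
 * first buffer that still waits on its fence. */
static void enqueue_ready(struct queue_model *q)
{
	while (q->head < q->count && !q->bufs[q->head].fence_pending) {
		assert(q->driver_count < MAX_BUFS);
		q->driver_order[q->driver_count++] = q->bufs[q->head].index;
		q->head++;
	}
}

/* Fence callback: mark the buffer ready, then drain from the head. */
static void fence_signaled(struct queue_model *q, size_t pos)
{
	q->bufs[pos].fence_pending = false;
	enqueue_ready(q);
}
```

If A and C carry fences and C's fence signals first, nothing reaches the driver; once A's fence signals, A, B, C and D are handed over in exactly that order.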

> +
> +	return fence;
> +}
> +
> +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> +	struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
> +	struct vb2_queue *q = vb->vb2_queue;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (!vb->in_fence) {
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +		return;
> +	}
> +
> +	dma_fence_put(vb->in_fence);
> +	vb->in_fence = NULL;
> +
> +	if (q->start_streaming_called)
> +		__enqueue_in_driver(vb);
> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +}
> +
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence)
>  {
>  	struct vb2_buffer *vb;
> +	unsigned long flags;
>  	int ret;
>  
>  	vb = q->bufs[index];
> @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>  	case VB2_BUF_STATE_DEQUEUED:
>  		ret = __buf_prepare(vb, pb);
>  		if (ret)
> -			return ret;
> +			goto err;
>  		break;
>  	case VB2_BUF_STATE_PREPARED:
>  		break;
>  	case VB2_BUF_STATE_PREPARING:
>  		dprintk(1, "buffer still being prepared\n");
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	default:
>  		dprintk(1, "invalid buffer state %d\n", vb->state);
> -		return -EINVAL;
> +		ret = -EINVAL;
> +		goto err;
>  	}
>  
>  	/*
> @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>  
>  	trace_vb2_qbuf(q, vb);
>  
> +	vb->in_fence = __set_in_fence(q, vb, fence);
> +	if (IS_ERR(vb->in_fence)) {
> +		dma_fence_put(fence);
> +		ret = PTR_ERR(vb->in_fence);
> +		goto err;
> +	}
> +	fence = NULL;
> +
> +	/*
> +	 * If it is time to call vb2_start_streaming() wait for the fence
> +	 * to signal first. Of course, this happens only once per streaming.
> +	 * We want to run any step that might fail before we set the callback
> +	 * to queue the fence when it signals.
> +	 */
> +	if (vb->in_fence && !q->start_streaming_called &&
> +	    __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
> +		dma_fence_wait(vb->in_fence, true);

This is very wrong. This should be done in the fence callback: i.e. if we're not
streaming and __get_num_ready_buffers(q) >= q->min_buffers_needed, then call
vb2_start_streaming(q).

We definitely should not do a blocking qbuf here.
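
A non-blocking variant of the start-streaming decision might be sketched as below. This is a simplified model, not vb2 code: `struct stream_model` and `buffer_became_ready()` are hypothetical, the ready count is assumed to cover only the contiguous run at the head of the queue, and whether `start_streaming()` may really be called from fence-callback context is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct stream_model {
	int  min_buffers_needed;
	int  ready_at_head;          /* ready buffers at the head of the queue */
	bool streaming;              /* STREAMON has been called */
	bool start_streaming_called; /* the driver has been started */
};

/* Called when a buffer becomes ready, either from QBUF (no pending
 * fence) or from the fence callback. It never waits on a fence. */
static bool buffer_became_ready(struct stream_model *s)
{
	assert(s != NULL);

	s->ready_at_head++;
	if (s->streaming && !s->start_streaming_called &&
	    s->ready_at_head >= s->min_buffers_needed)
		s->start_streaming_called = true;

	return s->start_streaming_called;
}
```

The point of the sketch is that QBUF returns immediately in every case; the transition to streaming happens whenever the last required buffer becomes ready, regardless of which path noticed it.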

> +
>  	/*
>  	 * If streamon has been called, and we haven't yet called
>  	 * start_streaming() since not enough buffers were queued, and
>  	 * we now have reached the minimum number of queued buffers,
>  	 * then we can finally call start_streaming().
> -	 *
> -	 * If already streaming, give the buffer to driver for processing.
> -	 * If not, the buffer will be given to driver on next streamon.
>  	 */
>  	if (q->streaming && !q->start_streaming_called &&
> -	    q->queued_count >= q->min_buffers_needed) {
> +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {
>  		ret = vb2_start_streaming(q);
>  		if (ret)
> -			return ret;
> -	} else if (q->start_streaming_called) {
> -		__enqueue_in_driver(vb);
> +			goto err;
>  	}
>  
> +	/*
> +	 * For explicit synchronization: If the fence didn't signal
> +	 * yet we setup a callback to queue the buffer once the fence
> +	 * signals, and then, return successfully. But if the fence
> +	 * already signaled we lose the reference we held and queue the
> +	 * buffer to the driver.
> +	 */
> +	spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +	if (vb->in_fence) {
> +		ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
> +					     vb2_qbuf_fence_cb);
> +		if (ret == -EINVAL) {
> +			spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +			goto err;
> +		} else if (!ret) {
> +			goto fill;
> +		}
> +
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;
> +	}
> +
> +fill:
> +	/*
> +	 * If already streaming and there is no fence to wait on
> +	 * give the buffer to driver for processing.
> +	 */
> +	if (q->start_streaming_called && !vb->in_fence)
> +		__enqueue_in_driver(vb);
> +	spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +
>  	/* Fill buffer information for the userspace */
>  	if (pb)
>  		call_void_bufop(q, fill_user_buffer, vb, pb);
>  
>  	dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>  	return 0;
> +
> +err:
> +	if (vb->in_fence) {
> +		dma_fence_put(vb->in_fence);
> +		vb->in_fence = NULL;
> +	}
> +
> +	return ret;
> +
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_qbuf);
>  
> @@ -1632,6 +1787,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
>  static void __vb2_queue_cancel(struct vb2_queue *q)
>  {
>  	unsigned int i;
> +	struct vb2_buffer *vb;
> +	unsigned long flags;
>  
>  	/*
>  	 * Tell driver to stop all transactions and release all queued
> @@ -1659,6 +1816,21 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>  	q->queued_count = 0;
>  	q->error = 0;
>  
> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> +		if (vb->in_fence) {
> +			dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
> +			dma_fence_put(vb->in_fence);
> +			vb->in_fence = NULL;
> +		}
> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> +	}
> +
> +	if (q->last_fence) {
> +		dma_fence_put(q->last_fence);
> +		q->last_fence = NULL;
> +	}
> +
>  	/*
>  	 * Remove all buffers from videobuf's list...
>  	 */
> @@ -1720,7 +1892,7 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
>  	 * Tell driver to start streaming provided sufficient buffers
>  	 * are available.
>  	 */
> -	if (q->queued_count >= q->min_buffers_needed) {
> +	if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
>  		ret = v4l_vb2q_enable_media_source(q);
>  		if (ret)
>  			return ret;
> @@ -2240,7 +2412,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>  		 * Queue all buffers.
>  		 */
>  		for (i = 0; i < q->num_buffers; i++) {
> -			ret = vb2_core_qbuf(q, i, NULL);
> +			ret = vb2_core_qbuf(q, i, NULL, NULL);
>  			if (ret)
>  				goto err_reqbufs;
>  			fileio->bufs[i].queued = 1;
> @@ -2419,7 +2591,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
>  
>  		if (copy_timestamp)
>  			b->timestamp = ktime_get_ns();
> -		ret = vb2_core_qbuf(q, index, NULL);
> +		ret = vb2_core_qbuf(q, index, NULL, NULL);
>  		dprintk(5, "vb2_dbuf result: %d\n", ret);
>  		if (ret)
>  			return ret;
> @@ -2522,7 +2694,7 @@ static int vb2_thread(void *data)
>  		if (copy_timestamp)
>  			vb->timestamp = ktime_get_ns();;
>  		if (!threadio->stop)
> -			ret = vb2_core_qbuf(q, vb->index, NULL);
> +			ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
>  		call_void_qop(q, wait_prepare, q);
>  		if (ret || threadio->stop)
>  			break;
> diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
> index 110fb45fef6f..4c09ea007d90 100644
> --- a/drivers/media/v4l2-core/videobuf2-v4l2.c
> +++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
> @@ -23,6 +23,7 @@
>  #include <linux/sched.h>
>  #include <linux/freezer.h>
>  #include <linux/kthread.h>
> +#include <linux/sync_file.h>
>  
>  #include <media/v4l2-dev.h>
>  #include <media/v4l2-fh.h>
> @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
>  		return -EINVAL;
>  	}
>  
> +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&
> +	    !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
> +		dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
> +		return -EINVAL;
> +	}
> +
>  	return __verify_planes_array(q->bufs[b->index], b);
>  }
>  
> @@ -203,9 +210,14 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>  	b->timestamp = ns_to_timeval(vb->timestamp);
>  	b->timecode = vbuf->timecode;
>  	b->sequence = vbuf->sequence;
> -	b->fence_fd = -1;
>  	b->reserved = 0;
>  
> +	b->fence_fd = -1;
> +	if (vb->in_fence)
> +		b->flags |= V4L2_BUF_FLAG_IN_FENCE;
> +	else
> +		b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
> +
>  	if (q->is_multiplanar) {
>  		/*
>  		 * Fill in plane-related data if userspace provided an array
> @@ -560,6 +572,7 @@ EXPORT_SYMBOL_GPL(vb2_create_bufs);
>  
>  int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  {
> +	struct dma_fence *fence = NULL;
>  	int ret;
>  
>  	if (vb2_fileio_is_active(q)) {
> @@ -568,7 +581,19 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>  	}
>  
>  	ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> -	return ret ? ret : vb2_core_qbuf(q, b->index, b);
> +	if (ret)
> +		return ret;
> +
> +	if (b->flags & V4L2_BUF_FLAG_IN_FENCE) {
> +		fence = sync_file_get_fence(b->fence_fd);
> +		if (!fence) {
> +			dprintk(1, "failed to get in-fence from fd %d\n",
> +				b->fence_fd);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	return vb2_core_qbuf(q, b->index, b, fence);
>  }
>  EXPORT_SYMBOL_GPL(vb2_qbuf);
>  
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 38b9c8dd42c6..5f48c7be7770 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -16,6 +16,7 @@
>  #include <linux/mutex.h>
>  #include <linux/poll.h>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-fence.h>
>  
>  #define VB2_MAX_FRAME	(32)
>  #define VB2_MAX_PLANES	(8)
> @@ -254,11 +255,20 @@ struct vb2_buffer {
>  	 *			all buffers queued from userspace
>  	 * done_entry:		entry on the list that stores all buffers ready
>  	 *			to be dequeued to userspace
> +	 * in_fence:		fence receive from vb2 client to wait on before
> +	 *			using the buffer (queueing to the driver)
> +	 * fence_cb:		fence callback information
> +	 * fence_cb_lock:	protect callback signal/remove
>  	 */
>  	enum vb2_buffer_state	state;
>  
>  	struct list_head	queued_entry;
>  	struct list_head	done_entry;
> +
> +	struct dma_fence	*in_fence;
> +	struct dma_fence_cb	fence_cb;
> +	spinlock_t              fence_cb_lock;
> +
>  #ifdef CONFIG_VIDEO_ADV_DEBUG
>  	/*
>  	 * Counters for how often these buffer-related ops are
> @@ -504,6 +514,7 @@ struct vb2_buf_ops {
>   * @last_buffer_dequeued: used in poll() and DQBUF to immediately return if the
>   *		last decoded buffer was already dequeued. Set for capture queues
>   *		when a buffer with the V4L2_BUF_FLAG_LAST is dequeued.
> + * @last_fence:	last in-fence received. Used to keep ordering.
>   * @fileio:	file io emulator internal data, used only if emulator is active
>   * @threadio:	thread io internal data, used only if thread is active
>   */
> @@ -558,6 +569,8 @@ struct vb2_queue {
>  	unsigned int			copy_timestamp:1;
>  	unsigned int			last_buffer_dequeued:1;
>  
> +	struct dma_fence		*last_fence;
> +
>  	struct vb2_fileio_data		*fileio;
>  	struct vb2_threadio_data	*threadio;
>  
> @@ -733,6 +746,7 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
>   * @index:	id number of the buffer
>   * @pb:		buffer structure passed from userspace to vidioc_qbuf handler
>   *		in driver
> + * @fence:	in-fence to wait on before queueing the buffer
>   *
>   * Should be called from vidioc_qbuf ioctl handler of a driver.
>   * The passed buffer should have been verified.
> @@ -747,7 +761,8 @@ int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb);
>   * The return values from this function are intended to be directly returned
>   * from vidioc_qbuf handler in driver.
>   */
> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb);
> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
> +		  struct dma_fence *fence);
>  
>  /**
>   * vb2_core_dqbuf() - Dequeue a buffer to the userspace
> 

Regards,

	Hans


* Re: [RFC v5 06/11] [media] vb2: add explicit fence user API
  2017-11-17 13:53     ` Mauro Carvalho Chehab
@ 2017-11-17 14:31       ` Hans Verkuil
  0 siblings, 0 replies; 48+ messages in thread
From: Hans Verkuil @ 2017-11-17 14:31 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Gustavo Padovan, linux-media, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On 17/11/17 14:53, Mauro Carvalho Chehab wrote:
> Em Fri, 17 Nov 2017 14:29:23 +0100
> Hans Verkuil <hverkuil@xs4all.nl> escreveu:
> 
>> On 15/11/17 18:10, Gustavo Padovan wrote:
>>> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>>>
>>> Turn the reserved2 field into fence_fd that we will use to send
>>> an in-fence to the kernel and return an out-fence from the kernel to
>>> userspace.
>>>
>>> Two new flags were added, V4L2_BUF_FLAG_IN_FENCE, that should be used
>>> when sending a fence to the kernel to be waited on, and
>>> V4L2_BUF_FLAG_OUT_FENCE, to ask the kernel to give back an out-fence.
>>>
>>> v4:
>>> 	- make it a union with reserved2 and fence_fd (Hans Verkuil)
>>>
>>> v3:
>>> 	- make the out_fence refer to the current buffer (Hans Verkuil)
>>>
>>> v2: add documentation
>>>
>>> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
>>> ---
>>>  Documentation/media/uapi/v4l/buffer.rst       | 15 +++++++++++++++
>>>  drivers/media/usb/cpia2/cpia2_v4l.c           |  2 +-
>>>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  4 ++--
>>>  drivers/media/v4l2-core/videobuf2-v4l2.c      |  2 +-
>>>  include/uapi/linux/videodev2.h                |  7 ++++++-
>>>  5 files changed, 25 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
>>> index ae6ee73f151c..eeefbd2547e7 100644
>>> --- a/Documentation/media/uapi/v4l/buffer.rst
>>> +++ b/Documentation/media/uapi/v4l/buffer.rst
>>> @@ -648,6 +648,21 @@ Buffer Flags
>>>        - Start Of Exposure. The buffer timestamp has been taken when the
>>>  	exposure of the frame has begun. This is only valid for the
>>>  	``V4L2_BUF_TYPE_VIDEO_CAPTURE`` buffer type.
>>> +    * .. _`V4L2-BUF-FLAG-IN-FENCE`:
>>> +
>>> +      - ``V4L2_BUF_FLAG_IN_FENCE``
>>> +      - 0x00200000
>>> +      - Ask V4L2 to wait on fence passed in ``fence_fd`` field. The buffer
>>> +	won't be queued to the driver until the fence signals.
>>> +
>>> +    * .. _`V4L2-BUF-FLAG-OUT-FENCE`:
>>> +
>>> +      - ``V4L2_BUF_FLAG_OUT_FENCE``
>>> +      - 0x00400000
>>> +      - Request a fence to be attached to the buffer. The ``fence_fd``
>>> +	field on
>>> +	:ref:`VIDIOC_QBUF` is used as a return argument to send the out-fence
>>> +	fd to userspace.  
>>
>> How would userspace know if fences are not supported? E.g. any driver that does
>> not use vb2 will have no support for it.
>>
>> While the driver could clear the flag on return, the problem is that it is a bit
>> late for applications to discover lack of fence support.
>>
>> Perhaps we do need a capability flag for this? I wonder what others think.
> 
> We're almost running out of flags at v4l2 caps (and at struct v4l2_buffer).

struct v4l2_capability has more than enough room to add a new device_caps2 field.
So I see no problem there, and it is very useful for applications to know what
features are supported up front and not when you start to use them.

Think about it: you're setting up complete fence support in your application, only
to discover when you queue the first buffer that there is no fence support! That
doesn't work.

The reserved[] array wasn't added for nothing to v4l2_capability.
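
Roughly what I have in mind from the application side, as a sketch only: the
`device_caps2` field and the `TOY_CAP2_FENCES` bit are hypothetical (nothing
like them exists in the uapi today), and the struct is a cut-down stand-in for
struct v4l2_capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: neither this bit nor a device_caps2 field exists in the uapi. */
#define TOY_CAP2_FENCES	(1u << 0)

/* Cut-down stand-in for struct v4l2_capability, for illustration only. */
struct toy_capability {
	uint32_t capabilities;
	uint32_t device_caps;
	uint32_t device_caps2;	/* assumed to be carved out of reserved[] */
};

/* Decide up front whether the application should build its fence path. */
bool toy_fences_supported(const struct toy_capability *cap)
{
	assert(cap);
	return (cap->device_caps2 & TOY_CAP2_FENCES) != 0;
}
```

The application would call this once after querying the capabilities and pick
the fence or non-fence code path before any buffer is queued.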

struct v4l2_buffer is indeed very full. But I posted an RFC on October 26 introducing
a struct v4l2_ext_buffer, designed from scratch. We can switch to a u64 flags there.

See here: https://www.mail-archive.com/linux-media@vger.kernel.org/msg121215.html

I am waiting for fences and the request API to go in before continuing with that
RFC series, unless we think it is better to only support fences/request API with
this redesign. Let me know and I can pick up development of that RFC.

> 
> So, I would prefer to not add more flags on those structs if there is
> another way.
> 
> As the fences out of order flags should go to ENUM_FMT (and, currently
> there's just one flag defined there), I wonder if it would make sense
> to also add CAN_IN_FENCES/CAN_OUT_FENCES flags there, as maybe we
> would want to disable/enable fences based on the format.

I don't see a reason for that. There is no relationship between formats
and fences. Fences apply to buffers, not formats. Whereas the 'ordered'
value can be specific to a format.

Regards,

	Hans


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:47       ` Mauro Carvalho Chehab
@ 2017-11-17 17:20         ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 17:20 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, Hans Verkuil, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:

> Em Fri, 17 Nov 2017 11:12:48 -0200
> Gustavo Padovan <gustavo@padovan.org> escreveu:
> 
> > > >  	/*
> > > >  	 * If streamon has been called, and we haven't yet called
> > > >  	 * start_streaming() since not enough buffers were queued, and
> > > >  	 * we now have reached the minimum number of queued buffers,
> > > >  	 * then we can finally call start_streaming().
> > > > -	 *
> > > > -	 * If already streaming, give the buffer to driver for processing.
> > > > -	 * If not, the buffer will be given to driver on next streamon.
> > > >  	 */
> > > >  	if (q->streaming && !q->start_streaming_called &&
> > > > -	    q->queued_count >= q->min_buffers_needed) {
> > > > +	    __get_num_ready_buffers(q) >= q->min_buffers_needed) {  
> > > 
> > > I guess the case where fences is not used is not covered here.
> > > 
> > > You probably should add a check at __get_num_ready_buffers(q)
> > > as well, making it just return q->queued_count if fences isn't
> > > used.  
> > 
> > We can't know that beforehand, some buffer ahead may have a fence,
> > but there is already a check for !fence for each buffer. If none of
> > them have fences the return will be equal to q->queued_count.
> 
> Hmm... are we willing to support the case where just some
> buffers have fences? Why?

It may be that some fences are already signaled before the QBUF call
happens, so the app may just pass -1 instead.

> In any case, we should add a notice at the documentation telling
> about what happens if not all buffers have fences, and if fences
> are required to be setup before starting streaming, or if it is
> possible to dynamically change the fence behavior while streaming.

We don't have such a thing. The fence behavior is tied to each QBUF call;
the stream can be set up without knowing whether fences are going to be
used at all. I think we need a good reason to do otherwise. Still, I can
add something to the docs saying that fences are exclusively per QBUF
call.
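
A sketch of what that looks like from the application side, assuming the flag
value from patch 06 and using a cut-down stand-in for struct v4l2_buffer (the
`toy_*` names are made up for illustration): each QBUF decides on its own
whether it carries an in-fence.

```c
#include <assert.h>
#include <stdint.h>

/* Flag value as proposed in patch 06 of this series. */
#define TOY_BUF_FLAG_IN_FENCE	0x00200000u

/* Cut-down stand-in for struct v4l2_buffer. */
struct toy_buffer {
	uint32_t index;
	uint32_t flags;
	int32_t fence_fd;
};

/*
 * Prepare one QBUF. A negative fd means "no fence for this buffer",
 * e.g. because the producer already finished.
 */
void toy_prepare_qbuf(struct toy_buffer *b, uint32_t index, int fence_fd)
{
	assert(b);
	b->index = index;
	if (fence_fd >= 0) {
		b->flags |= TOY_BUF_FLAG_IN_FENCE;
		b->fence_fd = fence_fd;
	} else {
		b->flags &= ~TOY_BUF_FLAG_IN_FENCE;
		b->fence_fd = -1;
	}
}
```

Two consecutive calls can mix both cases freely; no queue-wide state is
involved.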

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 14:15   ` Hans Verkuil
@ 2017-11-17 17:40     ` Gustavo Padovan
  2017-11-17 17:50       ` Gustavo Padovan
  2017-11-18  9:30       ` Hans Verkuil
  0 siblings, 2 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 17:40 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: linux-media, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Hans Verkuil <hverkuil@xs4all.nl>:

> On 15/11/17 18:10, Gustavo Padovan wrote:
> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > 
> > Receive in-fence from userspace and add support for waiting on them
> > before queueing the buffer to the driver. Buffers can't be queued to the
> > driver before its fences signal. And a buffer can't be queue to the driver
> > out of the order they were queued from userspace. That means that even if
> > it fence signal it must wait all other buffers, ahead of it in the queue,
> > to signal first.
> > 
> > To make that possible we use fence_array to keep that ordering. Basically
> > we create a fence_array that contains both the current fence and the fence
> > from the previous buffer (which might be a fence array as well). The base
> > fence class for the fence_array becomes the new buffer fence, waiting on
> > that one guarantees that it won't be queued out of order.
> > 
> > v6:
> > 	- With fences always keep the order userspace queues the buffers.
> > 	- Protect in_fence manipulation with a lock (Brian Starkey)
> > 	- check if fences have the same context before adding a fence array
> > 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> > 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> > 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> > 
> > v5:	- use fence_array to keep buffers ordered in vb2 core when
> > 	needed (Brian Starkey)
> > 	- keep backward compat on the reserved2 field (Brian Starkey)
> > 	- protect fence callback removal with lock (Brian Starkey)
> > 
> > v4:
> > 	- Add a comment about dma_fence_add_callback() not returning a
> > 	error (Hans)
> > 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> > 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> > 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> > 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> > 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> > 	vb2_start_streaming() (Hans)
> > 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> > 	- Queue buffers to the driver as soon as they are ready (Hans)
> > 	- call fill_user_buffer() after queuing the buffer (Hans)
> > 	- add err: label to clean up fence
> > 	- add dma_fence_wait() before calling vb2_start_streaming()
> > 
> > v3:	- document fence parameter
> > 	- remove ternary if at vb2_qbuf() return (Mauro)
> > 	- do not change if conditions behaviour (Mauro)
> > 
> > v2:
> > 	- fix vb2_queue_or_prepare_buf() ret check
> > 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> > 	- check num of ready buffers to start streaming
> > 	- when queueing, start from the first ready buffer
> > 	- handle queue cancel
> > 
> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > ---
> >  drivers/media/v4l2-core/Kconfig          |   1 +
> >  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
> >  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
> >  include/media/videobuf2-core.h           |  17 ++-
> >  4 files changed, 231 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> > index a35c33686abf..3f988c407c80 100644
> > --- a/drivers/media/v4l2-core/Kconfig
> > +++ b/drivers/media/v4l2-core/Kconfig
> > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> >  # Used by drivers that need Videobuf2 modules
> >  config VIDEOBUF2_CORE
> >  	select DMA_SHARED_BUFFER
> > +	select SYNC_FILE
> >  	tristate
> >  
> >  config VIDEOBUF2_MEMOPS
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> > index 60f8b582396a..26de4c80717d 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/sched.h>
> >  #include <linux/freezer.h>
> >  #include <linux/kthread.h>
> > +#include <linux/dma-fence-array.h>
> >  
> >  #include <media/videobuf2-core.h>
> >  #include <media/v4l2-mc.h>
> > @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
> >  		vb->index = q->num_buffers + buffer;
> >  		vb->type = q->type;
> >  		vb->memory = memory;
> > +		spin_lock_init(&vb->fence_cb_lock);
> >  		for (plane = 0; plane < num_planes; ++plane) {
> >  			vb->planes[plane].length = plane_sizes[plane];
> >  			vb->planes[plane].min_length = plane_sizes[plane];
> > @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
> >  {
> >  	struct vb2_queue *q = vb->vb2_queue;
> >  
> > +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> > +		return;
> > +
> >  	vb->state = VB2_BUF_STATE_ACTIVE;
> >  	atomic_inc(&q->owned_by_drv_count);
> >  
> > @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
> >  	return 0;
> >  }
> >  
> > +static int __get_num_ready_buffers(struct vb2_queue *q)
> > +{
> > +	struct vb2_buffer *vb;
> > +	int ready_count = 0;
> > +	unsigned long flags;
> > +
> > +	/* count num of buffers ready in front of the queued_list */
> > +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> > +			ready_count++;
> 
> Shouldn't there be a:
> 
> 		else
> 			break;
> 
> here? You're counting the number of available (i.e. no fence or signaled
> fence) buffers at the start of the queued buffer list.

Sure.

> 
> > +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
> > +	}
> > +
> > +	return ready_count;
> > +}
> > +
> >  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
> >  {
> >  	struct vb2_buffer *vb;
> > @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
> >  	return ret;
> >  }
> >  
> > -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> > +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
> > +					struct vb2_buffer *vb,
> > +					struct dma_fence *fence)
> > +{
> > +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
> > +		dma_fence_put(q->last_fence);
> > +		q->last_fence = NULL;
> > +	}
> > +
> > +	/*
> > +	 * We always guarantee the ordering of buffers queued from
> > +	 * userspace to be the same it is queued to the driver. For that
> > +	 * we create a fence array with the fence from the last queued
> > +	 * buffer and this one, that way the fence for this buffer can't
> > +	 * signal before the last one.
> > +	 */
> > +	if (fence && q->last_fence) {
> > +		struct dma_fence **fences;
> > +		struct dma_fence_array *arr;
> > +
> > +		if (fence->context == q->last_fence->context) {
> > +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
> > +				dma_fence_put(q->last_fence);
> > +				q->last_fence = dma_fence_get(fence);
> > +			} else {
> > +				dma_fence_put(fence);
> > +				fence = dma_fence_get(q->last_fence);
> > +			}
> > +			return fence;
> > +		}
> > +
> > +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
> > +		if (!fences)
> > +			return ERR_PTR(-ENOMEM);
> > +
> > +		fences[0] = fence;
> > +		fences[1] = q->last_fence;
> > +
> > +		arr = dma_fence_array_create(2, fences,
> > +					     dma_fence_context_alloc(1),
> > +					     1, false);
> > +		if (!arr) {
> > +			kfree(fences);
> > +			return ERR_PTR(-ENOMEM);
> > +		}
> > +
> > +		fence = &arr->base;
> > +
> > +		q->last_fence = dma_fence_get(fence);
> > +	} else if (!fence && q->last_fence) {
> > +		fence = dma_fence_get(q->last_fence);
> > +	}
> 
> I have absolutely no idea what the purpose is of this code.
> 
> The application is queuing buffers with in-fences. Why would it matter in what
> order the in-fences trigger? All vb2 needs is the minimum number of usable
> (no fence or signaled fence) buffers counted from the start of the queued list
> (i.e. a contiguous set of buffers with no unsignaled fences in between).
> 
> When that's reached it can start streaming.
> 
> When streaming you cannot enqueue buffers to the driver as long as the first
> buffer in the queued list is still waiting on a signal. Once that arrives the
> fence callback can enqueue all buffers from the start of the queued list until
> it reaches the next unsignaled fence.
> 
> At least, that's the mechanism I have in my head and I think that's what users
> expect.
> 
> It looks like you allow for buffers to be queued to the driver out-of-order,
> but you shouldn't do this.
> 
> If userspace queues buffers A, B, C and D in that order, then that's also the
> order they should be queued to the driver. Especially with the upcoming support
> for the request API keeping the sequence intact is crucial.

And that is exactly what this code is for: it keeps the ordering using
fence magic. We combine the fence of the previous buffer with the new
one; that combination is called a fence_array. The fence_array only
signals after both fences signal. That is enough to make sure it won't
signal before any previously queued fence.
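
To illustrate the idea with a toy model (plain C, not the dma_fence API; all
`toy_*` names are invented for this sketch): a chained fence reports signaled
only when both its own fence and the previous buffer's fence have signaled, so
a later buffer can never become ready before an earlier one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy fence: either a leaf with its own state, or a pair that is
 * signaled only when both members are. Not the dma_fence API.
 */
struct toy_fence {
	bool signaled;			/* leaf state, unused for a pair */
	const struct toy_fence *cur;	/* NULL for a leaf */
	const struct toy_fence *prev;	/* NULL for a leaf */
};

/* A leaf reports its own state; a pair needs both members signaled. */
bool toy_fence_is_signaled(const struct toy_fence *f)
{
	assert(f);
	if (!f->cur)
		return f->signaled;
	return toy_fence_is_signaled(f->cur) &&
	       toy_fence_is_signaled(f->prev);
}

/* Chain a new buffer's fence behind the previous buffer's fence. */
struct toy_fence toy_fence_chain(const struct toy_fence *cur,
				 const struct toy_fence *prev)
{
	struct toy_fence arr = { false, cur, prev };

	assert(cur && prev);
	return arr;
}
```

Chaining A, then B behind A, then C behind that pair means C's fence stays
unsignaled for as long as A is pending, even if B and C signaled long ago.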

Alternatively, we can get rid of this code and deal directly with
ordering instead of relying on fences for that.

Gustavo


* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 17:40     ` Gustavo Padovan
@ 2017-11-17 17:50       ` Gustavo Padovan
  2017-11-18  9:30       ` Hans Verkuil
  1 sibling, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-17 17:50 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: linux-media, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

2017-11-17 Gustavo Padovan <gustavo@padovan.org>:

> 2017-11-17 Hans Verkuil <hverkuil@xs4all.nl>:
> 
> > On 15/11/17 18:10, Gustavo Padovan wrote:
> > > From: Gustavo Padovan <gustavo.padovan@collabora.com>
> > > 
> > > Receive in-fence from userspace and add support for waiting on them
> > > before queueing the buffer to the driver. Buffers can't be queued to the
> > > driver before its fences signal. And a buffer can't be queue to the driver
> > > out of the order they were queued from userspace. That means that even if
> > > it fence signal it must wait all other buffers, ahead of it in the queue,
> > > to signal first.
> > > 
> > > To make that possible we use fence_array to keep that ordering. Basically
> > > we create a fence_array that contains both the current fence and the fence
> > > from the previous buffer (which might be a fence array as well). The base
> > > fence class for the fence_array becomes the new buffer fence, waiting on
> > > that one guarantees that it won't be queued out of order.
> > > 
> > > v6:
> > > 	- With fences always keep the order userspace queues the buffers.
> > > 	- Protect in_fence manipulation with a lock (Brian Starkey)
> > > 	- check if fences have the same context before adding a fence array
> > > 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
> > > 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
> > > 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
> > > 
> > > v5:	- use fence_array to keep buffers ordered in vb2 core when
> > > 	needed (Brian Starkey)
> > > 	- keep backward compat on the reserved2 field (Brian Starkey)
> > > 	- protect fence callback removal with lock (Brian Starkey)
> > > 
> > > v4:
> > > 	- Add a comment about dma_fence_add_callback() not returning a
> > > 	error (Hans)
> > > 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
> > > 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
> > > 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
> > > 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
> > > 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
> > > 	vb2_start_streaming() (Hans)
> > > 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
> > > 	- Queue buffers to the driver as soon as they are ready (Hans)
> > > 	- call fill_user_buffer() after queuing the buffer (Hans)
> > > 	- add err: label to clean up fence
> > > 	- add dma_fence_wait() before calling vb2_start_streaming()
> > > 
> > > v3:	- document fence parameter
> > > 	- remove ternary if at vb2_qbuf() return (Mauro)
> > > 	- do not change if conditions behaviour (Mauro)
> > > 
> > > v2:
> > > 	- fix vb2_queue_or_prepare_buf() ret check
> > > 	- remove check for VB2_MEMORY_DMABUF only (Javier)
> > > 	- check num of ready buffers to start streaming
> > > 	- when queueing, start from the first ready buffer
> > > 	- handle queue cancel
> > > 
> > > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
> > > ---
> > >  drivers/media/v4l2-core/Kconfig          |   1 +
> > >  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
> > >  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
> > >  include/media/videobuf2-core.h           |  17 ++-
> > >  4 files changed, 231 insertions(+), 18 deletions(-)
> > > 
> > > diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
> > > index a35c33686abf..3f988c407c80 100644
> > > --- a/drivers/media/v4l2-core/Kconfig
> > > +++ b/drivers/media/v4l2-core/Kconfig
> > > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
> > >  # Used by drivers that need Videobuf2 modules
> > >  config VIDEOBUF2_CORE
> > >  	select DMA_SHARED_BUFFER
> > > +	select SYNC_FILE
> > >  	tristate
> > >  
> > >  config VIDEOBUF2_MEMOPS
> > > diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> > > index 60f8b582396a..26de4c80717d 100644
> > > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > > @@ -23,6 +23,7 @@
> > >  #include <linux/sched.h>
> > >  #include <linux/freezer.h>
> > >  #include <linux/kthread.h>
> > > +#include <linux/dma-fence-array.h>
> > >  
> > >  #include <media/videobuf2-core.h>
> > >  #include <media/v4l2-mc.h>
> > > @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
> > >  		vb->index = q->num_buffers + buffer;
> > >  		vb->type = q->type;
> > >  		vb->memory = memory;
> > > +		spin_lock_init(&vb->fence_cb_lock);
> > >  		for (plane = 0; plane < num_planes; ++plane) {
> > >  			vb->planes[plane].length = plane_sizes[plane];
> > >  			vb->planes[plane].min_length = plane_sizes[plane];
> > > @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
> > >  {
> > >  	struct vb2_queue *q = vb->vb2_queue;
> > >  
> > > +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
> > > +		return;
> > > +
> > >  	vb->state = VB2_BUF_STATE_ACTIVE;
> > >  	atomic_inc(&q->owned_by_drv_count);
> > >  
> > > @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
> > >  	return 0;
> > >  }
> > >  
> > > +static int __get_num_ready_buffers(struct vb2_queue *q)
> > > +{
> > > +	struct vb2_buffer *vb;
> > > +	int ready_count = 0;
> > > +	unsigned long flags;
> > > +
> > > +	/* count num of buffers ready in front of the queued_list */
> > > +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > > +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
> > > +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
> > > +			ready_count++;
> > 
> > Shouldn't there be a:
> > 
> > 		else
> > 			break;
> > 
> > here? You're counting the number of available (i.e. no fence or signaled
> > fence) buffers at the start of the queued buffer list.
> 
> Sure.

Actually, with the fence_array machinery the fences can't signal out of
order, so the 'else break' isn't strictly necessary here.
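
To make the difference between the two counting strategies concrete,
here is a minimal userspace sketch. It is not the vb2 code: struct
model_buf and both helpers are hypothetical names made up for this
illustration, and a fence is reduced to two booleans. When fences signal
strictly in queue order the two counts agree; they only differ if a
later buffer becomes ready before an earlier one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued buffer: only the fence state matters. */
struct model_buf {
	bool has_fence;	/* buffer was queued with an in-fence */
	bool signaled;	/* that in-fence has already signaled */
};

/* A buffer is ready when it has no fence or its fence has signaled. */
bool model_buf_ready(const struct model_buf *b)
{
	return !b->has_fence || b->signaled;
}

/* Count every ready buffer in the list (no 'else break'). */
size_t count_ready_any(const struct model_buf *bufs, size_t n)
{
	size_t i, ready = 0;

	for (i = 0; i < n; i++)
		if (model_buf_ready(&bufs[i]))
			ready++;
	return ready;
}

/* Count only the contiguous run of ready buffers at the head of the list. */
size_t count_ready_prefix(const struct model_buf *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!model_buf_ready(&bufs[i]))
			break;
	return i;
}
```

With the prefix variant the result never depends on whether the fences
happen to signal in order, which is why it stays correct even if the
fence_array chaining is later removed.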

Gustavo

* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 17:40     ` Gustavo Padovan
  2017-11-17 17:50       ` Gustavo Padovan
@ 2017-11-18  9:30       ` Hans Verkuil
  1 sibling, 0 replies; 48+ messages in thread
From: Hans Verkuil @ 2017-11-18  9:30 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Mauro Carvalho Chehab, Shuah Khan, Pawel Osciak,
	Alexandre Courbot, Sakari Ailus, Brian Starkey, Thierry Escande,
	linux-kernel, Gustavo Padovan

On 17/11/17 18:40, Gustavo Padovan wrote:
> 2017-11-17 Hans Verkuil <hverkuil@xs4all.nl>:
> 
>> On 15/11/17 18:10, Gustavo Padovan wrote:
>>> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>>>
>>> Receive in-fences from userspace and add support for waiting on them
>>> before queueing the buffer to the driver. Buffers can't be queued to the
>>> driver before their fences signal. And a buffer can't be queued to the
>>> driver out of the order in which it was queued from userspace. That means
>>> that even if its fence signals, it must wait for all other buffers ahead
>>> of it in the queue to signal first.
>>>
>>> To make that possible we use fence_array to keep that ordering. Basically
>>> we create a fence_array that contains both the current fence and the fence
>>> from the previous buffer (which might be a fence array as well). The base
>>> fence class for the fence_array becomes the new buffer fence, waiting on
>>> that one guarantees that it won't be queued out of order.
>>>
>>> v6:
>>> 	- With fences always keep the order userspace queues the buffers.
>>> 	- Protect in_fence manipulation with a lock (Brian Starkey)
>>> 	- check if fences have the same context before adding a fence array
>>> 	- Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
>>> 	- Clean up fence if __set_in_fence() fails (Brian Starkey)
>>> 	- treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
>>>
>>> v5:	- use fence_array to keep buffers ordered in vb2 core when
>>> 	needed (Brian Starkey)
>>> 	- keep backward compat on the reserved2 field (Brian Starkey)
>>> 	- protect fence callback removal with lock (Brian Starkey)
>>>
>>> v4:
>>> 	- Add a comment about dma_fence_add_callback() not returning a
>>> 	error (Hans)
>>> 	- Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
>>> 	- select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
>>> 	- Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
>>> 	- Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
>>> 	-  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
>>> 	vb2_start_streaming() (Hans)
>>> 	- set IN_FENCE flags on __fill_v4l2_buffer (Hans)
>>> 	- Queue buffers to the driver as soon as they are ready (Hans)
>>> 	- call fill_user_buffer() after queuing the buffer (Hans)
>>> 	- add err: label to clean up fence
>>> 	- add dma_fence_wait() before calling vb2_start_streaming()
>>>
>>> v3:	- document fence parameter
>>> 	- remove ternary if at vb2_qbuf() return (Mauro)
>>> 	- do not change if conditions behaviour (Mauro)
>>>
>>> v2:
>>> 	- fix vb2_queue_or_prepare_buf() ret check
>>> 	- remove check for VB2_MEMORY_DMABUF only (Javier)
>>> 	- check num of ready buffers to start streaming
>>> 	- when queueing, start from the first ready buffer
>>> 	- handle queue cancel
>>>
>>> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
>>> ---
>>>  drivers/media/v4l2-core/Kconfig          |   1 +
>>>  drivers/media/v4l2-core/videobuf2-core.c | 202 ++++++++++++++++++++++++++++---
>>>  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
>>>  include/media/videobuf2-core.h           |  17 ++-
>>>  4 files changed, 231 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig
>>> index a35c33686abf..3f988c407c80 100644
>>> --- a/drivers/media/v4l2-core/Kconfig
>>> +++ b/drivers/media/v4l2-core/Kconfig
>>> @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
>>>  # Used by drivers that need Videobuf2 modules
>>>  config VIDEOBUF2_CORE
>>>  	select DMA_SHARED_BUFFER
>>> +	select SYNC_FILE
>>>  	tristate
>>>  
>>>  config VIDEOBUF2_MEMOPS
>>> diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
>>> index 60f8b582396a..26de4c80717d 100644
>>> --- a/drivers/media/v4l2-core/videobuf2-core.c
>>> +++ b/drivers/media/v4l2-core/videobuf2-core.c
>>> @@ -23,6 +23,7 @@
>>>  #include <linux/sched.h>
>>>  #include <linux/freezer.h>
>>>  #include <linux/kthread.h>
>>> +#include <linux/dma-fence-array.h>
>>>  
>>>  #include <media/videobuf2-core.h>
>>>  #include <media/v4l2-mc.h>
>>> @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>>>  		vb->index = q->num_buffers + buffer;
>>>  		vb->type = q->type;
>>>  		vb->memory = memory;
>>> +		spin_lock_init(&vb->fence_cb_lock);
>>>  		for (plane = 0; plane < num_planes; ++plane) {
>>>  			vb->planes[plane].length = plane_sizes[plane];
>>>  			vb->planes[plane].min_length = plane_sizes[plane];
>>> @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>>>  {
>>>  	struct vb2_queue *q = vb->vb2_queue;
>>>  
>>> +	if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
>>> +		return;
>>> +
>>>  	vb->state = VB2_BUF_STATE_ACTIVE;
>>>  	atomic_inc(&q->owned_by_drv_count);
>>>  
>>> @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
>>>  	return 0;
>>>  }
>>>  
>>> +static int __get_num_ready_buffers(struct vb2_queue *q)
>>> +{
>>> +	struct vb2_buffer *vb;
>>> +	int ready_count = 0;
>>> +	unsigned long flags;
>>> +
>>> +	/* count num of buffers ready in front of the queued_list */
>>> +	list_for_each_entry(vb, &q->queued_list, queued_entry) {
>>> +		spin_lock_irqsave(&vb->fence_cb_lock, flags);
>>> +		if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
>>> +			ready_count++;
>>
>> Shouldn't there be a:
>>
>> 		else
>> 			break;
>>
>> here? You're counting the number of available (i.e. no fence or signaled
>> fence) buffers at the start of the queued buffer list.
> 
> Sure.
> 
>>
>>> +		spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>>> +	}
>>> +
>>> +	return ready_count;
>>> +}
>>> +
>>>  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>>>  {
>>>  	struct vb2_buffer *vb;
>>> @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
>>>  	return ret;
>>>  }
>>>  
>>> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>>> +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
>>> +					struct vb2_buffer *vb,
>>> +					struct dma_fence *fence)
>>> +{
>>> +	if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
>>> +		dma_fence_put(q->last_fence);
>>> +		q->last_fence = NULL;
>>> +	}
>>> +
>>> +	/*
>>> +	 * We always guarantee the ordering of buffers queued from
>>> +	 * userspace to be the same it is queued to the driver. For that
>>> +	 * we create a fence array with the fence from the last queued
>>> +	 * buffer and this one, that way the fence for this buffer can't
>>> +	 * signal before the last one.
>>> +	 */
>>> +	if (fence && q->last_fence) {
>>> +		struct dma_fence **fences;
>>> +		struct dma_fence_array *arr;
>>> +
>>> +		if (fence->context == q->last_fence->context) {
>>> +			if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
>>> +				dma_fence_put(q->last_fence);
>>> +				q->last_fence = dma_fence_get(fence);
>>> +			} else {
>>> +				dma_fence_put(fence);
>>> +				fence = dma_fence_get(q->last_fence);
>>> +			}
>>> +			return fence;
>>> +		}
>>> +
>>> +		fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
>>> +		if (!fences)
>>> +			return ERR_PTR(-ENOMEM);
>>> +
>>> +		fences[0] = fence;
>>> +		fences[1] = q->last_fence;
>>> +
>>> +		arr = dma_fence_array_create(2, fences,
>>> +					     dma_fence_context_alloc(1),
>>> +					     1, false);
>>> +		if (!arr) {
>>> +			kfree(fences);
>>> +			return ERR_PTR(-ENOMEM);
>>> +		}
>>> +
>>> +		fence = &arr->base;
>>> +
>>> +		q->last_fence = dma_fence_get(fence);
>>> +	} else if (!fence && q->last_fence) {
>>> +		fence = dma_fence_get(q->last_fence);
>>> +	}
>>
>> I have absolutely no idea what the purpose is of this code.
>>
>> The application is queuing buffers with in-fences. Why would it matter in what
>> order the in-fences trigger? All vb2 needs is the minimum number of usable
>> (no fence or signaled fence) buffers counted from the start of the queued list
>> (i.e. a contiguous set of buffers with no unsignaled fences in between).
>>
>> When that's reached it can start streaming.
>>
>> When streaming you cannot enqueue buffers to the driver as long as the first
>> buffer in the queued list is still waiting on a signal. Once that arrives the
>> fence callback can enqueue all buffers from the start of the queued list until
>> it reaches the next unsignaled fence.
>>
>> At least, that's the mechanism I have in my head and I think that's what users
>> expect.
>>
>> It looks like you allow for buffers to be queued to the driver out-of-order,
>> but you shouldn't do this.
>>
>> If userspace queues buffers A, B, C and D in that order, then that's also the
>> order they should be queued to the driver. Especially with the upcoming support
>> for the request API keeping the sequence intact is crucial.
> 
> And that is exactly what this code is for: it keeps the ordering using
> fence magic. We append the fence of the previous buffer to the new one;
> that is called a fence_array. The fence_array only signals after both
> fences signal. That is enough to make sure it won't signal before any
> previously queued fence.
> 
> Alternatively we can get rid of this code and deal directly with
> ordering instead of relying on fences for that.

That sounds way better. We have the ordering already (i.e. the order in
which buffers are queued to the queued_list) so this code only complicates
matters.

The logic in the fence callback is the same as for the qbuf code:

If not streaming, then check if there are now sufficient buffers ready. If
not, return; if there are, then queue them to the driver and start streaming.

If streaming, then queue all available (no fence or the fence signaled) buffers
from the start of the queued_list until you reach a buffer that is still waiting
for a fence.

It's simple and consistent. It might make sense to put this in a function that's
called by both qbuf and the fence callback. It depends on the code if that
makes sense.
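
The logic above can be sketched as one helper that both the qbuf path
and the fence callback would call. This is only a userspace model under
stated assumptions: struct model_vb, struct model_queue and
model_queue_ready_buffers() are hypothetical names, fences are plain
booleans, and "queueing to the driver" is reduced to setting a flag.
None of it is taken from the vb2 sources.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_BUFS 8

/* Hypothetical model of one queued buffer. */
struct model_vb {
	bool has_fence;	/* queued with an in-fence */
	bool signaled;	/* the in-fence has signaled */
	bool in_driver;	/* already handed to the driver */
};

/* Hypothetical model of the queue; bufs[] is kept in QBUF order. */
struct model_queue {
	struct model_vb bufs[MODEL_MAX_BUFS];
	size_t num_bufs;
	size_t next;			/* first buffer not yet given to the driver */
	size_t min_buffers_needed;
	bool streaming;
};

bool model_vb_ready(const struct model_vb *vb)
{
	return !vb->has_fence || vb->signaled;
}

/*
 * Shared by the (modelled) qbuf path and fence callback.
 * Returns how many buffers were handed to the driver by this call.
 */
size_t model_queue_ready_buffers(struct model_queue *q)
{
	size_t end = q->next, handed = 0;

	/* Find the end of the contiguous ready run at the head of the list. */
	while (end < q->num_bufs && model_vb_ready(&q->bufs[end]))
		end++;

	if (!q->streaming) {
		/* Not enough ready buffers yet: do nothing and wait. */
		if (end - q->next < q->min_buffers_needed)
			return 0;
		q->streaming = true;
	}

	/* Hand over buffers in queue order, stopping at the first waiter. */
	while (q->next < end) {
		q->bufs[q->next++].in_driver = true;
		handed++;
	}
	return handed;
}
```

Because the helper always stops at the first buffer whose fence is still
pending, the driver sees buffers in QBUF order without any fence
chaining, and no caller has to block.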

Regards,

	Hans

* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:01     ` Gustavo Padovan
@ 2017-11-20  2:53       ` Alexandre Courbot
  0 siblings, 0 replies; 48+ messages in thread
From: Alexandre Courbot @ 2017-11-20  2:53 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: Linux Media Mailing List, Hans Verkuil, Mauro Carvalho Chehab,
	Shuah Khan, Pawel Osciak, Sakari Ailus, Brian Starkey,
	Thierry Escande, linux-kernel, Gustavo Padovan

On Fri, Nov 17, 2017 at 10:01 PM, Gustavo Padovan <gustavo@padovan.org> wrote:
> 2017-11-17 Alexandre Courbot <acourbot@chromium.org>:
>
>> Hi Gustavo,
>>
>> I am coming a bit late in this series' review, so apologies if some of my
>> comments have already been discussed in an earlier revision.
>>
>> On Thursday, November 16, 2017 2:10:53 AM JST, Gustavo Padovan wrote:
>> > From: Gustavo Padovan <gustavo.padovan@collabora.com>
>> >
>> > Receive in-fences from userspace and add support for waiting on them
>> > before queueing the buffer to the driver. Buffers can't be queued to the
>> > driver before their fences signal. And a buffer can't be queued to the
>> > driver out of the order in which it was queued from userspace. That means
>> > that even if its fence signals, it must wait for all other buffers ahead
>> > of it in the queue to signal first.
>> >
>> > To make that possible we use fence_array to keep that ordering. Basically
>> > we create a fence_array that contains both the current fence and the fence
>> > from the previous buffer (which might be a fence array as well). The base
>> > fence class for the fence_array becomes the new buffer fence, waiting on
>> > that one guarantees that it won't be queued out of order.
>> >
>> > v6:
>> >     - With fences always keep the order userspace queues the buffers.
>> >     - Protect in_fence manipulation with a lock (Brian Starkey)
>> >     - check if fences have the same context before adding a fence array
>> >     - Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
>> >     - Clean up fence if __set_in_fence() fails (Brian Starkey)
>> >     - treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
>> >
>> > v5: - use fence_array to keep buffers ordered in vb2 core when
>> >     needed (Brian Starkey)
>> >     - keep backward compat on the reserved2 field (Brian Starkey)
>> >     - protect fence callback removal with lock (Brian Starkey)
>> >
>> > v4:
>> >     - Add a comment about dma_fence_add_callback() not returning a
>> >     error (Hans)
>> >     - Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
>> >     - select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
>> >     - Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
>> >     - Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
>> >     -  Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
>> >     vb2_start_streaming() (Hans)
>> >     - set IN_FENCE flags on __fill_v4l2_buffer (Hans)
>> >     - Queue buffers to the driver as soon as they are ready (Hans)
>> >     - call fill_user_buffer() after queuing the buffer (Hans)
>> >     - add err: label to clean up fence
>> >     - add dma_fence_wait() before calling vb2_start_streaming()
>> >
>> > v3: - document fence parameter
>> >     - remove ternary if at vb2_qbuf() return (Mauro)
>> >     - do not change if conditions behaviour (Mauro)
>> >
>> > v2:
>> >     - fix vb2_queue_or_prepare_buf() ret check
>> >     - remove check for VB2_MEMORY_DMABUF only (Javier)
>> >     - check num of ready buffers to start streaming
>> >     - when queueing, start from the first ready buffer
>> >     - handle queue cancel
>> >
>> > Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
>> > ---
>> >  drivers/media/v4l2-core/Kconfig          |   1 +
>> >  drivers/media/v4l2-core/videobuf2-core.c | 202
>> > ++++++++++++++++++++++++++++---
>> >  drivers/media/v4l2-core/videobuf2-v4l2.c |  29 ++++-
>> >  include/media/videobuf2-core.h           |  17 ++-
>> >  4 files changed, 231 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/media/v4l2-core/Kconfig
>> > b/drivers/media/v4l2-core/Kconfig
>> > index a35c33686abf..3f988c407c80 100644
>> > --- a/drivers/media/v4l2-core/Kconfig
>> > +++ b/drivers/media/v4l2-core/Kconfig
>> > @@ -83,6 +83,7 @@ config VIDEOBUF_DVB
>> >  # Used by drivers that need Videobuf2 modules
>> >  config VIDEOBUF2_CORE
>> >     select DMA_SHARED_BUFFER
>> > +   select SYNC_FILE
>> >     tristate
>> >  config VIDEOBUF2_MEMOPS
>> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c
>> > b/drivers/media/v4l2-core/videobuf2-core.c
>> > index 60f8b582396a..26de4c80717d 100644
>> > --- a/drivers/media/v4l2-core/videobuf2-core.c
>> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
>> > @@ -23,6 +23,7 @@
>> >  #include <linux/sched.h>
>> >  #include <linux/freezer.h>
>> >  #include <linux/kthread.h>
>> > +#include <linux/dma-fence-array.h>
>> >  #include <media/videobuf2-core.h>
>> >  #include <media/v4l2-mc.h>
>> > @@ -346,6 +347,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q,
>> > enum vb2_memory memory,
>> >             vb->index = q->num_buffers + buffer;
>> >             vb->type = q->type;
>> >             vb->memory = memory;
>> > +           spin_lock_init(&vb->fence_cb_lock);
>> >             for (plane = 0; plane < num_planes; ++plane) {
>> >                     vb->planes[plane].length = plane_sizes[plane];
>> >                     vb->planes[plane].min_length = plane_sizes[plane];
>> > @@ -1222,6 +1224,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>> >  {
>> >     struct vb2_queue *q = vb->vb2_queue;
>> > +   if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
>> > +           return;
>> > +
>> >     vb->state = VB2_BUF_STATE_ACTIVE;
>> >     atomic_inc(&q->owned_by_drv_count);
>> >   @@ -1273,6 +1278,23 @@ static int __buf_prepare(struct vb2_buffer *vb,
>> > const void *pb)
>> >     return 0;
>> >  }
>> > +static int __get_num_ready_buffers(struct vb2_queue *q)
>> > +{
>> > +   struct vb2_buffer *vb;
>> > +   int ready_count = 0;
>> > +   unsigned long flags;
>> > +
>> > +   /* count num of buffers ready in front of the queued_list */
>> > +   list_for_each_entry(vb, &q->queued_list, queued_entry) {
>> > +           spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> > +           if (!vb->in_fence || dma_fence_is_signaled(vb->in_fence))
>> > +                   ready_count++;
>> > +           spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> > +   }
>> > +
>> > +   return ready_count;
>> > +}
>> > +
>> >  int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>> >  {
>> >     struct vb2_buffer *vb;
>> > @@ -1361,9 +1383,87 @@ static int vb2_start_streaming(struct vb2_queue *q)
>> >     return ret;
>> >  }
>> > -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> > +static struct dma_fence *__set_in_fence(struct vb2_queue *q,
>> > +                                   struct vb2_buffer *vb,
>> > +                                   struct dma_fence *fence)
>> > +{
>> > +   if (q->last_fence && dma_fence_is_signaled(q->last_fence)) {
>> > +           dma_fence_put(q->last_fence);
>> > +           q->last_fence = NULL;
>> > +   }
>> > +
>> > +   /*
>> > +    * We always guarantee the ordering of buffers queued from
>> > +    * userspace to be the same it is queued to the driver. For that
>> > +    * we create a fence array with the fence from the last queued
>> > +    * buffer and this one, that way the fence for this buffer can't
>> > +    * signal before the last one.
>> > +    */
>> > +   if (fence && q->last_fence) {
>> > +           struct dma_fence **fences;
>> > +           struct dma_fence_array *arr;
>> > +
>> > +           if (fence->context == q->last_fence->context) {
>> > +                   if (fence->seqno - q->last_fence->seqno <= INT_MAX) {
>> > +                           dma_fence_put(q->last_fence);
>> > +                           q->last_fence = dma_fence_get(fence);
>> > +                   } else {
>> > +                           dma_fence_put(fence);
>> > +                           fence = dma_fence_get(q->last_fence);
>> > +                   }
>> > +                   return fence;
>> > +           }
>> > +
>> > +           fences = kcalloc(2, sizeof(*fences), GFP_KERNEL);
>> > +           if (!fences)
>> > +                   return ERR_PTR(-ENOMEM);
>> > +
>> > +           fences[0] = fence;
>> > +           fences[1] = q->last_fence;
>> > +
>> > +           arr = dma_fence_array_create(2, fences,
>> > +                                        dma_fence_context_alloc(1),
>> > +                                        1, false);
>> > +           if (!arr) {
>> > +                   kfree(fences);
>> > +                   return ERR_PTR(-ENOMEM);
>> > +           }
>> > +
>> > +           fence = &arr->base;
>> > +
>> > +           q->last_fence = dma_fence_get(fence);
>> > +   } else if (!fence && q->last_fence) {
>> > +           fence = dma_fence_get(q->last_fence);
>> > +   }
>> > +
>> > +   return fence;
>> > +}
>> > +
>> > +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>> > +{
>> > +   struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
>> > +   struct vb2_queue *q = vb->vb2_queue;
>> > +   unsigned long flags;
>> > +
>> > +   spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> > +   if (!vb->in_fence) {
>> > +           spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> > +           return;
>> > +   }
>>
>> Is this block necessary? IIUC the callback will never be set on buffers
>> without an input fence, so (!vb->in_fence) should never be satisfied.
>
> Not anymore! I added it when trying to fix the potential race condition
> in the wrong way, but the newly added spinlock fixes that.
>
>>
>> > +
>> > +   dma_fence_put(vb->in_fence);
>> > +   vb->in_fence = NULL;
>> > +
>> > +   if (q->start_streaming_called)
>> > +           __enqueue_in_driver(vb);
>> > +   spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> > +}
>> > +
>> > +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
>> > +             struct dma_fence *fence)
>> >  {
>> >     struct vb2_buffer *vb;
>> > +   unsigned long flags;
>> >     int ret;
>> >     vb = q->bufs[index];
>> > @@ -1372,16 +1472,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned
>> > int index, void *pb)
>> >     case VB2_BUF_STATE_DEQUEUED:
>> >             ret = __buf_prepare(vb, pb);
>> >             if (ret)
>> > -                   return ret;
>> > +                   goto err;
>> >             break;
>> >     case VB2_BUF_STATE_PREPARED:
>> >             break;
>> >     case VB2_BUF_STATE_PREPARING:
>> >             dprintk(1, "buffer still being prepared\n");
>> > -           return -EINVAL;
>> > +           ret = -EINVAL;
>> > +           goto err;
>> >     default:
>> >             dprintk(1, "invalid buffer state %d\n", vb->state);
>> > -           return -EINVAL;
>> > +           ret = -EINVAL;
>> > +           goto err;
>> >     }
>> >     /*
>> > @@ -1398,30 +1500,83 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned
>> > int index, void *pb)
>> >     trace_vb2_qbuf(q, vb);
>> > +   vb->in_fence = __set_in_fence(q, vb, fence);
>> > +   if (IS_ERR(vb->in_fence)) {
>> > +           dma_fence_put(fence);
>> > +           ret = PTR_ERR(vb->in_fence);
>> > +           goto err;
>> > +   }
>> > +   fence = NULL;
>> > +
>> > +   /*
>> > +    * If it is time to call vb2_start_streaming() wait for the fence
>> > +    * to signal first. Of course, this happens only once per streaming.
>> > +    * We want to run any step that might fail before we set the callback
>> > +    * to queue the fence when it signals.
>> > +    */
>> > +   if (vb->in_fence && !q->start_streaming_called &&
>> > +       __get_num_ready_buffers(q) == q->min_buffers_needed - 1)
>> > +           dma_fence_wait(vb->in_fence, true);
>>
>> Mmm, that's a tough call. Userspace may unexpectedly block due to this
>> (which fences are supposed to prevent), not sure how much of a problem
>> this may be.
>
> Yes, but it happens just once per stream, when we are about to start it.
> Alternatively, we can wait later, and if the fence wait returns an error
> we can carry it until we are able to send the buffer back with error on
> DQBUF.

I'm afraid that having this wait happening here, and only at that
time, would be quite confusing for user-space. It may go unnoticed for
a while and then manifest itself, causing the UI to block unexpectedly
and resulting in frustrating hours of debugging user programs for what
happens to be "normal" kernel behavior. It would probably be for the
best if this could be moved somewhere else so user-space gets a more
consistent behavior.

* Re: [RFC v5 00/11] V4L2 Explicit Synchronization
  2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
                   ` (10 preceding siblings ...)
  2017-11-15 17:10 ` [RFC v5 11/11] [media] v4l: Document explicit synchronization behavior Gustavo Padovan
@ 2017-11-20 10:19 ` Smitha T Murthy
  2017-11-30 18:53   ` Gustavo Padovan
  11 siblings, 1 reply; 48+ messages in thread
From: Smitha T Murthy @ 2017-11-20 10:19 UTC (permalink / raw)
  To: Gustavo Padovan
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Alexandre Courbot, Sakari Ailus, Brian Starkey,
	Thierry Escande, linux-kernel, Gustavo Padovan

Hi Gustavo,

I am currently looking at your implementation of explicit
synchronisation. For that I need your test app, but I am unable
to download it from the link provided:
“https://gitlab.collabora.com/padovan/v4l2-fences-test”

Could you please help me out with this?

Regards
Smitha

On Wednesday, November 15, 2017, Gustavo Padovan <gustavo@padovan.org> wrote:
>
> From: Gustavo Padovan <gustavo.padovan@collabora.com>
>
> Hi,
>
> After the comments received in the last patchset[1] and
> during the media summit [2] here is the new and improved version
> of the patchset. The implementation is simpler, smaller and covers
> a lot more cases.
>
> If you look to the last patchset I got rid of a few things, the main
> one is the OUT_FENCE event, one thing that we decided in Prague was
> that, when using fences, we would keep ordering of all buffers queued
> to vb2. That means they would be queued to the drivers in the same order
> that the QBUF calls happen, just like it already happens when not using
> fences. Fences can signal in whatever order, so we need this guarantee
> here. Drivers, however, are free to process the buffers out of
> order.
>
> But there is one conclusion of that that we didn't reach at the
> summit, maybe because of the order we discussed things, and that is: we do
> not need the OUT_FENCE event anymore, because now at the QBUF call time
> we *always* know the order in which the buffers will be queued to the
> v4l2 driver. So the out-fence fd is now returned using the fence_fd
> field as a return argument, thus the event is not necessary anymore.
>
> The fence_fd field is now used to communicate both in-fences and
> out-fences, just like we do for GPU drivers. We pass in-fences as input
> arguments and get out-fences as return arguments on the QBUF call.
> The approach is documented.
>
> I also added a capability flag, V4L2_CAP_ORDERED, to tell userspace if
> the v4l2 drivers keep the buffers ordered or not.
>
> We still have the 'ordered_in_driver' property for queues, but its
> meaning has changed. When set videobuf2 will know that the driver can
> keep the order of the buffers, thus videobuf2 can use the same fence
> context for all out-fences. Fences inside the same context should signal
> in order, so 'ordered_in_driver' is an optimization for that case.
> When not set, a context for each out-fence is created.
>
> So now explicit synchronization also works for drivers that do not keep
> the ordering of buffers.
>
> Another thing is that we do not allow videobuf2 to requeue buffers
> internally when using fences: they have fences associated with them and
> we need to finish the job on them, i.e., signal the fence, even if an
> error happened.
>
> The rest of the changes are documented separately in each patch.
>
> There is a test app at:
>
> https://gitlab.collabora.com/padovan/v4l2-fences-test
>
> Among my next steps is to create a v4l2->drm test app using fences as a
> PoC, and also look into how to support it in ChromeOS.
>
> Open Questions
> --------------
>
> * Do drivers reorder buffers internally? How to handle that with fences?
>
> * How to handle audio/video synchronization? Fences aren't enough, we need
>   to know things like the start of capture timestamp.
>
> Regards,
>
> Gustavo
> --
>
> [1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1518928.html
> [2] http://muistio.tieke.fi/p/media-summit-2017
>
> Gustavo Padovan (10):
>   [media] v4l: add V4L2_CAP_ORDERED to the uapi
>   [media] vivid: add the V4L2_CAP_ORDERED capability
>   [media] vb2: add 'ordered_in_driver' property to queues
>   [media] vivid: mark vivid queues as ordered_in_driver
>   [media] vb2: check earlier if stream can be started
>   [media] vb2: add explicit fence user API
>   [media] vb2: add in-fence support to QBUF
>   [media] vb2: add infrastructure to support out-fences
>   [media] vb2: add out-fence support to QBUF
>   [media] v4l: Document explicit synchronization behavior
>
> Javier Martinez Canillas (1):
>   [media] vb2: add videobuf2 dma-buf fence helpers
>
>  Documentation/media/uapi/v4l/buffer.rst          |  15 ++
>  Documentation/media/uapi/v4l/vidioc-qbuf.rst     |  42 +++-
>  Documentation/media/uapi/v4l/vidioc-querybuf.rst |   9 +-
>  Documentation/media/uapi/v4l/vidioc-querycap.rst |   3 +
>  drivers/media/platform/vivid/vivid-core.c        |  24 +-
>  drivers/media/usb/cpia2/cpia2_v4l.c              |   2 +-
>  drivers/media/v4l2-core/Kconfig                  |   1 +
>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c    |   4 +-
>  drivers/media/v4l2-core/videobuf2-core.c         | 274 +++++++++++++++++++++--
>  drivers/media/v4l2-core/videobuf2-v4l2.c         |  48 +++-
>  include/media/videobuf2-core.h                   |  44 +++-
>  include/media/videobuf2-fence.h                  |  48 ++++
>  include/uapi/linux/videodev2.h                   |   8 +-
>  13 files changed, 485 insertions(+), 37 deletions(-)
>  create mode 100644 include/media/videobuf2-fence.h
>
> --
> 2.13.6
>

* Re: [RFC v5 07/11] [media] vb2: add in-fence support to QBUF
  2017-11-17 13:19         ` Mauro Carvalho Chehab
@ 2017-11-20 11:41           ` Brian Starkey
  0 siblings, 0 replies; 48+ messages in thread
From: Brian Starkey @ 2017-11-20 11:41 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Gustavo Padovan, Alexandre Courbot, linux-media, Hans Verkuil,
	Shuah Khan, Pawel Osciak, Sakari Ailus, Thierry Escande,
	linux-kernel, Gustavo Padovan

On Fri, Nov 17, 2017 at 11:19:05AM -0200, Mauro Carvalho Chehab wrote:
>Em Fri, 17 Nov 2017 11:08:01 -0200
>Gustavo Padovan <gustavo@padovan.org> escreveu:
>
>> 2017-11-17 Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
>>
>> > Em Fri, 17 Nov 2017 15:49:23 +0900
>> > Alexandre Courbot <acourbot@chromium.org> escreveu:
>> >
>> > > > @@ -178,6 +179,12 @@ static int vb2_queue_or_prepare_buf(struct
>> > > > vb2_queue *q, struct v4l2_buffer *b,
>> > > >  		return -EINVAL;
>> > > >  	}
>> > > >
>> > > > +	if ((b->fence_fd != 0 && b->fence_fd != -1) &&
>> > >
>> > > Why do we need to consider both values invalid? Can 0 ever be a valid fence
>> > > fd?
>> >
>> > Programs that don't use fences will initialize reserved2/fence_fd field
>> > at the uAPI call to zero.
>> >
>> > So, I guess using fd=0 here could be a problem. Anyway, I would, instead,
>> > do:
>> >
>> > 	if ((b->fence_fd < 1) &&
>> > 		...
>> >
>> > as other negative values are likely invalid as well.
>>
>> We are checking when the fence_fd is set but the flag wasn't. Checking
>> for < 1 is exactly the opposite. so we keep as is or do it fence_fd > 0.
>
>Ah, yes. Anyway, I would stick with:
>	if ((b->fence_fd > 0) &&
>		...
>

0 is a valid fence_fd right? If I close stdin, and create a sync_file,
couldn't I get a fence with fd zero?
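
The scenario can be checked from userspace with a small sketch. It
relies on the POSIX rule that open() returns the lowest-numbered unused
descriptor, and it uses /dev/null as a stand-in for a sync_file, on the
assumption that sync_file descriptors are allocated the same way. The
function name is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Close stdin, open a new file and report which descriptor number was
 * handed out. stdin is restored before returning when it was open.
 * Returns the descriptor number that open() produced, or -1 on error.
 */
int fd_allocated_after_closing_stdin(void)
{
	int saved_stdin = dup(STDIN_FILENO);	/* -1 if stdin is already closed */
	int fd;

	close(STDIN_FILENO);
	fd = open("/dev/null", O_RDONLY);	/* stand-in for a new sync_file fd */

	if (fd > 0)
		close(fd);			/* not expected; avoid leaking it */

	if (saved_stdin >= 0) {
		/* dup2() puts the original stdin back on descriptor 0. */
		dup2(saved_stdin, STDIN_FILENO);
		close(saved_stdin);
	} else if (fd == 0) {
		close(fd);			/* leave descriptor 0 closed as we found it */
	}
	return fd;
}
```

If this returns 0, then a check such as "fence_fd > 0" or
"fence_fd != 0" cannot by itself tell "no fence" apart from a valid
descriptor; only the flag in the buffer can.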

-Brian

>>
>> Gustavo
>
>
>-- 
>Thanks,
>Mauro

* Re: [RFC v5 00/11] V4L2 Explicit Synchronization
  2017-11-20 10:19 ` [RFC v5 00/11] V4L2 Explicit Synchronization Smitha T Murthy
@ 2017-11-30 18:53   ` Gustavo Padovan
  0 siblings, 0 replies; 48+ messages in thread
From: Gustavo Padovan @ 2017-11-30 18:53 UTC (permalink / raw)
  To: Smitha T Murthy
  Cc: linux-media, Hans Verkuil, Mauro Carvalho Chehab, Shuah Khan,
	Pawel Osciak, Alexandre Courbot, Sakari Ailus, Brian Starkey,
	Thierry Escande, linux-kernel, Gustavo Padovan

Hi Smitha,

2017-11-20 Smitha T Murthy <smithatmurthy@gmail.com>:

> Hi Gustavo,
> 
> I am currently using your implementation as a reference for explicit
> synchronisation. For that I need your test app, but I am unable to
> download it from the link provided:
> “https://gitlab.collabora.com/padovan/v4l2-fences-test”
> 
> Could you please help me with this?

Sorry about that, our gitlab instance creates the repos as internal by
default. I just changed it to Public.

Gustavo


end of thread, other threads:[~2017-11-30 18:53 UTC | newest]

Thread overview: 48+ messages
2017-11-15 17:10 [RFC v5 00/11] V4L2 Explicit Synchronization Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 01/11] [media] v4l: add V4L2_CAP_ORDERED to the uapi Gustavo Padovan
2017-11-17 11:57   ` Mauro Carvalho Chehab
2017-11-17 12:23     ` Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 02/11] [media] vivid: add the V4L2_CAP_ORDERED capability Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 03/11] [media] vb2: add 'ordered_in_driver' property to queues Gustavo Padovan
2017-11-17  5:56   ` Alexandre Courbot
2017-11-17 11:23     ` Gustavo Padovan
2017-11-17 12:15   ` Mauro Carvalho Chehab
2017-11-17 12:27     ` Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 04/11] [media] vivid: mark vivid queues as ordered_in_driver Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 05/11] [media] vb2: check earlier if stream can be started Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 06/11] [media] vb2: add explicit fence user API Gustavo Padovan
2017-11-17 12:25   ` Mauro Carvalho Chehab
2017-11-17 13:29   ` Hans Verkuil
2017-11-17 13:53     ` Mauro Carvalho Chehab
2017-11-17 14:31       ` Hans Verkuil
2017-11-15 17:10 ` [RFC v5 07/11] [media] vb2: add in-fence support to QBUF Gustavo Padovan
2017-11-17  6:49   ` Alexandre Courbot
2017-11-17 13:00     ` Mauro Carvalho Chehab
2017-11-17 13:08       ` Gustavo Padovan
2017-11-17 13:19         ` Mauro Carvalho Chehab
2017-11-20 11:41           ` Brian Starkey
2017-11-17 13:01     ` Gustavo Padovan
2017-11-20  2:53       ` Alexandre Courbot
2017-11-17 12:53   ` Mauro Carvalho Chehab
2017-11-17 13:12     ` Gustavo Padovan
2017-11-17 13:47       ` Mauro Carvalho Chehab
2017-11-17 17:20         ` Gustavo Padovan
2017-11-17 14:15   ` Hans Verkuil
2017-11-17 17:40     ` Gustavo Padovan
2017-11-17 17:50       ` Gustavo Padovan
2017-11-18  9:30       ` Hans Verkuil
2017-11-15 17:10 ` [RFC v5 08/11] [media] vb2: add videobuf2 dma-buf fence helpers Gustavo Padovan
2017-11-17  7:02   ` Alexandre Courbot
2017-11-17  7:11     ` Alexandre Courbot
2017-11-17 11:27       ` Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 09/11] [media] vb2: add infrastructure to support out-fences Gustavo Padovan
2017-11-17  7:19   ` Alexandre Courbot
2017-11-17  7:29     ` Alexandre Courbot
2017-11-17 11:30     ` Gustavo Padovan
2017-11-15 17:10 ` [RFC v5 10/11] [media] vb2: add out-fence support to QBUF Gustavo Padovan
2017-11-17  7:38   ` Alexandre Courbot
2017-11-17 11:48     ` Gustavo Padovan
2017-11-17 13:34   ` Hans Verkuil
2017-11-15 17:10 ` [RFC v5 11/11] [media] v4l: Document explicit synchronization behavior Gustavo Padovan
2017-11-20 10:19 ` [RFC v5 00/11] V4L2 Explicit Synchronization Smitha T Murthy
2017-11-30 18:53   ` Gustavo Padovan