linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
@ 2019-09-29 20:00 Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 1/6] vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF Jernej Skrabec
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

This series adds support for decoding multi-slice H264 frames along with
support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.

Code was tested by modified ffmpeg, which can be found here:
https://github.com/jernejsk/FFmpeg, branch mainline-test
It has to be configured with at least following options:
--enable-v4l2-request --enable-libudev --enable-libdrm

Samples used for testing:
http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
http://jernej.libreelec.tv/videos/h264/h264.mp4

Command line used for testing:
ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt bgra -f fbdev /dev/fb0

Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
not sure how. ffmpeg follows exactly which slice is last in frame
and sets hold flag accordingly. Improper usage of hold flag would
corrupt ffmpeg assumptions and it would probably crash. Any ideas
how to test this are welcome!

Thanks to Jonas for adjusting ffmpeg.

Please let me know what you think.

Best regards,
Jernej

Changes from v1:
- added Rb tags
- updated V4L2_DEC_CMD_FLUSH documentation
- updated first slice detection in Cedrus
- hold capture buffer flag is set according to source format
- added v4l m2m stateless_(try_)decoder_cmd ioctl helpers

Hans Verkuil (2):
  vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
  videodev2.h: add V4L2_DEC_CMD_FLUSH

Jernej Skrabec (4):
  media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  media: cedrus: Detect first slice of a frame
  media: cedrus: h264: Support multiple slices per frame
  media: cedrus: Add support for holding capture buffer

 Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
 .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
 .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
 .../media/videodev2.h.rst.exceptions          |  1 +
 .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
 drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
 drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
 .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
 .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
 .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
 include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
 include/media/videobuf2-core.h                |  3 ++
 include/media/videobuf2-v4l2.h                |  5 ++
 include/uapi/linux/videodev2.h                | 14 ++++--
 15 files changed, 175 insertions(+), 11 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/6] vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 2/6] videodev2.h: add V4L2_DEC_CMD_FLUSH Jernej Skrabec
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

This patch adds support for the V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF flag.
It also adds a new V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF capability and
a v4l2_m2m_release_capture_buf() helper function.

Drivers should set vb2_queue->subsystem_flags to
VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF to indicate support for this flag.

The device_run() function should look like this:

if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
	v4l2_m2m_dst_buf_remove(m2m_ctx);
	v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
	cap_vb = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
}
cap_vb->is_held = out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;

...

v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
if (!cap_vb->is_held) {
	v4l2_m2m_dst_buf_remove(m2m_ctx);
	v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
}

In order to handle the corner case where V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
is always set for the output buffer, and you reached the last frame (so no
new output buffer will be queued with a new timestamp), the driver must
implement support for the V4L2_DEC_CMD_FLUSH command, and that should do:

struct vb2_v4l2_buffer *out_vb = v4l2_m2m_last_src_buf(m2m_ctx);
struct vb2_v4l2_buffer *cap_vb = v4l2_m2m_last_dst_buf(m2m_ctx);

if (out_vb)
	out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
else if (cap_vb && cap_vb->is_held)
	v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);

For formats that do not support slicing (e.g. VP8), the FLUSH command
just does nothing.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
[Adjust example code]
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
 .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
 .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
 include/media/v4l2-mem2mem.h                  | 42 +++++++++++++++++++
 include/media/videobuf2-core.h                |  3 ++
 include/media/videobuf2-v4l2.h                |  5 +++
 include/uapi/linux/videodev2.h                | 13 +++---
 7 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst b/Documentation/media/uapi/v4l/buffer.rst
index 1cbd9cde57f3..afb03906ead9 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -607,6 +607,19 @@ Buffer Flags
 	applications shall use this flag for output buffers if the data in
 	this buffer has not been created by the CPU but by some
 	DMA-capable unit, in which case caches have not been used.
+    * .. _`V4L2-BUF-FLAG-M2M-HOLD-CAPTURE-BUF`:
+
+      - ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF``
+      - 0x00000200
+      - Only valid if ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` is
+        set. It is typically used with stateless decoders where multiple
+	output buffers each decode to a slice of the decoded frame.
+	Applications can set this flag when queueing the output buffer
+	to prevent the driver from dequeueing the capture buffer after
+	the output buffer has been decoded (i.e. the capture buffer is
+	'held'). If the timestamp of this output buffer differs from that
+	of the previous output buffer, then that indicates the start of a
+	new frame and the previously held capture buffer is dequeued.
     * .. _`V4L2-BUF-FLAG-LAST`:
 
       - ``V4L2_BUF_FLAG_LAST``
diff --git a/Documentation/media/uapi/v4l/vidioc-reqbufs.rst b/Documentation/media/uapi/v4l/vidioc-reqbufs.rst
index d7faef10e39b..d0c643db477a 100644
--- a/Documentation/media/uapi/v4l/vidioc-reqbufs.rst
+++ b/Documentation/media/uapi/v4l/vidioc-reqbufs.rst
@@ -125,6 +125,7 @@ aborting or finishing any DMA in progress, an implicit
 .. _V4L2-BUF-CAP-SUPPORTS-DMABUF:
 .. _V4L2-BUF-CAP-SUPPORTS-REQUESTS:
 .. _V4L2-BUF-CAP-SUPPORTS-ORPHANED-BUFS:
+.. _V4L2-BUF-CAP-SUPPORTS-M2M-HOLD-CAPTURE-BUF:
 
 .. cssclass:: longtable
 
@@ -150,6 +151,11 @@ aborting or finishing any DMA in progress, an implicit
       - The kernel allows calling :ref:`VIDIOC_REQBUFS` while buffers are still
         mapped or exported via DMABUF. These orphaned buffers will be freed
         when they are unmapped or when the exported DMABUF fds are closed.
+    * - ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF``
+      - 0x00000020
+      - Only valid for stateless decoders. If set, then userspace can set the
+        ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag to hold off on returning the
+	capture buffer until the OUTPUT timestamp changes.
 
 Return Value
 ============
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 5a9ba3846f0a..699787f48f46 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -50,7 +50,8 @@ module_param(debug, int, 0644);
 				 V4L2_BUF_FLAG_TIMESTAMP_MASK)
 /* Output buffer flags that should be passed on to the driver */
 #define V4L2_BUFFER_OUT_FLAGS	(V4L2_BUF_FLAG_PFRAME | V4L2_BUF_FLAG_BFRAME | \
-				 V4L2_BUF_FLAG_KEYFRAME | V4L2_BUF_FLAG_TIMECODE)
+				 V4L2_BUF_FLAG_KEYFRAME | V4L2_BUF_FLAG_TIMECODE | \
+				 V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)
 
 /*
  * __verify_planes_array() - verify that the planes array passed in struct
@@ -194,6 +195,7 @@ static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
 	}
 	vbuf->sequence = 0;
 	vbuf->request_fd = -1;
+	vbuf->is_held = false;
 
 	if (V4L2_TYPE_IS_MULTIPLANAR(b->type)) {
 		switch (b->memory) {
@@ -321,6 +323,8 @@ static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
 		 */
 		vbuf->flags &= ~V4L2_BUF_FLAG_TIMECODE;
 		vbuf->field = b->field;
+		if (!(q->subsystem_flags & VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF))
+			vbuf->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
 	} else {
 		/* Zero any output buffer flags as this is a capture buffer */
 		vbuf->flags &= ~V4L2_BUFFER_OUT_FLAGS;
@@ -654,6 +658,8 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
 		*caps |= V4L2_BUF_CAP_SUPPORTS_USERPTR;
 	if (q->io_modes & VB2_DMABUF)
 		*caps |= V4L2_BUF_CAP_SUPPORTS_DMABUF;
+	if (q->subsystem_flags & VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF)
+		*caps |= V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
 #ifdef CONFIG_MEDIA_CONTROLLER_REQUEST_API
 	if (q->supports_requests)
 		*caps |= V4L2_BUF_CAP_SUPPORTS_REQUESTS;
diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index 0b9c3a287061..c9fa96c8eed1 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -644,6 +644,48 @@ void v4l2_m2m_buf_copy_metadata(const struct vb2_v4l2_buffer *out_vb,
 				struct vb2_v4l2_buffer *cap_vb,
 				bool copy_frame_flags);
 
+/**
+ * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
+ * released
+ *
+ * @out_vb: the output buffer
+ * @cap_vb: the capture buffer
+ *
+ * This helper function returns true if the current capture buffer should
+ * be released to vb2. This is the case if the output buffer specified that
+ * the capture buffer should be held (i.e. not returned to vb2) AND if the
+ * timestamp of the capture buffer differs from the output buffer timestamp.
+ *
+ * This helper is to be called at the start of the device_run callback:
+ *
+ * .. code-block:: c
+ *
+ *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
+ *		v4l2_m2m_dst_buf_remove(m2m_ctx);
+ *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+ *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
+ *	}
+ *	cap_vb->is_held = out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+ *
+ *	...
+ *
+ *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
+ *	if (!cap_vb->is_held) {
+ *		v4l2_m2m_dst_buf_remove(m2m_ctx);
+ *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+ *	}
+ *
+ * This allows for multiple output buffers to be used to fill in a single
+ * capture buffer. This is typically used by stateless decoders where
+ * multiple e.g. H.264 slices contribute to a single decoded frame.
+ */
+static inline bool v4l2_m2m_release_capture_buf(const struct vb2_v4l2_buffer *out_vb,
+						const struct vb2_v4l2_buffer *cap_vb)
+{
+	return cap_vb->is_held && cap_vb->vb2_buf.copied_timestamp &&
+	       out_vb->vb2_buf.timestamp != cap_vb->vb2_buf.timestamp;
+}
+
 /* v4l2 request helper */
 
 void v4l2_m2m_request_queue(struct media_request *req);
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 640aabe69450..a2b2208b02da 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -505,6 +505,8 @@ struct vb2_buf_ops {
  * @buf_ops:	callbacks to deliver buffer information.
  *		between user-space and kernel-space.
  * @drv_priv:	driver private data.
+ * @subsystem_flags: Flags specific to the subsystem (V4L2/DVB/etc.). Not used
+ *		by the vb2 core.
  * @buf_struct_size: size of the driver-specific buffer structure;
  *		"0" indicates the driver doesn't want to use a custom buffer
  *		structure type. for example, ``sizeof(struct vb2_v4l2_buffer)``
@@ -571,6 +573,7 @@ struct vb2_queue {
 	const struct vb2_buf_ops	*buf_ops;
 
 	void				*drv_priv;
+	u32				subsystem_flags;
 	unsigned int			buf_struct_size;
 	u32				timestamp_flags;
 	gfp_t				gfp_flags;
diff --git a/include/media/videobuf2-v4l2.h b/include/media/videobuf2-v4l2.h
index 8a10889dc2fd..13ab101864aa 100644
--- a/include/media/videobuf2-v4l2.h
+++ b/include/media/videobuf2-v4l2.h
@@ -33,6 +33,7 @@
  * @timecode:	frame timecode.
  * @sequence:	sequence count of this frame.
  * @request_fd:	the request_fd associated with this buffer
+ * @is_held:	if true, then this buffer was held
  * @planes:	plane information (userptr/fd, length, bytesused, data_offset).
  *
  * Should contain enough information to be able to cover all the fields
@@ -46,9 +47,13 @@ struct vb2_v4l2_buffer {
 	struct v4l2_timecode	timecode;
 	__u32			sequence;
 	__s32			request_fd;
+	bool			is_held;
 	struct vb2_plane	planes[VB2_MAX_PLANES];
 };
 
+/* VB2 V4L2 flags as set in vb2_queue.subsystem_flags */
+#define VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF (1 << 0)
+
 /*
  * to_vb2_v4l2_buffer() - cast struct vb2_buffer * to struct vb2_v4l2_buffer *
  */
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 530638dffd93..4fa9f543742d 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -915,11 +915,12 @@ struct v4l2_requestbuffers {
 };
 
 /* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
-#define V4L2_BUF_CAP_SUPPORTS_MMAP	(1 << 0)
-#define V4L2_BUF_CAP_SUPPORTS_USERPTR	(1 << 1)
-#define V4L2_BUF_CAP_SUPPORTS_DMABUF	(1 << 2)
-#define V4L2_BUF_CAP_SUPPORTS_REQUESTS	(1 << 3)
-#define V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS (1 << 4)
+#define V4L2_BUF_CAP_SUPPORTS_MMAP			(1 << 0)
+#define V4L2_BUF_CAP_SUPPORTS_USERPTR			(1 << 1)
+#define V4L2_BUF_CAP_SUPPORTS_DMABUF			(1 << 2)
+#define V4L2_BUF_CAP_SUPPORTS_REQUESTS			(1 << 3)
+#define V4L2_BUF_CAP_SUPPORTS_ORPHANED_BUFS		(1 << 4)
+#define V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF	(1 << 5)
 
 /**
  * struct v4l2_plane - plane info for multi-planar buffers
@@ -1041,6 +1042,8 @@ static inline __u64 v4l2_timeval_to_ns(const struct timeval *tv)
 #define V4L2_BUF_FLAG_IN_REQUEST		0x00000080
 /* timecode field is valid */
 #define V4L2_BUF_FLAG_TIMECODE			0x00000100
+/* Don't return the capture buffer until OUTPUT timestamp changes */
+#define V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF	0x00000200
 /* Buffer is prepared for queuing */
 #define V4L2_BUF_FLAG_PREPARED			0x00000400
 /* Cache handling flags */
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/6] videodev2.h: add V4L2_DEC_CMD_FLUSH
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 1/6] vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers Jernej Skrabec
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec,
	Alexandre Courbot

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Add this new V4L2_DEC_CMD_FLUSH decoder command and document it.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[Adjusted description]
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst | 10 +++++++++-
 Documentation/media/videodev2.h.rst.exceptions      |  1 +
 include/uapi/linux/videodev2.h                      |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
index 57f0066f4cff..f1a504836f31 100644
--- a/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
+++ b/Documentation/media/uapi/v4l/vidioc-decoder-cmd.rst
@@ -208,7 +208,15 @@ introduced in Linux 3.3. They are, however, mandatory for stateful mem2mem decod
 	been started yet, the driver will return an ``EPERM`` error code. When
 	the decoder is already running, this command does nothing. No
 	flags are defined for this command.
-
+    * - ``V4L2_DEC_CMD_FLUSH``
+      - 4
+      - Flush any held capture buffers. Only valid for stateless decoders.
+	This command is typically used when the application reached the
+	end of the stream and the last output buffer had the
+	``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag set. This would prevent
+	dequeueing the capture buffer containing the last decoded frame.
+	So this command can be used to explicitly flush that final decoded
+	frame. This command does nothing if there are no held capture buffers.
 
 Return Value
 ============
diff --git a/Documentation/media/videodev2.h.rst.exceptions b/Documentation/media/videodev2.h.rst.exceptions
index adeb6b7a15cb..a79028e4d929 100644
--- a/Documentation/media/videodev2.h.rst.exceptions
+++ b/Documentation/media/videodev2.h.rst.exceptions
@@ -434,6 +434,7 @@ replace define V4L2_DEC_CMD_START decoder-cmds
 replace define V4L2_DEC_CMD_STOP decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE decoder-cmds
 replace define V4L2_DEC_CMD_RESUME decoder-cmds
+replace define V4L2_DEC_CMD_FLUSH decoder-cmds
 
 replace define V4L2_DEC_CMD_START_MUTE_AUDIO decoder-cmds
 replace define V4L2_DEC_CMD_PAUSE_TO_BLACK decoder-cmds
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 4fa9f543742d..91a79e16089c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1978,6 +1978,7 @@ struct v4l2_encoder_cmd {
 #define V4L2_DEC_CMD_STOP        (1)
 #define V4L2_DEC_CMD_PAUSE       (2)
 #define V4L2_DEC_CMD_RESUME      (3)
+#define V4L2_DEC_CMD_FLUSH       (4)
 
 /* Flags for V4L2_DEC_CMD_START */
 #define V4L2_DEC_CMD_START_MUTE_AUDIO	(1 << 0)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 1/6] vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 2/6] videodev2.h: add V4L2_DEC_CMD_FLUSH Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-10-04  9:21   ` Hans Verkuil
  2019-09-29 20:00 ` [PATCH v2 4/6] media: cedrus: Detect first slice of a frame Jernej Skrabec
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

These helpers are used by stateless codecs when they support multiple
slices per frame and hold capture buffer flag is set. It's expected that
all such codecs will use this code.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 drivers/media/v4l2-core/v4l2-mem2mem.c | 35 ++++++++++++++++++++++++++
 include/media/v4l2-mem2mem.h           |  4 +++
 2 files changed, 39 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 19937dd3c6f6..2677a07e4c9b 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -1154,6 +1154,41 @@ int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
 }
 EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_try_decoder_cmd);
 
+int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
+					     struct v4l2_decoder_cmd *dc)
+{
+	if (dc->cmd != V4L2_DEC_CMD_FLUSH)
+		return -EINVAL;
+
+	dc->flags = 0;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_try_decoder_cmd);
+
+int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
+					 struct v4l2_decoder_cmd *dc)
+{
+	struct v4l2_fh *fh = file->private_data;
+	struct vb2_v4l2_buffer *out_vb, *cap_vb;
+	int ret;
+
+	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
+	if (ret < 0)
+		return ret;
+
+	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
+	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
+
+	if (out_vb)
+		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+	else if (cap_vb && cap_vb->is_held)
+		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_decoder_cmd);
+
 /*
  * v4l2_file_operations helpers. It is assumed here same lock is used
  * for the output and the capture buffer queue.
diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index c9fa96c8eed1..8ae2f56c7fa3 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -714,6 +714,10 @@ int v4l2_m2m_ioctl_try_encoder_cmd(struct file *file, void *fh,
 				   struct v4l2_encoder_cmd *ec);
 int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
 				   struct v4l2_decoder_cmd *dc);
+int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
+					     struct v4l2_decoder_cmd *dc);
+int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
+					 struct v4l2_decoder_cmd *dc);
 int v4l2_m2m_fop_mmap(struct file *file, struct vm_area_struct *vma);
 __poll_t v4l2_m2m_fop_poll(struct file *file, poll_table *wait);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 4/6] media: cedrus: Detect first slice of a frame
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
                   ` (2 preceding siblings ...)
  2019-09-29 20:00 ` [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 5/6] media: cedrus: h264: Support multiple slices per frame Jernej Skrabec
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

When codec supports multiple slices in one frame, VPU has to know when
first slice of each frame is being processed, presumably to correctly
clear/set data in auxiliary buffers.

Add first_slice field to cedrus_run structure and set it according to
timestamps of capture and output buffers. If timestamps are different,
it's first slice and viceversa.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 drivers/staging/media/sunxi/cedrus/cedrus.h     | 1 +
 drivers/staging/media/sunxi/cedrus/cedrus_dec.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.h b/drivers/staging/media/sunxi/cedrus/cedrus.h
index 2f017a651848..32cb38e541c6 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.h
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.h
@@ -70,6 +70,7 @@ struct cedrus_mpeg2_run {
 struct cedrus_run {
 	struct vb2_v4l2_buffer	*src;
 	struct vb2_v4l2_buffer	*dst;
+	bool first_slice;
 
 	union {
 		struct cedrus_h264_run	h264;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index 56ca4c9ad01c..e49c3396ca4d 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -31,6 +31,8 @@ void cedrus_device_run(void *priv)
 
 	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
 	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
+		run.src->vb2_buf.timestamp != run.dst->vb2_buf.timestamp;
 
 	/* Apply request(s) controls if needed. */
 	src_req = run.src->vb2_buf.req_obj.req;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 5/6] media: cedrus: h264: Support multiple slices per frame
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
                   ` (3 preceding siblings ...)
  2019-09-29 20:00 ` [PATCH v2 4/6] media: cedrus: Detect first slice of a frame Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-09-29 20:00 ` [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer Jernej Skrabec
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

With recent changes, support for decoding multi-slice frames can be
easily added now.

Signal VPU if current slice is first in frame or not and add information
about first macroblock coordinates.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 drivers/staging/media/sunxi/cedrus/cedrus_h264.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
index d6a782703c9b..b7260cfeb6f4 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
@@ -301,6 +301,8 @@ static void cedrus_set_params(struct cedrus_ctx *ctx,
 	dma_addr_t src_buf_addr;
 	u32 offset = slice->header_bit_size;
 	u32 len = (slice->size * 8) - offset;
+	unsigned int pic_width_in_mbs;
+	bool mbaff_pic;
 	u32 reg;
 
 	cedrus_write(dev, VE_H264_VLD_LEN, len);
@@ -370,12 +372,19 @@ static void cedrus_set_params(struct cedrus_ctx *ctx,
 		reg |= VE_H264_SPS_DIRECT_8X8_INFERENCE;
 	cedrus_write(dev, VE_H264_SPS, reg);
 
+	mbaff_pic = !(slice->flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) &&
+		    (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
+	pic_width_in_mbs = sps->pic_width_in_mbs_minus1 + 1;
+
 	// slice parameters
 	reg = 0;
+	reg |= ((slice->first_mb_in_slice % pic_width_in_mbs) & 0xff) << 24;
+	reg |= (((slice->first_mb_in_slice / pic_width_in_mbs) * (mbaff_pic + 1)) & 0xff) << 16;
 	reg |= decode->nal_ref_idc ? BIT(12) : 0;
 	reg |= (slice->slice_type & 0xf) << 8;
 	reg |= slice->cabac_init_idc & 0x3;
-	reg |= VE_H264_SHS_FIRST_SLICE_IN_PIC;
+	if (run->first_slice)
+		reg |= VE_H264_SHS_FIRST_SLICE_IN_PIC;
 	if (slice->flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)
 		reg |= VE_H264_SHS_FIELD_PIC;
 	if (slice->flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
                   ` (4 preceding siblings ...)
  2019-09-29 20:00 ` [PATCH v2 5/6] media: cedrus: h264: Support multiple slices per frame Jernej Skrabec
@ 2019-09-29 20:00 ` Jernej Skrabec
  2019-09-30  8:14   ` Hans Verkuil
  2019-09-30  8:10 ` [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Hans Verkuil
  2019-10-07 10:44 ` Hans Verkuil
  7 siblings, 1 reply; 22+ messages in thread
From: Jernej Skrabec @ 2019-09-29 20:00 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas, jernej.skrabec

When frame contains multiple slices and driver works in slice mode, it's
more efficient to hold capture buffer in queue until all slices of a
same frame are decoded.

Add support for that to Cedrus driver by exposing and implementing
V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF capability.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
---
 drivers/staging/media/sunxi/cedrus/cedrus_dec.c   |  9 +++++++++
 drivers/staging/media/sunxi/cedrus/cedrus_hw.c    |  8 +++++---
 drivers/staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++++++++++
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index e49c3396ca4d..67f7d4326fc1 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -31,6 +31,14 @@ void cedrus_device_run(void *priv)
 
 	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
 	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+
+	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
+		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
+		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	}
+	run.dst->is_held = run.src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+
 	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
 		run.src->vb2_buf.timestamp != run.dst->vb2_buf.timestamp;
 
@@ -46,6 +54,7 @@ void cedrus_device_run(void *priv)
 			V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS);
 		run.mpeg2.quantization = cedrus_find_control_data(ctx,
 			V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION);
+		run.dst->is_held = false;
 		break;
 
 	case V4L2_PIX_FMT_H264_SLICE:
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index fc8579b90dab..b466041c25db 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -122,7 +122,7 @@ static irqreturn_t cedrus_irq(int irq, void *data)
 	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
 
 	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
 
 	if (!src_buf || !dst_buf) {
 		v4l2_err(&dev->v4l2_dev,
@@ -136,8 +136,10 @@ static irqreturn_t cedrus_irq(int irq, void *data)
 		state = VB2_BUF_STATE_DONE;
 
 	v4l2_m2m_buf_done(src_buf, state);
-	v4l2_m2m_buf_done(dst_buf, state);
-
+	if (!dst_buf->is_held) {
+		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+		v4l2_m2m_buf_done(dst_buf, state);
+	}
 	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
 
 	return IRQ_HANDLED;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
index 3ec3a2db790c..82198b2bb081 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
@@ -303,6 +303,17 @@ static int cedrus_s_fmt_vid_out(struct file *file, void *priv,
 
 	ctx->src_fmt = f->fmt.pix;
 
+	switch (ctx->src_fmt.pixelformat) {
+	case V4L2_PIX_FMT_H264_SLICE:
+		vq->subsystem_flags |=
+			VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
+		break;
+	default:
+		vq->subsystem_flags &=
+			(u32)~VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
+		break;
+	}
+
 	/* Propagate colorspace information to capture. */
 	ctx->dst_fmt.colorspace = f->fmt.pix.colorspace;
 	ctx->dst_fmt.xfer_func = f->fmt.pix.xfer_func;
@@ -336,6 +347,9 @@ const struct v4l2_ioctl_ops cedrus_ioctl_ops = {
 	.vidioc_streamon		= v4l2_m2m_ioctl_streamon,
 	.vidioc_streamoff		= v4l2_m2m_ioctl_streamoff,
 
+	.vidioc_try_decoder_cmd		= v4l2_m2m_ioctl_stateless_try_decoder_cmd,
+	.vidioc_decoder_cmd		= v4l2_m2m_ioctl_stateless_decoder_cmd,
+
 	.vidioc_subscribe_event		= v4l2_ctrl_subscribe_event,
 	.vidioc_unsubscribe_event	= v4l2_event_unsubscribe,
 };
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
                   ` (5 preceding siblings ...)
  2019-09-29 20:00 ` [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer Jernej Skrabec
@ 2019-09-30  8:10 ` Hans Verkuil
  2019-09-30 22:27   ` Jernej Škrabec
  2019-10-07 10:44 ` Hans Verkuil
  7 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-09-30  8:10 UTC (permalink / raw)
  To: Jernej Skrabec, mchehab, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas

On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> This series adds support for decoding multi-slice H264 frames along with
> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> 
> Code was tested by modified ffmpeg, which can be found here:
> https://github.com/jernejsk/FFmpeg, branch mainline-test
> It has to be configured with at least following options:
> --enable-v4l2-request --enable-libudev --enable-libdrm
> 
> Samples used for testing:
> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> http://jernej.libreelec.tv/videos/h264/h264.mp4
> 
> Command line used for testing:
> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt bgra -f fbdev /dev/fb0
> 
> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> not sure how. ffmpeg follows exactly which slice is last in frame
> and sets hold flag accordingly. Improper usage of hold flag would
> corrupt ffmpeg assumptions and it would probably crash. Any ideas
> how to test this are welcome!

It can be tested partially at least: if ffmpeg tells you it is the last
slice, then queue the slice with the HOLD flag set, then call FLUSH afterwards.
This should clear the HOLD flag again. In this case the queued buffer is
probably not yet processed, so the flag is cleared before the decode job starts.

You can also try to add a sleep before calling FLUSH to see what happens
if the last queued buffer is already decoded.

Regards,

	Hans

> 
> Thanks to Jonas for adjusting ffmpeg.
> 
> Please let me know what you think.
> 
> Best regards,
> Jernej
> 
> Changes from v1:
> - added Rb tags
> - updated V4L2_DEC_CMD_FLUSH documentation
> - updated first slice detection in Cedrus
> - hold capture buffer flag is set according to source format
> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> 
> Hans Verkuil (2):
>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
>   videodev2.h: add V4L2_DEC_CMD_FLUSH
> 
> Jernej Skrabec (4):
>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
>   media: cedrus: Detect first slice of a frame
>   media: cedrus: h264: Support multiple slices per frame
>   media: cedrus: Add support for holding capture buffer
> 
>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
>  .../media/videodev2.h.rst.exceptions          |  1 +
>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
>  include/media/videobuf2-core.h                |  3 ++
>  include/media/videobuf2-v4l2.h                |  5 ++
>  include/uapi/linux/videodev2.h                | 14 ++++--
>  15 files changed, 175 insertions(+), 11 deletions(-)
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer
  2019-09-29 20:00 ` [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer Jernej Skrabec
@ 2019-09-30  8:14   ` Hans Verkuil
  2019-09-30 16:44     ` Jernej Škrabec
  0 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-09-30  8:14 UTC (permalink / raw)
  To: Jernej Skrabec, mchehab, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas

On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> When frame contains multiple slices and driver works in slice mode, it's
> more efficient to hold capture buffer in queue until all slices of a
> same frame are decoded.
> 
> Add support for that to Cedrus driver by exposing and implementing
> V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF capability.
> 
> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> ---
>  drivers/staging/media/sunxi/cedrus/cedrus_dec.c   |  9 +++++++++
>  drivers/staging/media/sunxi/cedrus/cedrus_hw.c    |  8 +++++---
>  drivers/staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++++++++++
>  3 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> index e49c3396ca4d..67f7d4326fc1 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> @@ -31,6 +31,14 @@ void cedrus_device_run(void *priv)
>  
>  	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
>  	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> +
> +	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
> +		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> +		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
> +		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> +	}
> +	run.dst->is_held = run.src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> +
>  	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
>  		run.src->vb2_buf.timestamp != run.dst->vb2_buf.timestamp;
>  
> @@ -46,6 +54,7 @@ void cedrus_device_run(void *priv)
>  			V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS);
>  		run.mpeg2.quantization = cedrus_find_control_data(ctx,
>  			V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION);
> +		run.dst->is_held = false;
>  		break;
>  
>  	case V4L2_PIX_FMT_H264_SLICE:
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> index fc8579b90dab..b466041c25db 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> @@ -122,7 +122,7 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>  	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
>  
>  	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> -	dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> +	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>  
>  	if (!src_buf || !dst_buf) {
>  		v4l2_err(&dev->v4l2_dev,
> @@ -136,8 +136,10 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>  		state = VB2_BUF_STATE_DONE;
>  
>  	v4l2_m2m_buf_done(src_buf, state);
> -	v4l2_m2m_buf_done(dst_buf, state);
> -
> +	if (!dst_buf->is_held) {
> +		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> +		v4l2_m2m_buf_done(dst_buf, state);
> +	}
>  	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
>  
>  	return IRQ_HANDLED;
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> index 3ec3a2db790c..82198b2bb081 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> @@ -303,6 +303,17 @@ static int cedrus_s_fmt_vid_out(struct file *file, void *priv,
>  
>  	ctx->src_fmt = f->fmt.pix;
>  
> +	switch (ctx->src_fmt.pixelformat) {
> +	case V4L2_PIX_FMT_H264_SLICE:
> +		vq->subsystem_flags |=
> +			VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
> +		break;
> +	default:
> +		vq->subsystem_flags &=
> +			(u32)~VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;

Why the u32 cast?

Regards,

	Hans

> +		break;
> +	}
> +
>  	/* Propagate colorspace information to capture. */
>  	ctx->dst_fmt.colorspace = f->fmt.pix.colorspace;
>  	ctx->dst_fmt.xfer_func = f->fmt.pix.xfer_func;
> @@ -336,6 +347,9 @@ const struct v4l2_ioctl_ops cedrus_ioctl_ops = {
>  	.vidioc_streamon		= v4l2_m2m_ioctl_streamon,
>  	.vidioc_streamoff		= v4l2_m2m_ioctl_streamoff,
>  
> +	.vidioc_try_decoder_cmd		= v4l2_m2m_ioctl_stateless_try_decoder_cmd,
> +	.vidioc_decoder_cmd		= v4l2_m2m_ioctl_stateless_decoder_cmd,
> +
>  	.vidioc_subscribe_event		= v4l2_ctrl_subscribe_event,
>  	.vidioc_unsubscribe_event	= v4l2_event_unsubscribe,
>  };
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer
  2019-09-30  8:14   ` Hans Verkuil
@ 2019-09-30 16:44     ` Jernej Škrabec
  0 siblings, 0 replies; 22+ messages in thread
From: Jernej Škrabec @ 2019-09-30 16:44 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne ponedeljek, 30. september 2019 ob 10:14:32 CEST je Hans Verkuil 
napisal(a):
> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> > When frame contains multiple slices and driver works in slice mode, it's
> > more efficient to hold capture buffer in queue until all slices of a
> > same frame are decoded.
> > 
> > Add support for that to Cedrus driver by exposing and implementing
> > V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF capability.
> > 
> > Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> > ---
> > 
> >  drivers/staging/media/sunxi/cedrus/cedrus_dec.c   |  9 +++++++++
> >  drivers/staging/media/sunxi/cedrus/cedrus_hw.c    |  8 +++++---
> >  drivers/staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++++++++++
> >  3 files changed, 28 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> > b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c index
> > e49c3396ca4d..67f7d4326fc1 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> > @@ -31,6 +31,14 @@ void cedrus_device_run(void *priv)
> > 
> >  	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> >  	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> > 
> > +
> > +	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
> > +		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> > +		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
> > +		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> > +	}
> > +	run.dst->is_held = run.src->flags & 
V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> > +
> > 
> >  	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
> >  	
> >  		run.src->vb2_buf.timestamp != run.dst-
>vb2_buf.timestamp;
> > 
> > @@ -46,6 +54,7 @@ void cedrus_device_run(void *priv)
> > 
> >  			V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS);
> >  		
> >  		run.mpeg2.quantization = cedrus_find_control_data(ctx,
> >  		
> >  			V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION);
> > 
> > +		run.dst->is_held = false;
> > 
> >  		break;
> >  	
> >  	case V4L2_PIX_FMT_H264_SLICE:
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c index
> > fc8579b90dab..b466041c25db 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > @@ -122,7 +122,7 @@ static irqreturn_t cedrus_irq(int irq, void *data)
> > 
> >  	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
> >  	
> >  	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> > 
> > -	dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> > +	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> > 
> >  	if (!src_buf || !dst_buf) {
> >  	
> >  		v4l2_err(&dev->v4l2_dev,
> > 
> > @@ -136,8 +136,10 @@ static irqreturn_t cedrus_irq(int irq, void *data)
> > 
> >  		state = VB2_BUF_STATE_DONE;
> >  	
> >  	v4l2_m2m_buf_done(src_buf, state);
> > 
> > -	v4l2_m2m_buf_done(dst_buf, state);
> > -
> > +	if (!dst_buf->is_held) {
> > +		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> > +		v4l2_m2m_buf_done(dst_buf, state);
> > +	}
> > 
> >  	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
> >  	
> >  	return IRQ_HANDLED;
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> > b/drivers/staging/media/sunxi/cedrus/cedrus_video.c index
> > 3ec3a2db790c..82198b2bb081 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
> > @@ -303,6 +303,17 @@ static int cedrus_s_fmt_vid_out(struct file *file,
> > void *priv,> 
> >  	ctx->src_fmt = f->fmt.pix;
> > 
> > +	switch (ctx->src_fmt.pixelformat) {
> > +	case V4L2_PIX_FMT_H264_SLICE:
> > +		vq->subsystem_flags |=
> > +			VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
> > +		break;
> > +	default:
> > +		vq->subsystem_flags &=
> > +			
(u32)~VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF;
> 
> Why the u32 cast?

To prevent warnings on arm64 such as reported here:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1620796.html

But I'm not sure if this aplies for this case. I compiled kernel for arm64 but 
there is no warning without this cast with my configuration. I guess I can 
remove it.

Best regards,
Jernej

> 
> Regards,
> 
> 	Hans
> 
> > +		break;
> > +	}
> > +
> > 
> >  	/* Propagate colorspace information to capture. */
> >  	ctx->dst_fmt.colorspace = f->fmt.pix.colorspace;
> >  	ctx->dst_fmt.xfer_func = f->fmt.pix.xfer_func;
> > 
> > @@ -336,6 +347,9 @@ const struct v4l2_ioctl_ops cedrus_ioctl_ops = {
> > 
> >  	.vidioc_streamon		= v4l2_m2m_ioctl_streamon,
> >  	.vidioc_streamoff		= v4l2_m2m_ioctl_streamoff,
> > 
> > +	.vidioc_try_decoder_cmd		= 
v4l2_m2m_ioctl_stateless_try_decoder_cmd,
> > +	.vidioc_decoder_cmd		= 
v4l2_m2m_ioctl_stateless_decoder_cmd,
> > +
> > 
> >  	.vidioc_subscribe_event		= v4l2_ctrl_subscribe_event,
> >  	.vidioc_unsubscribe_event	= v4l2_event_unsubscribe,
> >  
> >  };





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-30  8:10 ` [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Hans Verkuil
@ 2019-09-30 22:27   ` Jernej Škrabec
  2019-09-30 22:43     ` Hans Verkuil
  0 siblings, 1 reply; 22+ messages in thread
From: Jernej Škrabec @ 2019-09-30 22:27 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne ponedeljek, 30. september 2019 ob 10:10:48 CEST je Hans Verkuil 
napisal(a):
> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> > This series adds support for decoding multi-slice H264 frames along with
> > support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> > 
> > Code was tested by modified ffmpeg, which can be found here:
> > https://github.com/jernejsk/FFmpeg, branch mainline-test
> > It has to be configured with at least following options:
> > --enable-v4l2-request --enable-libudev --enable-libdrm
> > 
> > Samples used for testing:
> > http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> > http://jernej.libreelec.tv/videos/h264/h264.mp4
> > 
> > Command line used for testing:
> > ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
> > bgra -f fbdev /dev/fb0
> > 
> > Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> > not sure how. ffmpeg follows exactly which slice is last in frame
> > and sets hold flag accordingly. Improper usage of hold flag would
> > corrupt ffmpeg assumptions and it would probably crash. Any ideas
> > how to test this are welcome!
> 
> It can be tested partially at least: if ffmpeg tells you it is the last
> slice, then queue the slice with the HOLD flag set, then call FLUSH
> afterwards. This should clear the HOLD flag again. In this case the queued
> buffer is probably not yet processed, so the flag is cleared before the
> decode job starts.
> 
> You can also try to add a sleep before calling FLUSH to see what happens
> if the last queued buffer is already decoded.

Ok, I tried following code:
https://github.com/jernejsk/FFmpeg/blob/flush_test/libavcodec/
v4l2_request.c#L220-L233

But results are not good. It seems that out_vb in flush command is always NULL 
and so it always marks capture buffer as done, which leads to kernel warnings.

dmesg output with debugging messages is here: http://ix.io/1Ks8

Currently I'm not sure what is going on. Shouldn't be output buffer queued and 
waiting to MEDIA_REQUEST_IOC_QUEUE which is executed after flush command as it 
can be seen from ffmpeg code linked above?

Best regards,
Jernej

> 
> Regards,
> 
> 	Hans
> 
> > Thanks to Jonas for adjusting ffmpeg.
> > 
> > Please let me know what you think.
> > 
> > Best regards,
> > Jernej
> > 
> > Changes from v1:
> > - added Rb tags
> > - updated V4L2_DEC_CMD_FLUSH documentation
> > - updated first slice detection in Cedrus
> > - hold capture buffer flag is set according to source format
> > - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> > 
> > Hans Verkuil (2):
> >   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> >   videodev2.h: add V4L2_DEC_CMD_FLUSH
> > 
> > Jernej Skrabec (4):
> >   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
> >   media: cedrus: Detect first slice of a frame
> >   media: cedrus: h264: Support multiple slices per frame
> >   media: cedrus: Add support for holding capture buffer
> >  
> >  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
> >  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
> >  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
> >  .../media/videodev2.h.rst.exceptions          |  1 +
> >  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
> >  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
> >  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
> >  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
> >  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
> >  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
> >  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
> >  include/media/videobuf2-core.h                |  3 ++
> >  include/media/videobuf2-v4l2.h                |  5 ++
> >  include/uapi/linux/videodev2.h                | 14 ++++--
> >  15 files changed, 175 insertions(+), 11 deletions(-)





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-30 22:27   ` Jernej Škrabec
@ 2019-09-30 22:43     ` Hans Verkuil
  2019-09-30 22:58       ` Jernej Škrabec
  2019-10-01  5:33       ` Jernej Škrabec
  0 siblings, 2 replies; 22+ messages in thread
From: Hans Verkuil @ 2019-09-30 22:43 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

On 10/1/19 12:27 AM, Jernej Škrabec wrote:
> Dne ponedeljek, 30. september 2019 ob 10:10:48 CEST je Hans Verkuil 
> napisal(a):
>> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
>>> This series adds support for decoding multi-slice H264 frames along with
>>> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
>>>
>>> Code was tested by modified ffmpeg, which can be found here:
>>> https://github.com/jernejsk/FFmpeg, branch mainline-test
>>> It has to be configured with at least following options:
>>> --enable-v4l2-request --enable-libudev --enable-libdrm
>>>
>>> Samples used for testing:
>>> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
>>> http://jernej.libreelec.tv/videos/h264/h264.mp4
>>>
>>> Command line used for testing:
>>> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
>>> bgra -f fbdev /dev/fb0
>>>
>>> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
>>> not sure how. ffmpeg follows exactly which slice is last in frame
>>> and sets hold flag accordingly. Improper usage of hold flag would
>>> corrupt ffmpeg assumptions and it would probably crash. Any ideas
>>> how to test this are welcome!
>>
>> It can be tested partially at least: if ffmpeg tells you it is the last
>> slice, then queue the slice with the HOLD flag set, then call FLUSH
>> afterwards. This should clear the HOLD flag again. In this case the queued
>> buffer is probably not yet processed, so the flag is cleared before the
>> decode job starts.
>>
>> You can also try to add a sleep before calling FLUSH to see what happens
>> if the last queued buffer is already decoded.
> 
> Ok, I tried following code:
> https://github.com/jernejsk/FFmpeg/blob/flush_test/libavcodec/
> v4l2_request.c#L220-L233
> 
> But results are not good. It seems that out_vb in flush command is always NULL 
> and so it always marks capture buffer as done, which leads to kernel warnings.
> 
> dmesg output with debugging messages is here: http://ix.io/1Ks8
> 
> Currently I'm not sure what is going on. Shouldn't be output buffer queued and 
> waiting to MEDIA_REQUEST_IOC_QUEUE which is executed after flush command as it 
> can be seen from ffmpeg code linked above?

Argh, I forgot about the fact that this uses requests.

The FLUSH should happen *after* the MEDIA_REQUEST_IOC_QUEUE ioctl. Otherwise
it has no effect. As long as the request hasn't been queued, the buffer is also
not queued to the driver, so out_vb will indeed be NULL.

Sorry for the confusion.

Regards,

	Hans

> 
> Best regards,
> Jernej
> 
>>
>> Regards,
>>
>> 	Hans
>>
>>> Thanks to Jonas for adjusting ffmpeg.
>>>
>>> Please let me know what you think.
>>>
>>> Best regards,
>>> Jernej
>>>
>>> Changes from v1:
>>> - added Rb tags
>>> - updated V4L2_DEC_CMD_FLUSH documentation
>>> - updated first slice detection in Cedrus
>>> - hold capture buffer flag is set according to source format
>>> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
>>>
>>> Hans Verkuil (2):
>>>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
>>>   videodev2.h: add V4L2_DEC_CMD_FLUSH
>>>
>>> Jernej Skrabec (4):
>>>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
>>>   media: cedrus: Detect first slice of a frame
>>>   media: cedrus: h264: Support multiple slices per frame
>>>   media: cedrus: Add support for holding capture buffer
>>>  
>>>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
>>>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
>>>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
>>>  .../media/videodev2.h.rst.exceptions          |  1 +
>>>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
>>>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
>>>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
>>>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
>>>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
>>>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
>>>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
>>>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
>>>  include/media/videobuf2-core.h                |  3 ++
>>>  include/media/videobuf2-v4l2.h                |  5 ++
>>>  include/uapi/linux/videodev2.h                | 14 ++++--
>>>  15 files changed, 175 insertions(+), 11 deletions(-)
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-30 22:43     ` Hans Verkuil
@ 2019-09-30 22:58       ` Jernej Škrabec
  2019-10-01  5:33       ` Jernej Škrabec
  1 sibling, 0 replies; 22+ messages in thread
From: Jernej Škrabec @ 2019-09-30 22:58 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne torek, 01. oktober 2019 ob 00:43:34 CEST je Hans Verkuil napisal(a):
> On 10/1/19 12:27 AM, Jernej Škrabec wrote:
> > Dne ponedeljek, 30. september 2019 ob 10:10:48 CEST je Hans Verkuil
> > 
> > napisal(a):
> >> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> >>> This series adds support for decoding multi-slice H264 frames along with
> >>> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> >>> 
> >>> Code was tested by modified ffmpeg, which can be found here:
> >>> https://github.com/jernejsk/FFmpeg, branch mainline-test
> >>> It has to be configured with at least following options:
> >>> --enable-v4l2-request --enable-libudev --enable-libdrm
> >>> 
> >>> Samples used for testing:
> >>> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> >>> http://jernej.libreelec.tv/videos/h264/h264.mp4
> >>> 
> >>> Command line used for testing:
> >>> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
> >>> bgra -f fbdev /dev/fb0
> >>> 
> >>> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> >>> not sure how. ffmpeg follows exactly which slice is last in frame
> >>> and sets hold flag accordingly. Improper usage of hold flag would
> >>> corrupt ffmpeg assumptions and it would probably crash. Any ideas
> >>> how to test this are welcome!
> >> 
> >> It can be tested partially at least: if ffmpeg tells you it is the last
> >> slice, then queue the slice with the HOLD flag set, then call FLUSH
> >> afterwards. This should clear the HOLD flag again. In this case the
> >> queued
> >> buffer is probably not yet processed, so the flag is cleared before the
> >> decode job starts.
> >> 
> >> You can also try to add a sleep before calling FLUSH to see what happens
> >> if the last queued buffer is already decoded.
> > 
> > Ok, I tried following code:
> > https://github.com/jernejsk/FFmpeg/blob/flush_test/libavcodec/
> > v4l2_request.c#L220-L233
> > 
> > But results are not good. It seems that out_vb in flush command is always
> > NULL and so it always marks capture buffer as done, which leads to kernel
> > warnings.
> > 
> > dmesg output with debugging messages is here: http://ix.io/1Ks8
> > 
> > Currently I'm not sure what is going on. Shouldn't be output buffer queued
> > and waiting to MEDIA_REQUEST_IOC_QUEUE which is executed after flush
> > command as it can be seen from ffmpeg code linked above?
> 
> Argh, I forgot about the fact that this uses requests.
> 
> The FLUSH should happen *after* the MEDIA_REQUEST_IOC_QUEUE ioctl. Otherwise
> it has no effect. As long as the request hasn't been queued, the buffer is
> also not queued to the driver, so out_vb will indeed be NULL.

Well, flush cmd has effect if it is called before MEDIA_REQUEST_IOC_QUEUE ioctl 
as it can be seen from linked dmesg output. I guess there is nothing that we 
can do to prevent wrong usage?

BTW, if capture buffer is marked as done, shouldn't be also removed from the 
queue?

Best regards,
Jernej

> 
> Sorry for the confusion.
> 
> Regards,
> 
> 	Hans
> 
> > Best regards,
> > Jernej
> > 
> >> Regards,
> >> 
> >> 	Hans
> >> 	
> >>> Thanks to Jonas for adjusting ffmpeg.
> >>> 
> >>> Please let me know what you think.
> >>> 
> >>> Best regards,
> >>> Jernej
> >>> 
> >>> Changes from v1:
> >>> - added Rb tags
> >>> - updated V4L2_DEC_CMD_FLUSH documentation
> >>> - updated first slice detection in Cedrus
> >>> - hold capture buffer flag is set according to source format
> >>> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> >>> 
> >>> Hans Verkuil (2):
> >>>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> >>>   videodev2.h: add V4L2_DEC_CMD_FLUSH
> >>> 
> >>> Jernej Skrabec (4):
> >>>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
> >>>   media: cedrus: Detect first slice of a frame
> >>>   media: cedrus: h264: Support multiple slices per frame
> >>>   media: cedrus: Add support for holding capture buffer
> >>>  
> >>>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
> >>>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
> >>>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
> >>>  .../media/videodev2.h.rst.exceptions          |  1 +
> >>>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
> >>>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
> >>>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
> >>>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
> >>>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
> >>>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
> >>>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
> >>>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
> >>>  include/media/videobuf2-core.h                |  3 ++
> >>>  include/media/videobuf2-v4l2.h                |  5 ++
> >>>  include/uapi/linux/videodev2.h                | 14 ++++--
> >>>  15 files changed, 175 insertions(+), 11 deletions(-)





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-30 22:43     ` Hans Verkuil
  2019-09-30 22:58       ` Jernej Škrabec
@ 2019-10-01  5:33       ` Jernej Škrabec
  1 sibling, 0 replies; 22+ messages in thread
From: Jernej Škrabec @ 2019-10-01  5:33 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne torek, 01. oktober 2019 ob 00:43:34 CEST je Hans Verkuil napisal(a):
> On 10/1/19 12:27 AM, Jernej Škrabec wrote:
> > Dne ponedeljek, 30. september 2019 ob 10:10:48 CEST je Hans Verkuil
> > 
> > napisal(a):
> >> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> >>> This series adds support for decoding multi-slice H264 frames along with
> >>> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> >>> 
> >>> Code was tested by modified ffmpeg, which can be found here:
> >>> https://github.com/jernejsk/FFmpeg, branch mainline-test
> >>> It has to be configured with at least following options:
> >>> --enable-v4l2-request --enable-libudev --enable-libdrm
> >>> 
> >>> Samples used for testing:
> >>> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> >>> http://jernej.libreelec.tv/videos/h264/h264.mp4
> >>> 
> >>> Command line used for testing:
> >>> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
> >>> bgra -f fbdev /dev/fb0
> >>> 
> >>> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> >>> not sure how. ffmpeg follows exactly which slice is last in frame
> >>> and sets hold flag accordingly. Improper usage of hold flag would
> >>> corrupt ffmpeg assumptions and it would probably crash. Any ideas
> >>> how to test this are welcome!
> >> 
> >> It can be tested partially at least: if ffmpeg tells you it is the last
> >> slice, then queue the slice with the HOLD flag set, then call FLUSH
> >> afterwards. This should clear the HOLD flag again. In this case the
> >> queued
> >> buffer is probably not yet processed, so the flag is cleared before the
> >> decode job starts.
> >> 
> >> You can also try to add a sleep before calling FLUSH to see what happens
> >> if the last queued buffer is already decoded.
> > 
> > Ok, I tried following code:
> > https://github.com/jernejsk/FFmpeg/blob/flush_test/libavcodec/
> > v4l2_request.c#L220-L233
> > 
> > But results are not good. It seems that out_vb in flush command is always
> > NULL and so it always marks capture buffer as done, which leads to kernel
> > warnings.
> > 
> > dmesg output with debugging messages is here: http://ix.io/1Ks8
> > 
> > Currently I'm not sure what is going on. Shouldn't be output buffer queued
> > and waiting to MEDIA_REQUEST_IOC_QUEUE which is executed after flush
> > command as it can be seen from ffmpeg code linked above?
> 
> Argh, I forgot about the fact that this uses requests.
> 
> The FLUSH should happen *after* the MEDIA_REQUEST_IOC_QUEUE ioctl. Otherwise
> it has no effect. As long as the request hasn't been queued, the buffer is
> also not queued to the driver, so out_vb will indeed be NULL.

It's better, but still not working. Currently ffmpeg sometimes reports such 
messages: https://pastebin.com/raw/9tVVtc20 This is dmesg output: http://
ix.io/1L1L

It seems to me like a race condition. Sometimes flush works as indendent and 
sometimes it influences next frame.

Best regards,
Jernje

> 
> Sorry for the confusion.
> 
> Regards,
> 
> 	Hans
> 
> > Best regards,
> > Jernej
> > 
> >> Regards,
> >> 
> >> 	Hans
> >> 	
> >>> Thanks to Jonas for adjusting ffmpeg.
> >>> 
> >>> Please let me know what you think.
> >>> 
> >>> Best regards,
> >>> Jernej
> >>> 
> >>> Changes from v1:
> >>> - added Rb tags
> >>> - updated V4L2_DEC_CMD_FLUSH documentation
> >>> - updated first slice detection in Cedrus
> >>> - hold capture buffer flag is set according to source format
> >>> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> >>> 
> >>> Hans Verkuil (2):
> >>>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> >>>   videodev2.h: add V4L2_DEC_CMD_FLUSH
> >>> 
> >>> Jernej Skrabec (4):
> >>>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
> >>>   media: cedrus: Detect first slice of a frame
> >>>   media: cedrus: h264: Support multiple slices per frame
> >>>   media: cedrus: Add support for holding capture buffer
> >>>  
> >>>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
> >>>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
> >>>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
> >>>  .../media/videodev2.h.rst.exceptions          |  1 +
> >>>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
> >>>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
> >>>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
> >>>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
> >>>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
> >>>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
> >>>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
> >>>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
> >>>  include/media/videobuf2-core.h                |  3 ++
> >>>  include/media/videobuf2-v4l2.h                |  5 ++
> >>>  include/uapi/linux/videodev2.h                | 14 ++++--
> >>>  15 files changed, 175 insertions(+), 11 deletions(-)





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  2019-09-29 20:00 ` [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers Jernej Skrabec
@ 2019-10-04  9:21   ` Hans Verkuil
  2019-10-07  6:02     ` Jernej Škrabec
  0 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-10-04  9:21 UTC (permalink / raw)
  To: Jernej Skrabec, mchehab, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas

On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> These helpers are used by stateless codecs when they support multiple
> slices per frame and hold capture buffer flag is set. It's expected that
> all such codecs will use this code.
> 
> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> ---
>  drivers/media/v4l2-core/v4l2-mem2mem.c | 35 ++++++++++++++++++++++++++
>  include/media/v4l2-mem2mem.h           |  4 +++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
> index 19937dd3c6f6..2677a07e4c9b 100644
> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> @@ -1154,6 +1154,41 @@ int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
>  }
>  EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_try_decoder_cmd);
>  
> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
> +					     struct v4l2_decoder_cmd *dc)
> +{
> +	if (dc->cmd != V4L2_DEC_CMD_FLUSH)
> +		return -EINVAL;
> +
> +	dc->flags = 0;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_try_decoder_cmd);
> +
> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
> +					 struct v4l2_decoder_cmd *dc)
> +{
> +	struct v4l2_fh *fh = file->private_data;
> +	struct vb2_v4l2_buffer *out_vb, *cap_vb;
> +	int ret;
> +
> +	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
> +	if (ret < 0)
> +		return ret;
> +
> +	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
> +	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);

I think this should be v4l2_m2m_next_dst_buf. If multiple capture buffers were
queued up, then it can only be the first capture buffer that can be 'HELD'.

This might solve the race condition you saw with ffmpeg.

Regards,

	Hans

> +
> +	if (out_vb)
> +		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> +	else if (cap_vb && cap_vb->is_held)
> +		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_decoder_cmd);
> +
>  /*
>   * v4l2_file_operations helpers. It is assumed here same lock is used
>   * for the output and the capture buffer queue.
> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> index c9fa96c8eed1..8ae2f56c7fa3 100644
> --- a/include/media/v4l2-mem2mem.h
> +++ b/include/media/v4l2-mem2mem.h
> @@ -714,6 +714,10 @@ int v4l2_m2m_ioctl_try_encoder_cmd(struct file *file, void *fh,
>  				   struct v4l2_encoder_cmd *ec);
>  int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
>  				   struct v4l2_decoder_cmd *dc);
> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
> +					     struct v4l2_decoder_cmd *dc);
> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
> +					 struct v4l2_decoder_cmd *dc);
>  int v4l2_m2m_fop_mmap(struct file *file, struct vm_area_struct *vma);
>  __poll_t v4l2_m2m_fop_poll(struct file *file, poll_table *wait);
>  
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  2019-10-04  9:21   ` Hans Verkuil
@ 2019-10-07  6:02     ` Jernej Škrabec
  2019-10-07  8:32       ` Hans Verkuil
  0 siblings, 1 reply; 22+ messages in thread
From: Jernej Škrabec @ 2019-10-07  6:02 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne petek, 04. oktober 2019 ob 11:21:12 CEST je Hans Verkuil napisal(a):
> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> > These helpers are used by stateless codecs when they support multiple
> > slices per frame and hold capture buffer flag is set. It's expected that
> > all such codecs will use this code.
> > 
> > Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> > ---
> > 
> >  drivers/media/v4l2-core/v4l2-mem2mem.c | 35 ++++++++++++++++++++++++++
> >  include/media/v4l2-mem2mem.h           |  4 +++
> >  2 files changed, 39 insertions(+)
> > 
> > diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
> > b/drivers/media/v4l2-core/v4l2-mem2mem.c index 19937dd3c6f6..2677a07e4c9b
> > 100644
> > --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> > +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> > @@ -1154,6 +1154,41 @@ int v4l2_m2m_ioctl_try_decoder_cmd(struct file
> > *file, void *fh,> 
> >  }
> >  EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_try_decoder_cmd);
> > 
> > +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
> > +					     struct 
v4l2_decoder_cmd *dc)
> > +{
> > +	if (dc->cmd != V4L2_DEC_CMD_FLUSH)
> > +		return -EINVAL;
> > +
> > +	dc->flags = 0;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_try_decoder_cmd);
> > +
> > +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
> > +					 struct 
v4l2_decoder_cmd *dc)
> > +{
> > +	struct v4l2_fh *fh = file->private_data;
> > +	struct vb2_v4l2_buffer *out_vb, *cap_vb;
> > +	int ret;
> > +
> > +	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
> > +	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
> 
> I think this should be v4l2_m2m_next_dst_buf. If multiple capture buffers
> were queued up, then it can only be the first capture buffer that can be
> 'HELD'.

I'm pretty sure v4l2_m2m_last_dst_buf() is correct. We want to affect last job 
in the queue, all jobs before are already properly handled by comparing 
timestamps.

> 
> This might solve the race condition you saw with ffmpeg.

This actually doesn't change anything. ffmpeg currently queues only one job and 
then waits until it's executed. In this case it actually doesn't matter if 
"last" or "next" variant is used.

Best regards,
Jernej

> 
> Regards,
> 
> 	Hans
> 
> > +
> > +	if (out_vb)
> > +		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> > +	else if (cap_vb && cap_vb->is_held)
> > +		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_decoder_cmd);
> > +
> > 
> >  /*
> >  
> >   * v4l2_file_operations helpers. It is assumed here same lock is used
> >   * for the output and the capture buffer queue.
> > 
> > diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> > index c9fa96c8eed1..8ae2f56c7fa3 100644
> > --- a/include/media/v4l2-mem2mem.h
> > +++ b/include/media/v4l2-mem2mem.h
> > @@ -714,6 +714,10 @@ int v4l2_m2m_ioctl_try_encoder_cmd(struct file *file,
> > void *fh,> 
> >  				   struct v4l2_encoder_cmd *ec);
> >  
> >  int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
> >  
> >  				   struct v4l2_decoder_cmd *dc);
> > 
> > +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
> > +					     struct 
v4l2_decoder_cmd *dc);
> > +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
> > +					 struct 
v4l2_decoder_cmd *dc);
> > 
> >  int v4l2_m2m_fop_mmap(struct file *file, struct vm_area_struct *vma);
> >  __poll_t v4l2_m2m_fop_poll(struct file *file, poll_table *wait);





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  2019-10-07  6:02     ` Jernej Škrabec
@ 2019-10-07  8:32       ` Hans Verkuil
  2019-10-07  9:44         ` Hans Verkuil
  0 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-10-07  8:32 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

On 10/7/19 8:02 AM, Jernej Škrabec wrote:
> Dne petek, 04. oktober 2019 ob 11:21:12 CEST je Hans Verkuil napisal(a):
>> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
>>> These helpers are used by stateless codecs when they support multiple
>>> slices per frame and hold capture buffer flag is set. It's expected that
>>> all such codecs will use this code.
>>>
>>> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
>>> ---
>>>
>>>  drivers/media/v4l2-core/v4l2-mem2mem.c | 35 ++++++++++++++++++++++++++
>>>  include/media/v4l2-mem2mem.h           |  4 +++
>>>  2 files changed, 39 insertions(+)
>>>
>>> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
>>> b/drivers/media/v4l2-core/v4l2-mem2mem.c index 19937dd3c6f6..2677a07e4c9b
>>> 100644
>>> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
>>> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
>>> @@ -1154,6 +1154,41 @@ int v4l2_m2m_ioctl_try_decoder_cmd(struct file
>>> *file, void *fh,> 
>>>  }
>>>  EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_try_decoder_cmd);
>>>
>>> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
>>> +					     struct 
> v4l2_decoder_cmd *dc)
>>> +{
>>> +	if (dc->cmd != V4L2_DEC_CMD_FLUSH)
>>> +		return -EINVAL;
>>> +
>>> +	dc->flags = 0;
>>> +
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_try_decoder_cmd);
>>> +
>>> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
>>> +					 struct 
> v4l2_decoder_cmd *dc)
>>> +{
>>> +	struct v4l2_fh *fh = file->private_data;
>>> +	struct vb2_v4l2_buffer *out_vb, *cap_vb;
>>> +	int ret;
>>> +
>>> +	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
>>> +	if (ret < 0)
>>> +		return ret;
>>> +
>>> +	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
>>> +	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
>>
>> I think this should be v4l2_m2m_next_dst_buf. If multiple capture buffers
>> were queued up, then it can only be the first capture buffer that can be
>> 'HELD'.
> 
> I'm pretty sure v4l2_m2m_last_dst_buf() is correct. We want to affect last job 
> in the queue, all jobs before are already properly handled by comparing 
> timestamps.

You're absolutely right.

> 
>>
>> This might solve the race condition you saw with ffmpeg.
> 
> This actually doesn't change anything. ffmpeg currently queues only one job and 
> then waits until it's executed. In this case it actually doesn't matter if 
> "last" or "next" variant is used.

Can you debug this a bit further? I don't want to merge this unless we know what's
going wrong with ffmpeg.

I suspect it is a race condition. I'll reply to patch 6/6 with more info.

Regards,

	Hans

> 
> Best regards,
> Jernej
> 
>>
>> Regards,
>>
>> 	Hans
>>
>>> +
>>> +	if (out_vb)
>>> +		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>>> +	else if (cap_vb && cap_vb->is_held)
>>> +		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>>> +
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_decoder_cmd);
>>> +
>>>
>>>  /*
>>>  
>>>   * v4l2_file_operations helpers. It is assumed here same lock is used
>>>   * for the output and the capture buffer queue.
>>>
>>> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
>>> index c9fa96c8eed1..8ae2f56c7fa3 100644
>>> --- a/include/media/v4l2-mem2mem.h
>>> +++ b/include/media/v4l2-mem2mem.h
>>> @@ -714,6 +714,10 @@ int v4l2_m2m_ioctl_try_encoder_cmd(struct file *file,
>>> void *fh,> 
>>>  				   struct v4l2_encoder_cmd *ec);
>>>  
>>>  int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
>>>  
>>>  				   struct v4l2_decoder_cmd *dc);
>>>
>>> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
>>> +					     struct 
> v4l2_decoder_cmd *dc);
>>> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
>>> +					 struct 
> v4l2_decoder_cmd *dc);
>>>
>>>  int v4l2_m2m_fop_mmap(struct file *file, struct vm_area_struct *vma);
>>>  __poll_t v4l2_m2m_fop_poll(struct file *file, poll_table *wait);
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
  2019-10-07  8:32       ` Hans Verkuil
@ 2019-10-07  9:44         ` Hans Verkuil
  0 siblings, 0 replies; 22+ messages in thread
From: Hans Verkuil @ 2019-10-07  9:44 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

On 10/7/19 10:32 AM, Hans Verkuil wrote:
> On 10/7/19 8:02 AM, Jernej Škrabec wrote:
>> Dne petek, 04. oktober 2019 ob 11:21:12 CEST je Hans Verkuil napisal(a):
>>> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
>>>> These helpers are used by stateless codecs when they support multiple
>>>> slices per frame and hold capture buffer flag is set. It's expected that
>>>> all such codecs will use this code.
>>>>
>>>> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
>>>> ---
>>>>
>>>>  drivers/media/v4l2-core/v4l2-mem2mem.c | 35 ++++++++++++++++++++++++++
>>>>  include/media/v4l2-mem2mem.h           |  4 +++
>>>>  2 files changed, 39 insertions(+)
>>>>
>>>> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
>>>> b/drivers/media/v4l2-core/v4l2-mem2mem.c index 19937dd3c6f6..2677a07e4c9b
>>>> 100644
>>>> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
>>>> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
>>>> @@ -1154,6 +1154,41 @@ int v4l2_m2m_ioctl_try_decoder_cmd(struct file
>>>> *file, void *fh,> 
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_try_decoder_cmd);
>>>>
>>>> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
>>>> +					     struct 
>> v4l2_decoder_cmd *dc)
>>>> +{
>>>> +	if (dc->cmd != V4L2_DEC_CMD_FLUSH)
>>>> +		return -EINVAL;
>>>> +
>>>> +	dc->flags = 0;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_try_decoder_cmd);
>>>> +
>>>> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
>>>> +					 struct 
>> v4l2_decoder_cmd *dc)
>>>> +{
>>>> +	struct v4l2_fh *fh = file->private_data;
>>>> +	struct vb2_v4l2_buffer *out_vb, *cap_vb;
>>>> +	int ret;
>>>> +
>>>> +	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
>>>> +	if (ret < 0)
>>>> +		return ret;
>>>> +
>>>> +	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
>>>> +	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
>>>
>>> I think this should be v4l2_m2m_next_dst_buf. If multiple capture buffers
>>> were queued up, then it can only be the first capture buffer that can be
>>> 'HELD'.
>>
>> I'm pretty sure v4l2_m2m_last_dst_buf() is correct. We want to affect last job 
>> in the queue, all jobs before are already properly handled by comparing 
>> timestamps.
> 
> You're absolutely right.
> 
>>
>>>
>>> This might solve the race condition you saw with ffmpeg.
>>
>> This actually doesn't change anything. ffmpeg currently queues only one job and 
>> then waits until it's executed. In this case it actually doesn't matter if 
>> "last" or "next" variant is used.
> 
> Can you debug this a bit further? I don't want to merge this unless we know what's
> going wrong with ffmpeg.
> 
> I suspect it is a race condition. I'll reply to patch 6/6 with more info.

I've decided to make a v3 of this series. There are major locking issues with this
and this will require a bit of rework.

Regards,

	Hans

> 
> Regards,
> 
> 	Hans
> 
>>
>> Best regards,
>> Jernej
>>
>>>
>>> Regards,
>>>
>>> 	Hans
>>>
>>>> +
>>>> +	if (out_vb)
>>>> +		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>>>> +	else if (cap_vb && cap_vb->is_held)
>>>> +		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(v4l2_m2m_ioctl_stateless_decoder_cmd);
>>>> +
>>>>
>>>>  /*
>>>>  
>>>>   * v4l2_file_operations helpers. It is assumed here same lock is used
>>>>   * for the output and the capture buffer queue.
>>>>
>>>> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
>>>> index c9fa96c8eed1..8ae2f56c7fa3 100644
>>>> --- a/include/media/v4l2-mem2mem.h
>>>> +++ b/include/media/v4l2-mem2mem.h
>>>> @@ -714,6 +714,10 @@ int v4l2_m2m_ioctl_try_encoder_cmd(struct file *file,
>>>> void *fh,> 
>>>>  				   struct v4l2_encoder_cmd *ec);
>>>>  
>>>>  int v4l2_m2m_ioctl_try_decoder_cmd(struct file *file, void *fh,
>>>>  
>>>>  				   struct v4l2_decoder_cmd *dc);
>>>>
>>>> +int v4l2_m2m_ioctl_stateless_try_decoder_cmd(struct file *file, void *fh,
>>>> +					     struct 
>> v4l2_decoder_cmd *dc);
>>>> +int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
>>>> +					 struct 
>> v4l2_decoder_cmd *dc);
>>>>
>>>>  int v4l2_m2m_fop_mmap(struct file *file, struct vm_area_struct *vma);
>>>>  __poll_t v4l2_m2m_fop_poll(struct file *file, poll_table *wait);
>>
>>
>>
>>
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
                   ` (6 preceding siblings ...)
  2019-09-30  8:10 ` [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Hans Verkuil
@ 2019-10-07 10:44 ` Hans Verkuil
  2019-10-07 19:01   ` Jernej Škrabec
  7 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-10-07 10:44 UTC (permalink / raw)
  To: Jernej Skrabec, mchehab, paul.kocialkowski, mripard, pawel,
	m.szyprowski, kyungmin.park, tfiga, wens
  Cc: gregkh, boris.brezillon, linux-media, linux-kernel, devel,
	linux-arm-kernel, ezequiel, jonas

Hi Jernej,

On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> This series adds support for decoding multi-slice H264 frames along with
> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> 
> Code was tested by modified ffmpeg, which can be found here:
> https://github.com/jernejsk/FFmpeg, branch mainline-test
> It has to be configured with at least following options:
> --enable-v4l2-request --enable-libudev --enable-libdrm
> 
> Samples used for testing:
> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> http://jernej.libreelec.tv/videos/h264/h264.mp4
> 
> Command line used for testing:
> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt bgra -f fbdev /dev/fb0
> 
> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> not sure how. ffmpeg follows exactly which slice is last in frame
> and sets hold flag accordingly. Improper usage of hold flag would
> corrupt ffmpeg assumptions and it would probably crash. Any ideas
> how to test this are welcome!
> 
> Thanks to Jonas for adjusting ffmpeg.
> 
> Please let me know what you think.
> 
> Best regards,
> Jernej
> 
> Changes from v1:
> - added Rb tags
> - updated V4L2_DEC_CMD_FLUSH documentation
> - updated first slice detection in Cedrus
> - hold capture buffer flag is set according to source format
> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> 
> Hans Verkuil (2):
>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
>   videodev2.h: add V4L2_DEC_CMD_FLUSH
> 
> Jernej Skrabec (4):
>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
>   media: cedrus: Detect first slice of a frame
>   media: cedrus: h264: Support multiple slices per frame
>   media: cedrus: Add support for holding capture buffer
> 
>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
>  .../media/videodev2.h.rst.exceptions          |  1 +
>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
>  include/media/videobuf2-core.h                |  3 ++
>  include/media/videobuf2-v4l2.h                |  5 ++
>  include/uapi/linux/videodev2.h                | 14 ++++--
>  15 files changed, 175 insertions(+), 11 deletions(-)
> 

I didn't want to make a v3 of this series, instead I hacked this patch that will
hopefully do all the locking right.

Basically I moved all the 'held' related code into v4l2-mem2mem under job_spinlock.
This simplifies the driver code as well.

But this is a hack that sits on top of this series. If your ffmpeg tests are now
successful, then I'll turn this into a proper series with correct documentation
(a lot of the comments are now wrong with this patch, so just ignore that).

Regards,

	Hans

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 2677a07e4c9b..f81a8f2465ab 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -412,25 +412,24 @@ static void v4l2_m2m_cancel_job(struct v4l2_m2m_ctx *m2m_ctx)
 	}
 }

-void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
-			 struct v4l2_m2m_ctx *m2m_ctx)
+static bool _v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
+			  struct v4l2_m2m_ctx *m2m_ctx)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
 	if (!m2m_dev->curr_ctx || m2m_dev->curr_ctx != m2m_ctx) {
-		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
 		dprintk("Called by an instance not currently running\n");
-		return;
+		return false;
 	}

 	list_del(&m2m_dev->curr_ctx->queue);
 	m2m_dev->curr_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
 	wake_up(&m2m_dev->curr_ctx->finished);
 	m2m_dev->curr_ctx = NULL;
+	return true;
+}

-	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
-
+static void v4l2_m2m_job_next(struct v4l2_m2m_dev *m2m_dev,
+		       struct v4l2_m2m_ctx *m2m_ctx)
+{
 	/* This instance might have more buffers ready, but since we do not
 	 * allow more than one job on the job_queue per instance, each has
 	 * to be scheduled separately after the previous one finishes. */
@@ -441,8 +440,113 @@ void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
 	 */
 	schedule_work(&m2m_dev->job_work);
 }
+
+void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
+			 struct v4l2_m2m_ctx *m2m_ctx)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
+	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
+		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+
+	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
+}
 EXPORT_SYMBOL(v4l2_m2m_job_finish);

+void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
+			 struct v4l2_m2m_ctx *m2m_ctx,
+			 enum vb2_buffer_state state)
+{
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	unsigned long flags;
+
+	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
+	src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);
+	dst_buf = v4l2_m2m_next_dst_buf(m2m_ctx);
+
+	if (!src_buf || !dst_buf) {
+		pr_err("Missing source and/or destination buffers\n");
+		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+		return;
+	}
+	v4l2_m2m_buf_done(src_buf, state);
+	if (!dst_buf->is_held) {
+		v4l2_m2m_dst_buf_remove(m2m_ctx);
+		v4l2_m2m_buf_done(dst_buf, state);
+	}
+	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
+		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+
+	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
+}
+EXPORT_SYMBOL(v4l2_m2m_job_finish_held);
+
+/**
+ * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
+ * released
+ *
+ * @out_vb: the output buffer
+ * @cap_vb: the capture buffer
+ *
+ * This helper function returns true if the current capture buffer should
+ * be released to vb2. This is the case if the output buffer specified that
+ * the capture buffer should be held (i.e. not returned to vb2) AND if the
+ * timestamp of the capture buffer differs from the output buffer timestamp.
+ *
+ * This helper is to be called at the start of the device_run callback:
+ *
+ * .. code-block:: c
+ *
+ *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
+ *		v4l2_m2m_dst_buf_remove(m2m_ctx);
+ *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+ *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
+ *	}
+ *	cap_vb->is_held = out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+ *
+ *	...
+ *
+ *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
+ *	if (!cap_vb->is_held) {
+ *		v4l2_m2m_dst_buf_remove(m2m_ctx);
+ *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+ *	}
+ *
+ * This allows for multiple output buffers to be used to fill in a single
+ * capture buffer. This is typically used by stateless decoders where
+ * multiple e.g. H.264 slices contribute to a single decoded frame.
+ */
+struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx *m2m_ctx)
+{
+	struct v4l2_m2m_dev *m2m_dev = m2m_ctx->m2m_dev;
+	struct vb2_v4l2_buffer *src, *dst;
+	unsigned long flags;
+
+	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
+	src = v4l2_m2m_next_src_buf(m2m_ctx);
+	dst = v4l2_m2m_next_dst_buf(m2m_ctx);
+
+	if (dst->is_held && dst->vb2_buf.copied_timestamp &&
+	    src->vb2_buf.timestamp != dst->vb2_buf.timestamp) {
+		dst->is_held = false;
+		v4l2_m2m_dst_buf_remove(m2m_ctx);
+		v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
+		dst = v4l2_m2m_next_dst_buf(m2m_ctx);
+	}
+	dst->is_held = src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+	src->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+	return dst;
+}
+EXPORT_SYMBOL(v4l2_m2m_release_capture_buf);
+
 int v4l2_m2m_reqbufs(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
 		     struct v4l2_requestbuffers *reqbufs)
 {
@@ -1171,19 +1275,28 @@ int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file *file, void *priv,
 {
 	struct v4l2_fh *fh = file->private_data;
 	struct vb2_v4l2_buffer *out_vb, *cap_vb;
+	struct v4l2_m2m_dev *m2m_dev = fh->m2m_ctx->m2m_dev;
+	unsigned long flags;
 	int ret;

 	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
 	if (ret < 0)
 		return ret;

+	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
 	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
 	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);

-	if (out_vb)
+	if (out_vb && (out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)) {
 		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
-	else if (cap_vb && cap_vb->is_held)
-		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+	} else if (cap_vb && cap_vb->is_held) {
+		cap_vb->is_held = false;
+		if (m2m_dev->curr_ctx) {
+			v4l2_m2m_dst_buf_remove(fh->m2m_ctx);
+			v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
+		}
+	}
+	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);

 	return 0;
 }
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index 67f7d4326fc1..4e30f263b427 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -30,14 +30,7 @@ void cedrus_device_run(void *priv)
 	struct media_request *src_req;

 	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
-
-	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
-		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
-		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
-		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
-	}
-	run.dst->is_held = run.src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
+	run.dst = v4l2_m2m_release_capture_buf(ctx->fh.m2m_ctx);

 	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
 		run.src->vb2_buf.timestamp != run.dst->vb2_buf.timestamp;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index 99fedec80224..242cad82cc8c 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -103,7 +103,6 @@ static irqreturn_t cedrus_irq(int irq, void *data)
 {
 	struct cedrus_dev *dev = data;
 	struct cedrus_ctx *ctx;
-	struct vb2_v4l2_buffer *src_buf, *dst_buf;
 	enum vb2_buffer_state state;
 	enum cedrus_irq_status status;

@@ -121,26 +120,12 @@ static irqreturn_t cedrus_irq(int irq, void *data)
 	dev->dec_ops[ctx->current_codec]->irq_disable(ctx);
 	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);

-	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
-
-	if (!src_buf || !dst_buf) {
-		v4l2_err(&dev->v4l2_dev,
-			 "Missing source and/or destination buffers\n");
-		return IRQ_HANDLED;
-	}
-
 	if (status == CEDRUS_IRQ_ERROR)
 		state = VB2_BUF_STATE_ERROR;
 	else
 		state = VB2_BUF_STATE_DONE;

-	v4l2_m2m_buf_done(src_buf, state);
-	if (!dst_buf->is_held) {
-		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
-		v4l2_m2m_buf_done(dst_buf, state);
-	}
-	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
+	v4l2_m2m_job_finish_held(ctx->dev->m2m_dev, ctx->fh.m2m_ctx, state);

 	return IRQ_HANDLED;
 }
diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index 8ae2f56c7fa3..48ca7d3eaa3d 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -173,6 +173,10 @@ void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx *m2m_ctx);
 void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
 			 struct v4l2_m2m_ctx *m2m_ctx);

+void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
+			 struct v4l2_m2m_ctx *m2m_ctx,
+			 enum vb2_buffer_state state);
+
 static inline void
 v4l2_m2m_buf_done(struct vb2_v4l2_buffer *buf, enum vb2_buffer_state state)
 {
@@ -644,47 +648,7 @@ void v4l2_m2m_buf_copy_metadata(const struct vb2_v4l2_buffer *out_vb,
 				struct vb2_v4l2_buffer *cap_vb,
 				bool copy_frame_flags);

-/**
- * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
- * released
- *
- * @out_vb: the output buffer
- * @cap_vb: the capture buffer
- *
- * This helper function returns true if the current capture buffer should
- * be released to vb2. This is the case if the output buffer specified that
- * the capture buffer should be held (i.e. not returned to vb2) AND if the
- * timestamp of the capture buffer differs from the output buffer timestamp.
- *
- * This helper is to be called at the start of the device_run callback:
- *
- * .. code-block:: c
- *
- *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
- *		v4l2_m2m_dst_buf_remove(m2m_ctx);
- *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
- *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
- *	}
- *	cap_vb->is_held = out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
- *
- *	...
- *
- *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
- *	if (!cap_vb->is_held) {
- *		v4l2_m2m_dst_buf_remove(m2m_ctx);
- *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
- *	}
- *
- * This allows for multiple output buffers to be used to fill in a single
- * capture buffer. This is typically used by stateless decoders where
- * multiple e.g. H.264 slices contribute to a single decoded frame.
- */
-static inline bool v4l2_m2m_release_capture_buf(const struct vb2_v4l2_buffer *out_vb,
-						const struct vb2_v4l2_buffer *cap_vb)
-{
-	return cap_vb->is_held && cap_vb->vb2_buf.copied_timestamp &&
-	       out_vb->vb2_buf.timestamp != cap_vb->vb2_buf.timestamp;
-}
+struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx *m2m_ctx);

 /* v4l2 request helper */


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-10-07 10:44 ` Hans Verkuil
@ 2019-10-07 19:01   ` Jernej Škrabec
  2019-10-09 10:18     ` Hans Verkuil
  0 siblings, 1 reply; 22+ messages in thread
From: Jernej Škrabec @ 2019-10-07 19:01 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne ponedeljek, 07. oktober 2019 ob 12:44:24 CEST je Hans Verkuil napisal(a):
> Hi Jernej,
> 
> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> > This series adds support for decoding multi-slice H264 frames along with
> > support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> > 
> > Code was tested by modified ffmpeg, which can be found here:
> > https://github.com/jernejsk/FFmpeg, branch mainline-test
> > It has to be configured with at least following options:
> > --enable-v4l2-request --enable-libudev --enable-libdrm
> > 
> > Samples used for testing:
> > http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> > http://jernej.libreelec.tv/videos/h264/h264.mp4
> > 
> > Command line used for testing:
> > ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
> > bgra -f fbdev /dev/fb0
> > 
> > Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> > not sure how. ffmpeg follows exactly which slice is last in frame
> > and sets hold flag accordingly. Improper usage of hold flag would
> > corrupt ffmpeg assumptions and it would probably crash. Any ideas
> > how to test this are welcome!
> > 
> > Thanks to Jonas for adjusting ffmpeg.
> > 
> > Please let me know what you think.
> > 
> > Best regards,
> > Jernej
> > 
> > Changes from v1:
> > - added Rb tags
> > - updated V4L2_DEC_CMD_FLUSH documentation
> > - updated first slice detection in Cedrus
> > - hold capture buffer flag is set according to source format
> > - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> > 
> > Hans Verkuil (2):
> >   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> >   videodev2.h: add V4L2_DEC_CMD_FLUSH
> > 
> > Jernej Skrabec (4):
> >   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
> >   media: cedrus: Detect first slice of a frame
> >   media: cedrus: h264: Support multiple slices per frame
> >   media: cedrus: Add support for holding capture buffer
> >  
> >  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
> >  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
> >  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
> >  .../media/videodev2.h.rst.exceptions          |  1 +
> >  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
> >  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
> >  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
> >  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
> >  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
> >  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
> >  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
> >  include/media/videobuf2-core.h                |  3 ++
> >  include/media/videobuf2-v4l2.h                |  5 ++
> >  include/uapi/linux/videodev2.h                | 14 ++++--
> >  15 files changed, 175 insertions(+), 11 deletions(-)
> 
> I didn't want to make a v3 of this series, instead I hacked this patch that
> will hopefully do all the locking right.
> 
> Basically I moved all the 'held' related code into v4l2-mem2mem under
> job_spinlock. This simplifies the driver code as well.
> 
> But this is a hack that sits on top of this series. If your ffmpeg tests are
> now successful, then I'll turn this into a proper series with correct
> documentation (a lot of the comments are now wrong with this patch, so just
> ignore that).

Thanks for looking into this! With small fix mentioned below, it works! Note 
that both scenarios I tested (flushing during decoding and flushing after 
decoding is finished) are focused on capture queue. In order to trigger output 
queue flush, ffmpeg would need to queue multiple jobs and call flush before they 
are all processed. This is not something I can do at this time. Maybe Jonas 
can help with modifying ffmpeg appropriately. However, code for case seems 
correct to me.

> 
> Regards,
> 
> 	Hans
> 
> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
> b/drivers/media/v4l2-core/v4l2-mem2mem.c index 2677a07e4c9b..f81a8f2465ab
> 100644
> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> @@ -412,25 +412,24 @@ static void v4l2_m2m_cancel_job(struct v4l2_m2m_ctx
> *m2m_ctx) }
>  }
> 
> -void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> -			 struct v4l2_m2m_ctx *m2m_ctx)
> +static bool _v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> +			  struct v4l2_m2m_ctx *m2m_ctx)
>  {
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>  	if (!m2m_dev->curr_ctx || m2m_dev->curr_ctx != m2m_ctx) {
> -		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>  		dprintk("Called by an instance not currently 
running\n");
> -		return;
> +		return false;
>  	}
> 
>  	list_del(&m2m_dev->curr_ctx->queue);
>  	m2m_dev->curr_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
>  	wake_up(&m2m_dev->curr_ctx->finished);
>  	m2m_dev->curr_ctx = NULL;
> +	return true;
> +}
> 
> -	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> -
> +static void v4l2_m2m_job_next(struct v4l2_m2m_dev *m2m_dev,
> +		       struct v4l2_m2m_ctx *m2m_ctx)
> +{
>  	/* This instance might have more buffers ready, but since we do not
>  	 * allow more than one job on the job_queue per instance, each has
>  	 * to be scheduled separately after the previous one finishes. */
> @@ -441,8 +440,113 @@ void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> */
>  	schedule_work(&m2m_dev->job_work);
>  }
> +
> +void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> +			 struct v4l2_m2m_ctx *m2m_ctx)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +		return;
> +	}
> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +
> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
> +}
>  EXPORT_SYMBOL(v4l2_m2m_job_finish);
> 
> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
> +			 struct v4l2_m2m_ctx *m2m_ctx,
> +			 enum vb2_buffer_state state)
> +{
> +	struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> +	src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);
> +	dst_buf = v4l2_m2m_next_dst_buf(m2m_ctx);
> +
> +	if (!src_buf || !dst_buf) {
> +		pr_err("Missing source and/or destination buffers\n");
> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +		return;
> +	}
> +	v4l2_m2m_buf_done(src_buf, state);
> +	if (!dst_buf->is_held) {
> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
> +		v4l2_m2m_buf_done(dst_buf, state);
> +	}
> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +		return;
> +	}
> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +
> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
> +}
> +EXPORT_SYMBOL(v4l2_m2m_job_finish_held);
> +
> +/**
> + * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
> + * released
> + *
> + * @out_vb: the output buffer
> + * @cap_vb: the capture buffer
> + *
> + * This helper function returns true if the current capture buffer should
> + * be released to vb2. This is the case if the output buffer specified that
> + * the capture buffer should be held (i.e. not returned to vb2) AND if the
> + * timestamp of the capture buffer differs from the output buffer
> timestamp. + *
> + * This helper is to be called at the start of the device_run callback:
> + *
> + * .. code-block:: c
> + *
> + *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> + *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
> + *	}
> + *	cap_vb->is_held = out_vb->flags & 
V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> + *
> + *	...
> + *
> + *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
> + *	if (!cap_vb->is_held) {
> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> + *	}
> + *
> + * This allows for multiple output buffers to be used to fill in a single
> + * capture buffer. This is typically used by stateless decoders where
> + * multiple e.g. H.264 slices contribute to a single decoded frame.
> + */
> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
> *m2m_ctx) +{
> +	struct v4l2_m2m_dev *m2m_dev = m2m_ctx->m2m_dev;
> +	struct vb2_v4l2_buffer *src, *dst;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> +	src = v4l2_m2m_next_src_buf(m2m_ctx);
> +	dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> +
> +	if (dst->is_held && dst->vb2_buf.copied_timestamp &&
> +	    src->vb2_buf.timestamp != dst->vb2_buf.timestamp) {
> +		dst->is_held = false;
> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
> +		v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
> +		dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> +	}
> +	dst->is_held = src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> +	src->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> +	return dst;
> +}
> +EXPORT_SYMBOL(v4l2_m2m_release_capture_buf);
> +
>  int v4l2_m2m_reqbufs(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
>  		     struct v4l2_requestbuffers *reqbufs)
>  {
> @@ -1171,19 +1275,28 @@ int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file
> *file, void *priv, {
>  	struct v4l2_fh *fh = file->private_data;
>  	struct vb2_v4l2_buffer *out_vb, *cap_vb;
> +	struct v4l2_m2m_dev *m2m_dev = fh->m2m_ctx->m2m_dev;
> +	unsigned long flags;
>  	int ret;
> 
>  	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
>  	if (ret < 0)
>  		return ret;
> 
> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>  	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
>  	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
> 
> -	if (out_vb)
> +	if (out_vb && (out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)) 
{
>  		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> -	else if (cap_vb && cap_vb->is_held)
> -		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> +	} else if (cap_vb && cap_vb->is_held) {
> +		cap_vb->is_held = false;
> +		if (m2m_dev->curr_ctx) {

Above condition should be negated.

Best regards,
Jernej

> +			v4l2_m2m_dst_buf_remove(fh->m2m_ctx);
> +			v4l2_m2m_buf_done(cap_vb, 
VB2_BUF_STATE_DONE);
> +		}
> +	}
> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> 
>  	return 0;
>  }
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c index
> 67f7d4326fc1..4e30f263b427 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> @@ -30,14 +30,7 @@ void cedrus_device_run(void *priv)
>  	struct media_request *src_req;
> 
>  	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> -	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> -
> -	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> -		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
> -		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> -	}
> -	run.dst->is_held = run.src->flags & 
V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> +	run.dst = v4l2_m2m_release_capture_buf(ctx->fh.m2m_ctx);
> 
>  	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
>  		run.src->vb2_buf.timestamp != run.dst-
>vb2_buf.timestamp;
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c index
> 99fedec80224..242cad82cc8c 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> @@ -103,7 +103,6 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>  {
>  	struct cedrus_dev *dev = data;
>  	struct cedrus_ctx *ctx;
> -	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  	enum vb2_buffer_state state;
>  	enum cedrus_irq_status status;
> 
> @@ -121,26 +120,12 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>  	dev->dec_ops[ctx->current_codec]->irq_disable(ctx);
>  	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
> 
> -	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> -	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> -
> -	if (!src_buf || !dst_buf) {
> -		v4l2_err(&dev->v4l2_dev,
> -			 "Missing source and/or destination 
buffers\n");
> -		return IRQ_HANDLED;
> -	}
> -
>  	if (status == CEDRUS_IRQ_ERROR)
>  		state = VB2_BUF_STATE_ERROR;
>  	else
>  		state = VB2_BUF_STATE_DONE;
> 
> -	v4l2_m2m_buf_done(src_buf, state);
> -	if (!dst_buf->is_held) {
> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> -		v4l2_m2m_buf_done(dst_buf, state);
> -	}
> -	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
> +	v4l2_m2m_job_finish_held(ctx->dev->m2m_dev, ctx->fh.m2m_ctx, state);
> 
>  	return IRQ_HANDLED;
>  }
> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> index 8ae2f56c7fa3..48ca7d3eaa3d 100644
> --- a/include/media/v4l2-mem2mem.h
> +++ b/include/media/v4l2-mem2mem.h
> @@ -173,6 +173,10 @@ void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx
> *m2m_ctx); void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>  			 struct v4l2_m2m_ctx *m2m_ctx);
> 
> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
> +			 struct v4l2_m2m_ctx *m2m_ctx,
> +			 enum vb2_buffer_state state);
> +
>  static inline void
>  v4l2_m2m_buf_done(struct vb2_v4l2_buffer *buf, enum vb2_buffer_state state)
> {
> @@ -644,47 +648,7 @@ void v4l2_m2m_buf_copy_metadata(const struct
> vb2_v4l2_buffer *out_vb, struct vb2_v4l2_buffer *cap_vb,
>  				bool copy_frame_flags);
> 
> -/**
> - * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
> - * released
> - *
> - * @out_vb: the output buffer
> - * @cap_vb: the capture buffer
> - *
> - * This helper function returns true if the current capture buffer should
> - * be released to vb2. This is the case if the output buffer specified that
> - * the capture buffer should be held (i.e. not returned to vb2) AND if the
> - * timestamp of the capture buffer differs from the output buffer
> timestamp. - *
> - * This helper is to be called at the start of the device_run callback:
> - *
> - * .. code-block:: c
> - *
> - *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> - *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
> - *	}
> - *	cap_vb->is_held = out_vb->flags & 
V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> - *
> - *	...
> - *
> - *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
> - *	if (!cap_vb->is_held) {
> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> - *	}
> - *
> - * This allows for multiple output buffers to be used to fill in a single
> - * capture buffer. This is typically used by stateless decoders where
> - * multiple e.g. H.264 slices contribute to a single decoded frame.
> - */
> -static inline bool v4l2_m2m_release_capture_buf(const struct
> vb2_v4l2_buffer *out_vb, -					
	const struct vb2_v4l2_buffer *cap_vb)
> -{
> -	return cap_vb->is_held && cap_vb->vb2_buf.copied_timestamp &&
> -	       out_vb->vb2_buf.timestamp != cap_vb->vb2_buf.timestamp;
> -}
> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
> *m2m_ctx);
> 
>  /* v4l2 request helper */





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-10-07 19:01   ` Jernej Škrabec
@ 2019-10-09 10:18     ` Hans Verkuil
  2019-10-09 14:44       ` Jernej Škrabec
  0 siblings, 1 reply; 22+ messages in thread
From: Hans Verkuil @ 2019-10-09 10:18 UTC (permalink / raw)
  To: Jernej Škrabec
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

On 10/7/19 9:01 PM, Jernej Škrabec wrote:
> Dne ponedeljek, 07. oktober 2019 ob 12:44:24 CEST je Hans Verkuil napisal(a):
>> Hi Jernej,
>>
>> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
>>> This series adds support for decoding multi-slice H264 frames along with
>>> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
>>>
>>> Code was tested by modified ffmpeg, which can be found here:
>>> https://github.com/jernejsk/FFmpeg, branch mainline-test
>>> It has to be configured with at least following options:
>>> --enable-v4l2-request --enable-libudev --enable-libdrm
>>>
>>> Samples used for testing:
>>> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
>>> http://jernej.libreelec.tv/videos/h264/h264.mp4
>>>
>>> Command line used for testing:
>>> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
>>> bgra -f fbdev /dev/fb0
>>>
>>> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
>>> not sure how. ffmpeg follows exactly which slice is last in frame
>>> and sets hold flag accordingly. Improper usage of hold flag would
>>> corrupt ffmpeg assumptions and it would probably crash. Any ideas
>>> how to test this are welcome!
>>>
>>> Thanks to Jonas for adjusting ffmpeg.
>>>
>>> Please let me know what you think.
>>>
>>> Best regards,
>>> Jernej
>>>
>>> Changes from v1:
>>> - added Rb tags
>>> - updated V4L2_DEC_CMD_FLUSH documentation
>>> - updated first slice detection in Cedrus
>>> - hold capture buffer flag is set according to source format
>>> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
>>>
>>> Hans Verkuil (2):
>>>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
>>>   videodev2.h: add V4L2_DEC_CMD_FLUSH
>>>
>>> Jernej Skrabec (4):
>>>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
>>>   media: cedrus: Detect first slice of a frame
>>>   media: cedrus: h264: Support multiple slices per frame
>>>   media: cedrus: Add support for holding capture buffer
>>>  
>>>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
>>>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
>>>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
>>>  .../media/videodev2.h.rst.exceptions          |  1 +
>>>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
>>>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
>>>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
>>>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
>>>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
>>>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
>>>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
>>>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
>>>  include/media/videobuf2-core.h                |  3 ++
>>>  include/media/videobuf2-v4l2.h                |  5 ++
>>>  include/uapi/linux/videodev2.h                | 14 ++++--
>>>  15 files changed, 175 insertions(+), 11 deletions(-)
>>
>> I didn't want to make a v3 of this series, instead I hacked this patch that
>> will hopefully do all the locking right.
>>
>> Basically I moved all the 'held' related code into v4l2-mem2mem under
>> job_spinlock. This simplifies the driver code as well.
>>
>> But this is a hack that sits on top of this series. If your ffmpeg tests are
>> now successful, then I'll turn this into a proper series with correct
>> documentation (a lot of the comments are now wrong with this patch, so just
>> ignore that).
> 
> Thanks for looking into this! With small fix mentioned below, it works! Note 
> that both scenarios I tested (flushing during decoding and flushing after 
> decoding is finished) are focused on capture queue. In order to trigger output 
> queue flush, ffmpeg would need to queue multiple jobs and call flush before they 
> are all processed. This is not something I can do at this time. Maybe Jonas 
> can help with modifying ffmpeg appropriately. However, code for case seems 
> correct to me.
> 
>>
>> Regards,
>>
>> 	Hans
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
>> b/drivers/media/v4l2-core/v4l2-mem2mem.c index 2677a07e4c9b..f81a8f2465ab
>> 100644
>> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
>> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
>> @@ -412,25 +412,24 @@ static void v4l2_m2m_cancel_job(struct v4l2_m2m_ctx
>> *m2m_ctx) }
>>  }
>>
>> -void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>> -			 struct v4l2_m2m_ctx *m2m_ctx)
>> +static bool _v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>> +			  struct v4l2_m2m_ctx *m2m_ctx)
>>  {
>> -	unsigned long flags;
>> -
>> -	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>>  	if (!m2m_dev->curr_ctx || m2m_dev->curr_ctx != m2m_ctx) {
>> -		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>>  		dprintk("Called by an instance not currently 
> running\n");
>> -		return;
>> +		return false;
>>  	}
>>
>>  	list_del(&m2m_dev->curr_ctx->queue);
>>  	m2m_dev->curr_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
>>  	wake_up(&m2m_dev->curr_ctx->finished);
>>  	m2m_dev->curr_ctx = NULL;
>> +	return true;
>> +}
>>
>> -	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> -
>> +static void v4l2_m2m_job_next(struct v4l2_m2m_dev *m2m_dev,
>> +		       struct v4l2_m2m_ctx *m2m_ctx)
>> +{
>>  	/* This instance might have more buffers ready, but since we do not
>>  	 * allow more than one job on the job_queue per instance, each has
>>  	 * to be scheduled separately after the previous one finishes. */
>> @@ -441,8 +440,113 @@ void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>> */
>>  	schedule_work(&m2m_dev->job_work);
>>  }
>> +
>> +void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>> +			 struct v4l2_m2m_ctx *m2m_ctx)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
>> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +		return;
>> +	}
>> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +
>> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
>> +}
>>  EXPORT_SYMBOL(v4l2_m2m_job_finish);
>>
>> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
>> +			 struct v4l2_m2m_ctx *m2m_ctx,
>> +			 enum vb2_buffer_state state)
>> +{
>> +	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>> +	src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);
>> +	dst_buf = v4l2_m2m_next_dst_buf(m2m_ctx);
>> +
>> +	if (!src_buf || !dst_buf) {
>> +		pr_err("Missing source and/or destination buffers\n");
>> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +		return;
>> +	}
>> +	v4l2_m2m_buf_done(src_buf, state);
>> +	if (!dst_buf->is_held) {
>> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> +		v4l2_m2m_buf_done(dst_buf, state);
>> +	}
>> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
>> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +		return;
>> +	}
>> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +
>> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
>> +}
>> +EXPORT_SYMBOL(v4l2_m2m_job_finish_held);
>> +
>> +/**
>> + * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
>> + * released
>> + *
>> + * @out_vb: the output buffer
>> + * @cap_vb: the capture buffer
>> + *
>> + * This helper function returns true if the current capture buffer should
>> + * be released to vb2. This is the case if the output buffer specified that
>> + * the capture buffer should be held (i.e. not returned to vb2) AND if the
>> + * timestamp of the capture buffer differs from the output buffer
>> timestamp. + *
>> + * This helper is to be called at the start of the device_run callback:
>> + *
>> + * .. code-block:: c
>> + *
>> + *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
>> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>> + *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
>> + *	}
>> + *	cap_vb->is_held = out_vb->flags & 
> V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> + *
>> + *	...
>> + *
>> + *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
>> + *	if (!cap_vb->is_held) {
>> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>> + *	}
>> + *
>> + * This allows for multiple output buffers to be used to fill in a single
>> + * capture buffer. This is typically used by stateless decoders where
>> + * multiple e.g. H.264 slices contribute to a single decoded frame.
>> + */
>> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
>> *m2m_ctx) +{
>> +	struct v4l2_m2m_dev *m2m_dev = m2m_ctx->m2m_dev;
>> +	struct vb2_v4l2_buffer *src, *dst;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>> +	src = v4l2_m2m_next_src_buf(m2m_ctx);
>> +	dst = v4l2_m2m_next_dst_buf(m2m_ctx);
>> +
>> +	if (dst->is_held && dst->vb2_buf.copied_timestamp &&
>> +	    src->vb2_buf.timestamp != dst->vb2_buf.timestamp) {
>> +		dst->is_held = false;
>> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> +		v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
>> +		dst = v4l2_m2m_next_dst_buf(m2m_ctx);
>> +	}
>> +	dst->is_held = src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> +	src->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>> +	return dst;
>> +}
>> +EXPORT_SYMBOL(v4l2_m2m_release_capture_buf);
>> +
>>  int v4l2_m2m_reqbufs(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
>>  		     struct v4l2_requestbuffers *reqbufs)
>>  {
>> @@ -1171,19 +1275,28 @@ int v4l2_m2m_ioctl_stateless_decoder_cmd(struct file
>> *file, void *priv, {
>>  	struct v4l2_fh *fh = file->private_data;
>>  	struct vb2_v4l2_buffer *out_vb, *cap_vb;
>> +	struct v4l2_m2m_dev *m2m_dev = fh->m2m_ctx->m2m_dev;
>> +	unsigned long flags;
>>  	int ret;
>>
>>  	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
>>  	if (ret < 0)
>>  		return ret;
>>
>> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
>>  	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
>>  	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
>>
>> -	if (out_vb)
>> +	if (out_vb && (out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF)) 
> {
>>  		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> -	else if (cap_vb && cap_vb->is_held)
>> -		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>> +	} else if (cap_vb && cap_vb->is_held) {
>> +		cap_vb->is_held = false;
>> +		if (m2m_dev->curr_ctx) {
> 
> Above condition should be negated.

Close. It should check that this buffer isn't currently being processed.
So:

		if (m2m_dev->curr_ctx != fh->m2m_ctx) {

Can you test with this change? If this works, then I'll post a proper
series for this.

Thanks!

	Hans

> 
> Best regards,
> Jernej
> 
>> +			v4l2_m2m_dst_buf_remove(fh->m2m_ctx);
>> +			v4l2_m2m_buf_done(cap_vb, 
> VB2_BUF_STATE_DONE);
>> +		}
>> +	}
>> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
>>
>>  	return 0;
>>  }
>> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
>> b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c index
>> 67f7d4326fc1..4e30f263b427 100644
>> --- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
>> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
>> @@ -30,14 +30,7 @@ void cedrus_device_run(void *priv)
>>  	struct media_request *src_req;
>>
>>  	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
>> -	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>> -
>> -	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
>> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
>> -		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
>> -		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>> -	}
>> -	run.dst->is_held = run.src->flags & 
> V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> +	run.dst = v4l2_m2m_release_capture_buf(ctx->fh.m2m_ctx);
>>
>>  	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
>>  		run.src->vb2_buf.timestamp != run.dst-
>> vb2_buf.timestamp;
>> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
>> b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c index
>> 99fedec80224..242cad82cc8c 100644
>> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
>> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
>> @@ -103,7 +103,6 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>>  {
>>  	struct cedrus_dev *dev = data;
>>  	struct cedrus_ctx *ctx;
>> -	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>>  	enum vb2_buffer_state state;
>>  	enum cedrus_irq_status status;
>>
>> @@ -121,26 +120,12 @@ static irqreturn_t cedrus_irq(int irq, void *data)
>>  	dev->dec_ops[ctx->current_codec]->irq_disable(ctx);
>>  	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
>>
>> -	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
>> -	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>> -
>> -	if (!src_buf || !dst_buf) {
>> -		v4l2_err(&dev->v4l2_dev,
>> -			 "Missing source and/or destination 
> buffers\n");
>> -		return IRQ_HANDLED;
>> -	}
>> -
>>  	if (status == CEDRUS_IRQ_ERROR)
>>  		state = VB2_BUF_STATE_ERROR;
>>  	else
>>  		state = VB2_BUF_STATE_DONE;
>>
>> -	v4l2_m2m_buf_done(src_buf, state);
>> -	if (!dst_buf->is_held) {
>> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
>> -		v4l2_m2m_buf_done(dst_buf, state);
>> -	}
>> -	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
>> +	v4l2_m2m_job_finish_held(ctx->dev->m2m_dev, ctx->fh.m2m_ctx, state);
>>
>>  	return IRQ_HANDLED;
>>  }
>> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
>> index 8ae2f56c7fa3..48ca7d3eaa3d 100644
>> --- a/include/media/v4l2-mem2mem.h
>> +++ b/include/media/v4l2-mem2mem.h
>> @@ -173,6 +173,10 @@ void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx
>> *m2m_ctx); void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
>>  			 struct v4l2_m2m_ctx *m2m_ctx);
>>
>> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
>> +			 struct v4l2_m2m_ctx *m2m_ctx,
>> +			 enum vb2_buffer_state state);
>> +
>>  static inline void
>>  v4l2_m2m_buf_done(struct vb2_v4l2_buffer *buf, enum vb2_buffer_state state)
>> {
>> @@ -644,47 +648,7 @@ void v4l2_m2m_buf_copy_metadata(const struct
>> vb2_v4l2_buffer *out_vb, struct vb2_v4l2_buffer *cap_vb,
>>  				bool copy_frame_flags);
>>
>> -/**
>> - * v4l2_m2m_release_capture_buf() - check if the capture buffer should be
>> - * released
>> - *
>> - * @out_vb: the output buffer
>> - * @cap_vb: the capture buffer
>> - *
>> - * This helper function returns true if the current capture buffer should
>> - * be released to vb2. This is the case if the output buffer specified that
>> - * the capture buffer should be held (i.e. not returned to vb2) AND if the
>> - * timestamp of the capture buffer differs from the output buffer
>> timestamp. - *
>> - * This helper is to be called at the start of the device_run callback:
>> - *
>> - * .. code-block:: c
>> - *
>> - *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
>> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>> - *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
>> - *	}
>> - *	cap_vb->is_held = out_vb->flags & 
> V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
>> - *
>> - *	...
>> - *
>> - *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
>> - *	if (!cap_vb->is_held) {
>> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
>> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
>> - *	}
>> - *
>> - * This allows for multiple output buffers to be used to fill in a single
>> - * capture buffer. This is typically used by stateless decoders where
>> - * multiple e.g. H.264 slices contribute to a single decoded frame.
>> - */
>> -static inline bool v4l2_m2m_release_capture_buf(const struct
>> vb2_v4l2_buffer *out_vb, -					
> 	const struct vb2_v4l2_buffer *cap_vb)
>> -{
>> -	return cap_vb->is_held && cap_vb->vb2_buf.copied_timestamp &&
>> -	       out_vb->vb2_buf.timestamp != cap_vb->vb2_buf.timestamp;
>> -}
>> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
>> *m2m_ctx);
>>
>>  /* v4l2 request helper */
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames
  2019-10-09 10:18     ` Hans Verkuil
@ 2019-10-09 14:44       ` Jernej Škrabec
  0 siblings, 0 replies; 22+ messages in thread
From: Jernej Škrabec @ 2019-10-09 14:44 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: mchehab, paul.kocialkowski, mripard, pawel, m.szyprowski,
	kyungmin.park, tfiga, wens, gregkh, boris.brezillon, linux-media,
	linux-kernel, devel, linux-arm-kernel, ezequiel, jonas

Dne sreda, 09. oktober 2019 ob 12:18:45 CEST je Hans Verkuil napisal(a):
> On 10/7/19 9:01 PM, Jernej Škrabec wrote:
> > Dne ponedeljek, 07. oktober 2019 ob 12:44:24 CEST je Hans Verkuil 
napisal(a):
> >> Hi Jernej,
> >> 
> >> On 9/29/19 10:00 PM, Jernej Skrabec wrote:
> >>> This series adds support for decoding multi-slice H264 frames along with
> >>> support for V4L2_DEC_CMD_FLUSH and V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF.
> >>> 
> >>> Code was tested by modified ffmpeg, which can be found here:
> >>> https://github.com/jernejsk/FFmpeg, branch mainline-test
> >>> It has to be configured with at least following options:
> >>> --enable-v4l2-request --enable-libudev --enable-libdrm
> >>> 
> >>> Samples used for testing:
> >>> http://jernej.libreelec.tv/videos/h264/BA1_FT_C.mp4
> >>> http://jernej.libreelec.tv/videos/h264/h264.mp4
> >>> 
> >>> Command line used for testing:
> >>> ffmpeg -hwaccel drm -hwaccel_device /dev/dri/card0 -i h264.mp4 -pix_fmt
> >>> bgra -f fbdev /dev/fb0
> >>> 
> >>> Please note that V4L2_DEC_CMD_FLUSH was not tested because I'm
> >>> not sure how. ffmpeg follows exactly which slice is last in frame
> >>> and sets hold flag accordingly. Improper usage of hold flag would
> >>> corrupt ffmpeg assumptions and it would probably crash. Any ideas
> >>> how to test this are welcome!
> >>> 
> >>> Thanks to Jonas for adjusting ffmpeg.
> >>> 
> >>> Please let me know what you think.
> >>> 
> >>> Best regards,
> >>> Jernej
> >>> 
> >>> Changes from v1:
> >>> - added Rb tags
> >>> - updated V4L2_DEC_CMD_FLUSH documentation
> >>> - updated first slice detection in Cedrus
> >>> - hold capture buffer flag is set according to source format
> >>> - added v4l m2m stateless_(try_)decoder_cmd ioctl helpers
> >>> 
> >>> Hans Verkuil (2):
> >>>   vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF
> >>>   videodev2.h: add V4L2_DEC_CMD_FLUSH
> >>> 
> >>> Jernej Skrabec (4):
> >>>   media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers
> >>>   media: cedrus: Detect first slice of a frame
> >>>   media: cedrus: h264: Support multiple slices per frame
> >>>   media: cedrus: Add support for holding capture buffer
> >>>  
> >>>  Documentation/media/uapi/v4l/buffer.rst       | 13 ++++++
> >>>  .../media/uapi/v4l/vidioc-decoder-cmd.rst     | 10 +++-
> >>>  .../media/uapi/v4l/vidioc-reqbufs.rst         |  6 +++
> >>>  .../media/videodev2.h.rst.exceptions          |  1 +
> >>>  .../media/common/videobuf2/videobuf2-v4l2.c   |  8 +++-
> >>>  drivers/media/v4l2-core/v4l2-mem2mem.c        | 35 ++++++++++++++
> >>>  drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
> >>>  .../staging/media/sunxi/cedrus/cedrus_dec.c   | 11 +++++
> >>>  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 11 ++++-
> >>>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  8 ++--
> >>>  .../staging/media/sunxi/cedrus/cedrus_video.c | 14 ++++++
> >>>  include/media/v4l2-mem2mem.h                  | 46 +++++++++++++++++++
> >>>  include/media/videobuf2-core.h                |  3 ++
> >>>  include/media/videobuf2-v4l2.h                |  5 ++
> >>>  include/uapi/linux/videodev2.h                | 14 ++++--
> >>>  15 files changed, 175 insertions(+), 11 deletions(-)
> >> 
> >> I didn't want to make a v3 of this series, instead I hacked this patch
> >> that
> >> will hopefully do all the locking right.
> >> 
> >> Basically I moved all the 'held' related code into v4l2-mem2mem under
> >> job_spinlock. This simplifies the driver code as well.
> >> 
> >> But this is a hack that sits on top of this series. If your ffmpeg tests
> >> are now successful, then I'll turn this into a proper series with
> >> correct documentation (a lot of the comments are now wrong with this
> >> patch, so just ignore that).
> > 
> > Thanks for looking into this! With small fix mentioned below, it works!
> > Note that both scenarios I tested (flushing during decoding and flushing
> > after decoding is finished) are focused on capture queue. In order to
> > trigger output queue flush, ffmpeg would need to queue multiple jobs and
> > call flush before they are all processed. This is not something I can do
> > at this time. Maybe Jonas can help with modifying ffmpeg appropriately.
> > However, code for case seems correct to me.
> > 
> >> Regards,
> >> 
> >> 	Hans
> >> 
> >> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
> >> b/drivers/media/v4l2-core/v4l2-mem2mem.c index 2677a07e4c9b..f81a8f2465ab
> >> 100644
> >> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> >> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> >> @@ -412,25 +412,24 @@ static void v4l2_m2m_cancel_job(struct v4l2_m2m_ctx
> >> *m2m_ctx) }
> >> 
> >>  }
> >> 
> >> -void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> >> -			 struct v4l2_m2m_ctx *m2m_ctx)
> >> +static bool _v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> >> +			  struct v4l2_m2m_ctx *m2m_ctx)
> >> 
> >>  {
> >> 
> >> -	unsigned long flags;
> >> -
> >> -	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> >> 
> >>  	if (!m2m_dev->curr_ctx || m2m_dev->curr_ctx != m2m_ctx) {
> >> 
> >> -		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> 
> >>  		dprintk("Called by an instance not currently
> > 
> > running\n");
> > 
> >> -		return;
> >> +		return false;
> >> 
> >>  	}
> >>  	
> >>  	list_del(&m2m_dev->curr_ctx->queue);
> >>  	m2m_dev->curr_ctx->job_flags &= ~(TRANS_QUEUED | TRANS_RUNNING);
> >>  	wake_up(&m2m_dev->curr_ctx->finished);
> >>  	m2m_dev->curr_ctx = NULL;
> >> 
> >> +	return true;
> >> +}
> >> 
> >> -	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> -
> >> +static void v4l2_m2m_job_next(struct v4l2_m2m_dev *m2m_dev,
> >> +		       struct v4l2_m2m_ctx *m2m_ctx)
> >> +{
> >> 
> >>  	/* This instance might have more buffers ready, but since we do not
> >>  	
> >>  	 * allow more than one job on the job_queue per instance, each has
> >>  	 * to be scheduled separately after the previous one finishes. */
> >> 
> >> @@ -441,8 +440,113 @@ void v4l2_m2m_job_finish(struct v4l2_m2m_dev
> >> *m2m_dev, */
> >> 
> >>  	schedule_work(&m2m_dev->job_work);
> >>  
> >>  }
> >> 
> >> +
> >> +void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> >> +			 struct v4l2_m2m_ctx *m2m_ctx)
> >> +{
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> >> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
> >> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +		return;
> >> +	}
> >> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +
> >> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
> >> +}
> >> 
> >>  EXPORT_SYMBOL(v4l2_m2m_job_finish);
> >> 
> >> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
> >> +			 struct v4l2_m2m_ctx *m2m_ctx,
> >> +			 enum vb2_buffer_state state)
> >> +{
> >> +	struct vb2_v4l2_buffer *src_buf, *dst_buf;
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> >> +	src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);
> >> +	dst_buf = v4l2_m2m_next_dst_buf(m2m_ctx);
> >> +
> >> +	if (!src_buf || !dst_buf) {
> >> +		pr_err("Missing source and/or destination buffers\n");
> >> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +		return;
> >> +	}
> >> +	v4l2_m2m_buf_done(src_buf, state);
> >> +	if (!dst_buf->is_held) {
> >> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> +		v4l2_m2m_buf_done(dst_buf, state);
> >> +	}
> >> +	if (!_v4l2_m2m_job_finish(m2m_dev, m2m_ctx)) {
> >> +		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +		return;
> >> +	}
> >> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +
> >> +	v4l2_m2m_job_next(m2m_dev, m2m_ctx);
> >> +}
> >> +EXPORT_SYMBOL(v4l2_m2m_job_finish_held);
> >> +
> >> +/**
> >> + * v4l2_m2m_release_capture_buf() - check if the capture buffer should
> >> be
> >> + * released
> >> + *
> >> + * @out_vb: the output buffer
> >> + * @cap_vb: the capture buffer
> >> + *
> >> + * This helper function returns true if the current capture buffer
> >> should
> >> + * be released to vb2. This is the case if the output buffer specified
> >> that + * the capture buffer should be held (i.e. not returned to vb2)
> >> AND if the + * timestamp of the capture buffer differs from the output
> >> buffer timestamp. + *
> >> + * This helper is to be called at the start of the device_run callback:
> >> + *
> >> + * .. code-block:: c
> >> + *
> >> + *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
> >> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> >> + *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
> >> + *	}
> >> + *	cap_vb->is_held = out_vb->flags &
> > 
> > V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> > 
> >> + *
> >> + *	...
> >> + *
> >> + *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
> >> + *	if (!cap_vb->is_held) {
> >> + *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> + *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> >> + *	}
> >> + *
> >> + * This allows for multiple output buffers to be used to fill in a
> >> single
> >> + * capture buffer. This is typically used by stateless decoders where
> >> + * multiple e.g. H.264 slices contribute to a single decoded frame.
> >> + */
> >> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
> >> *m2m_ctx) +{
> >> +	struct v4l2_m2m_dev *m2m_dev = m2m_ctx->m2m_dev;
> >> +	struct vb2_v4l2_buffer *src, *dst;
> >> +	unsigned long flags;
> >> +
> >> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> >> +	src = v4l2_m2m_next_src_buf(m2m_ctx);
> >> +	dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> >> +
> >> +	if (dst->is_held && dst->vb2_buf.copied_timestamp &&
> >> +	    src->vb2_buf.timestamp != dst->vb2_buf.timestamp) {
> >> +		dst->is_held = false;
> >> +		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> +		v4l2_m2m_buf_done(dst, VB2_BUF_STATE_DONE);
> >> +		dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> >> +	}
> >> +	dst->is_held = src->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> >> +	src->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> >> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> +	return dst;
> >> +}
> >> +EXPORT_SYMBOL(v4l2_m2m_release_capture_buf);
> >> +
> >> 
> >>  int v4l2_m2m_reqbufs(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
> >>  
> >>  		     struct v4l2_requestbuffers *reqbufs)
> >>  
> >>  {
> >> 
> >> @@ -1171,19 +1275,28 @@ int v4l2_m2m_ioctl_stateless_decoder_cmd(struct
> >> file *file, void *priv, {
> >> 
> >>  	struct v4l2_fh *fh = file->private_data;
> >>  	struct vb2_v4l2_buffer *out_vb, *cap_vb;
> >> 
> >> +	struct v4l2_m2m_dev *m2m_dev = fh->m2m_ctx->m2m_dev;
> >> +	unsigned long flags;
> >> 
> >>  	int ret;
> >>  	
> >>  	ret = v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv, dc);
> >>  	if (ret < 0)
> >>  	
> >>  		return ret;
> >> 
> >> +	spin_lock_irqsave(&m2m_dev->job_spinlock, flags);
> >> 
> >>  	out_vb = v4l2_m2m_last_src_buf(fh->m2m_ctx);
> >>  	cap_vb = v4l2_m2m_last_dst_buf(fh->m2m_ctx);
> >> 
> >> -	if (out_vb)
> >> +	if (out_vb && (out_vb->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF))
> > 
> > {
> > 
> >>  		out_vb->flags &= ~V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> >> 
> >> -	else if (cap_vb && cap_vb->is_held)
> >> -		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> >> +	} else if (cap_vb && cap_vb->is_held) {
> >> +		cap_vb->is_held = false;
> >> +		if (m2m_dev->curr_ctx) {
> > 
> > Above condition should be negated.
> 
> Close. It should check that this buffer isn't currently being processed.
> So:
> 
> 		if (m2m_dev->curr_ctx != fh->m2m_ctx) {
> 
> Can you test with this change? If this works, then I'll post a proper
> series for this.

I confirm that it works.

Best regards,
Jernej

> 
> Thanks!
> 
> 	Hans
> 
> > Best regards,
> > Jernej
> > 
> >> +			v4l2_m2m_dst_buf_remove(fh->m2m_ctx);
> >> +			v4l2_m2m_buf_done(cap_vb,
> > 
> > VB2_BUF_STATE_DONE);
> > 
> >> +		}
> >> +	}
> >> +	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> >> 
> >>  	return 0;
> >>  
> >>  }
> >> 
> >> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> >> b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c index
> >> 67f7d4326fc1..4e30f263b427 100644
> >> --- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> >> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
> >> @@ -30,14 +30,7 @@ void cedrus_device_run(void *priv)
> >> 
> >>  	struct media_request *src_req;
> >>  	
> >>  	run.src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
> >> 
> >> -	run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> >> -
> >> -	if (v4l2_m2m_release_capture_buf(run.src, run.dst)) {
> >> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> >> -		v4l2_m2m_buf_done(run.dst, VB2_BUF_STATE_DONE);
> >> -		run.dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> >> -	}
> >> -	run.dst->is_held = run.src->flags &
> > 
> > V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> > 
> >> +	run.dst = v4l2_m2m_release_capture_buf(ctx->fh.m2m_ctx);
> >> 
> >>  	run.first_slice = !run.dst->vb2_buf.copied_timestamp ||
> >>  	
> >>  		run.src->vb2_buf.timestamp != run.dst-
> >> 
> >> vb2_buf.timestamp;
> >> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> >> b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c index
> >> 99fedec80224..242cad82cc8c 100644
> >> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> >> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> >> @@ -103,7 +103,6 @@ static irqreturn_t cedrus_irq(int irq, void *data)
> >> 
> >>  {
> >>  
> >>  	struct cedrus_dev *dev = data;
> >>  	struct cedrus_ctx *ctx;
> >> 
> >> -	struct vb2_v4l2_buffer *src_buf, *dst_buf;
> >> 
> >>  	enum vb2_buffer_state state;
> >>  	enum cedrus_irq_status status;
> >> 
> >> @@ -121,26 +120,12 @@ static irqreturn_t cedrus_irq(int irq, void *data)
> >> 
> >>  	dev->dec_ops[ctx->current_codec]->irq_disable(ctx);
> >>  	dev->dec_ops[ctx->current_codec]->irq_clear(ctx);
> >> 
> >> -	src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
> >> -	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
> >> -
> >> -	if (!src_buf || !dst_buf) {
> >> -		v4l2_err(&dev->v4l2_dev,
> >> -			 "Missing source and/or destination
> > 
> > buffers\n");
> > 
> >> -		return IRQ_HANDLED;
> >> -	}
> >> -
> >> 
> >>  	if (status == CEDRUS_IRQ_ERROR)
> >>  	
> >>  		state = VB2_BUF_STATE_ERROR;
> >>  	
> >>  	else
> >>  	
> >>  		state = VB2_BUF_STATE_DONE;
> >> 
> >> -	v4l2_m2m_buf_done(src_buf, state);
> >> -	if (!dst_buf->is_held) {
> >> -		v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
> >> -		v4l2_m2m_buf_done(dst_buf, state);
> >> -	}
> >> -	v4l2_m2m_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx);
> >> +	v4l2_m2m_job_finish_held(ctx->dev->m2m_dev, ctx->fh.m2m_ctx, 
state);
> >> 
> >>  	return IRQ_HANDLED;
> >>  
> >>  }
> >> 
> >> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> >> index 8ae2f56c7fa3..48ca7d3eaa3d 100644
> >> --- a/include/media/v4l2-mem2mem.h
> >> +++ b/include/media/v4l2-mem2mem.h
> >> @@ -173,6 +173,10 @@ void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx
> >> *m2m_ctx); void v4l2_m2m_job_finish(struct v4l2_m2m_dev *m2m_dev,
> >> 
> >>  			 struct v4l2_m2m_ctx *m2m_ctx);
> >> 
> >> +void v4l2_m2m_job_finish_held(struct v4l2_m2m_dev *m2m_dev,
> >> +			 struct v4l2_m2m_ctx *m2m_ctx,
> >> +			 enum vb2_buffer_state state);
> >> +
> >> 
> >>  static inline void
> >>  v4l2_m2m_buf_done(struct vb2_v4l2_buffer *buf, enum vb2_buffer_state
> >>  state)
> >> 
> >> {
> >> @@ -644,47 +648,7 @@ void v4l2_m2m_buf_copy_metadata(const struct
> >> vb2_v4l2_buffer *out_vb, struct vb2_v4l2_buffer *cap_vb,
> >> 
> >>  				bool copy_frame_flags);
> >> 
> >> -/**
> >> - * v4l2_m2m_release_capture_buf() - check if the capture buffer should
> >> be
> >> - * released
> >> - *
> >> - * @out_vb: the output buffer
> >> - * @cap_vb: the capture buffer
> >> - *
> >> - * This helper function returns true if the current capture buffer
> >> should
> >> - * be released to vb2. This is the case if the output buffer specified
> >> that - * the capture buffer should be held (i.e. not returned to vb2)
> >> AND if the - * timestamp of the capture buffer differs from the output
> >> buffer timestamp. - *
> >> - * This helper is to be called at the start of the device_run callback:
> >> - *
> >> - * .. code-block:: c
> >> - *
> >> - *	if (v4l2_m2m_release_capture_buf(out_vb, cap_vb)) {
> >> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> >> - *		cap_vb = v4l2_m2m_next_dst_buf(m2m_ctx);
> >> - *	}
> >> - *	cap_vb->is_held = out_vb->flags &
> > 
> > V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
> > 
> >> - *
> >> - *	...
> >> - *
> >> - *	v4l2_m2m_buf_done(out_vb, VB2_BUF_STATE_DONE);
> >> - *	if (!cap_vb->is_held) {
> >> - *		v4l2_m2m_dst_buf_remove(m2m_ctx);
> >> - *		v4l2_m2m_buf_done(cap_vb, VB2_BUF_STATE_DONE);
> >> - *	}
> >> - *
> >> - * This allows for multiple output buffers to be used to fill in a
> >> single
> >> - * capture buffer. This is typically used by stateless decoders where
> >> - * multiple e.g. H.264 slices contribute to a single decoded frame.
> >> - */
> >> -static inline bool v4l2_m2m_release_capture_buf(const struct
> >> vb2_v4l2_buffer *out_vb, -
> >> 
> > 	const struct vb2_v4l2_buffer *cap_vb)
> > 	
> >> -{
> >> -	return cap_vb->is_held && cap_vb->vb2_buf.copied_timestamp &&
> >> -	       out_vb->vb2_buf.timestamp != cap_vb->vb2_buf.timestamp;
> >> -}
> >> +struct vb2_v4l2_buffer *v4l2_m2m_release_capture_buf(struct v4l2_m2m_ctx
> >> *m2m_ctx);
> >> 
> >>  /* v4l2 request helper */





^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-10-09 14:44 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-29 20:00 [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Jernej Skrabec
2019-09-29 20:00 ` [PATCH v2 1/6] vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF Jernej Skrabec
2019-09-29 20:00 ` [PATCH v2 2/6] videodev2.h: add V4L2_DEC_CMD_FLUSH Jernej Skrabec
2019-09-29 20:00 ` [PATCH v2 3/6] media: v4l2-mem2mem: add stateless_(try_)decoder_cmd ioctl helpers Jernej Skrabec
2019-10-04  9:21   ` Hans Verkuil
2019-10-07  6:02     ` Jernej Škrabec
2019-10-07  8:32       ` Hans Verkuil
2019-10-07  9:44         ` Hans Verkuil
2019-09-29 20:00 ` [PATCH v2 4/6] media: cedrus: Detect first slice of a frame Jernej Skrabec
2019-09-29 20:00 ` [PATCH v2 5/6] media: cedrus: h264: Support multiple slices per frame Jernej Skrabec
2019-09-29 20:00 ` [PATCH v2 6/6] media: cedrus: Add support for holding capture buffer Jernej Skrabec
2019-09-30  8:14   ` Hans Verkuil
2019-09-30 16:44     ` Jernej Škrabec
2019-09-30  8:10 ` [PATCH v2 0/6] media: cedrus: h264: Support multi-slice frames Hans Verkuil
2019-09-30 22:27   ` Jernej Škrabec
2019-09-30 22:43     ` Hans Verkuil
2019-09-30 22:58       ` Jernej Škrabec
2019-10-01  5:33       ` Jernej Škrabec
2019-10-07 10:44 ` Hans Verkuil
2019-10-07 19:01   ` Jernej Škrabec
2019-10-09 10:18     ` Hans Verkuil
2019-10-09 14:44       ` Jernej Škrabec

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).