linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 00/11] media: hantro: Add support for H264 decoding
@ 2019-08-12 19:35 Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 01/11] lib/sort.c: implement sort() variant taking context argument Ezequiel Garcia
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

A new version of H264 uAPI review and Hantro G1 H264 decoding support.
Compared to previous version, this version takes care of the
suggestions from Hans Verkuil clarifying the uAPI spec.

This series consolidates the two recent H264 series submitted
by Boris [1] [2]. Some patches from [2] have been merged (namely,
helpers for the Hantro driver), and so I'm adding the remanining
bits required to support H264 on Hantro G1 VPU.

* Patch 1 adds support for the sort_r() variant and has
  been posted separately by Rasmus. It would be good to merge this patch
  via the media tree, ideally as soon as possible, to avoid the
  synchronisation burden that might appear if we decide to delay it.

* Patch 2 to 4 extends the H264 uAPI, introducing frame-based vs slice-based
  decoding granularity, and also support for different NALU start codes.
  Currently, Annex B and no start codes are the supported options.

  With the introduction of the start code control, the H264 parsed
  slices pixel format should be renamed, dropping the _RAW suffix,
  which is now meaningless.

* Patch 5 removes the P0/B0/B1 ref lists from the decode_params control.
  These lists are no longer needed since we build them on the
  kernel side based on the DPB.

* Patch 6 and 7 exposes the proper decoding mode and start code
  on the cedrus driver. The driver functionality is not changed,
  and only the Cedrus support is now being properly exposed to
  userspace.

* Patch 8 is needed to properly propagate the OUTPUT buffer timestamp to
  the CAPTURE buffer one, which is required for intra-frame references.

* Patches 9 to 11 adds H264 support for Hantro G1 and then enable
  H264 decoding on RK3288.

This is based on media master and tested on Rockchip RK3288 for Hantro and
Allwinner H3 boards for Cedrus. Philipp Zabel tested on i.MX8MQ EVK using [3].

The Ffmpeg branch used to test is based on the great work of Jonas and Boris,
and is available in [4]. Instructions to build and run are as follows:

./configure --enable-v4l2-request --enable-libdrm
make -j4

(test via framebuffer rendering):

./ffmpeg -loglevel debug -hwaccel drm -hwaccel_device /dev/dri/card0 -i $some_file.avi -pix_fmt bgra -f fbdev /dev/fb0

[1] https://www.mail-archive.com/linux-media@vger.kernel.org/msg148299.html
[2] https://lkml.org/lkml/2019/6/19/379
[3] git://git.pengutronix.de/git/pza/linux.git hantro/imx8m-wip
[4] https://gitlab.collabora.com/ezequiel/ffmpeg/tree/stateless-mpeg2-vp8-h264-v4

Boris Brezillon (3):
  media: uapi: h264: Add the concept of decoding mode
  media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  media: hantro: Move copy_metadata() before doing a decode operation

Ezequiel Garcia (4):
  media: uapi: h264: Rename pixel format
  media: uapi: h264: Add the concept of start code
  media: cedrus: Cleanup control initialization
  media: cedrus: Specify H264 startcode and decoding mode

Hertz Wong (3):
  media: hantro: Add core bits to support H264 decoding
  media: hantro: Add support for H264 decoding on G1
  media: hantro: Enable H264 decoding on rk3288

Rasmus Villemoes (1):
  lib/sort.c: implement sort() variant taking context argument

 .../media/uapi/v4l/ext-ctrls-codec.rst        |  89 ++-
 .../media/uapi/v4l/pixfmt-compressed.rst      |   8 +-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  18 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |   2 +-
 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |   9 +-
 drivers/staging/media/hantro/hantro_drv.c     |  50 +-
 .../staging/media/hantro/hantro_g1_h264_dec.c | 292 ++++++++
 drivers/staging/media/hantro/hantro_h264.c    | 651 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  56 ++
 drivers/staging/media/hantro/hantro_v4l2.c    |  10 +
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |  21 +-
 drivers/staging/media/sunxi/cedrus/cedrus.c   |  65 +-
 drivers/staging/media/sunxi/cedrus/cedrus.h   |   3 +-
 .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_video.c |   6 +-
 include/linux/sort.h                          |   5 +
 include/media/h264-ctrls.h                    |  22 +-
 lib/sort.c                                    |  34 +-
 19 files changed, 1290 insertions(+), 55 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_g1_h264_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_h264.c

-- 
2.22.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 01/11] lib/sort.c: implement sort() variant taking context argument
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 02/11] media: uapi: h264: Rename pixel format Ezequiel Garcia
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Rasmus Villemoes, Andrew Morton

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Our list_sort() utility has always supported a context argument that
is passed through to the comparison routine. Now there's a use case
for the similar thing for sort().

This implements sort_r by simply extending the existing sort function
in the obvious way. To avoid code duplication, we want to implement
sort() in terms of sort_r(). The naive way to do that is

static int cmp_wrapper(const void *a, const void *b, const void *ctx)
{
  int (*real_cmp)(const void*, const void*) = ctx;
  return real_cmp(a, b);
}

sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }

but this would do two indirect calls for each comparison. Instead, do
as is done for the default swap functions - that only adds a cost of a
single easily predicted branch to each comparison call.

Aside from introducing support for the context argument, this also
serves as preparation for patches that will eliminate the indirect
comparison calls in common cases.

Requested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
 include/linux/sort.h |  5 +++++
 lib/sort.c           | 34 ++++++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/include/linux/sort.h b/include/linux/sort.h
index 2b99a5dd073d..61b96d0ebc44 100644
--- a/include/linux/sort.h
+++ b/include/linux/sort.h
@@ -4,6 +4,11 @@
 
 #include <linux/types.h>
 
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp)(const void *, const void *, const void *),
+	    void (*swap)(void *, void *, int),
+	    const void *priv);
+
 void sort(void *base, size_t num, size_t size,
 	  int (*cmp)(const void *, const void *),
 	  void (*swap)(void *, void *, int));
diff --git a/lib/sort.c b/lib/sort.c
index cf408aec3733..d54cf97e9548 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
 		swap_func(a, b, (int)size);
 }
 
+typedef int (*cmp_func_t)(const void *, const void *);
+typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
+#define _CMP_WRAPPER ((cmp_r_func_t)0L)
+
+static int do_cmp(const void *a, const void *b,
+		  cmp_r_func_t cmp, const void *priv)
+{
+	if (cmp == _CMP_WRAPPER)
+		return ((cmp_func_t)(priv))(a, b);
+	return cmp(a, b, priv);
+}
+
 /**
  * parent - given the offset of the child, find the offset of the parent.
  * @i: the offset of the heap element whose parent is sought.  Non-zero.
@@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
 }
 
 /**
- * sort - sort an array of elements
+ * sort_r - sort an array of elements
  * @base: pointer to data to sort
  * @num: number of elements
  * @size: size of each element
  * @cmp_func: pointer to comparison function
  * @swap_func: pointer to swap function or NULL
+ * @priv: third argument passed to comparison function
  *
  * This function does a heapsort on the given array.  You may provide
  * a swap_func function if you need to do something more than a memory
@@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
  * O(n*n) worst-case behavior and extra memory requirements that make
  * it less suitable for kernel use.
  */
-void sort(void *base, size_t num, size_t size,
-	  int (*cmp_func)(const void *, const void *),
-	  void (*swap_func)(void *, void *, int size))
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp_func)(const void *, const void *, const void *),
+	    void (*swap_func)(void *, void *, int size),
+	    const void *priv)
 {
 	/* pre-scale counters for performance */
 	size_t n = num * size, a = (num/2) * size;
@@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
 		 * average, 3/4 worst-case.)
 		 */
 		for (b = a; c = 2*b + size, (d = c + size) < n;)
-			b = cmp_func(base + c, base + d) >= 0 ? c : d;
+			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
 		if (d == n)	/* Special case last leaf with no sibling */
 			b = c;
 
 		/* Now backtrack from "b" to the correct location for "a" */
-		while (b != a && cmp_func(base + a, base + b) >= 0)
+		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
 			b = parent(b, lsbit, size);
 		c = b;			/* Where "a" belongs */
 		while (b != a) {	/* Shift it into place */
@@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
 		}
 	}
 }
+EXPORT_SYMBOL(sort_r);
+
+void sort(void *base, size_t num, size_t size,
+	  int (*cmp_func)(const void *, const void *),
+	  void (*swap_func)(void *, void *, int size))
+{
+	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
+}
 EXPORT_SYMBOL(sort);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 02/11] media: uapi: h264: Rename pixel format
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 01/11] lib/sort.c: implement sort() variant taking context argument Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode Ezequiel Garcia
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

The V4L2_PIX_FMT_H264_SLICE_RAW name was originally suggested
because the pixel format would represent H264 slices without any
start code.

However, as we will now introduce a start code menu control,
give the pixel format a more meaningful name, while it's
still early enough to do so.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* None.
Changes in v4:
* New patch.
---
 Documentation/media/uapi/v4l/pixfmt-compressed.rst | 4 ++--
 drivers/media/v4l2-core/v4l2-ioctl.c               | 2 +-
 drivers/staging/media/sunxi/cedrus/cedrus_dec.c    | 2 +-
 drivers/staging/media/sunxi/cedrus/cedrus_video.c  | 6 +++---
 include/media/h264-ctrls.h                         | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/media/uapi/v4l/pixfmt-compressed.rst b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
index f52a7b67023d..9b65473a2288 100644
--- a/Documentation/media/uapi/v4l/pixfmt-compressed.rst
+++ b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
@@ -52,9 +52,9 @@ Compressed Formats
       - ``V4L2_PIX_FMT_H264_MVC``
       - 'M264'
       - H264 MVC video elementary stream.
-    * .. _V4L2-PIX-FMT-H264-SLICE-RAW:
+    * .. _V4L2-PIX-FMT-H264-SLICE:
 
-      - ``V4L2_PIX_FMT_H264_SLICE_RAW``
+      - ``V4L2_PIX_FMT_H264_SLICE``
       - 'S264'
       - H264 parsed slice data, without the start code and as
 	extracted from the H264 bitstream.  This format is adapted for
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index bb5b4926538a..39f10621c91b 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1343,7 +1343,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
 		case V4L2_PIX_FMT_H264:		descr = "H.264"; break;
 		case V4L2_PIX_FMT_H264_NO_SC:	descr = "H.264 (No Start Codes)"; break;
 		case V4L2_PIX_FMT_H264_MVC:	descr = "H.264 MVC"; break;
-		case V4L2_PIX_FMT_H264_SLICE_RAW:	descr = "H.264 Parsed Slice Data"; break;
+		case V4L2_PIX_FMT_H264_SLICE:	descr = "H.264 Parsed Slice Data"; break;
 		case V4L2_PIX_FMT_H263:		descr = "H.263"; break;
 		case V4L2_PIX_FMT_MPEG1:	descr = "MPEG-1 ES"; break;
 		case V4L2_PIX_FMT_MPEG2:	descr = "MPEG-2 ES"; break;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index bdad87eb9d79..56ca4c9ad01c 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -46,7 +46,7 @@ void cedrus_device_run(void *priv)
 			V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION);
 		break;
 
-	case V4L2_PIX_FMT_H264_SLICE_RAW:
+	case V4L2_PIX_FMT_H264_SLICE:
 		run.h264.decode_params = cedrus_find_control_data(ctx,
 			V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS);
 		run.h264.pps = cedrus_find_control_data(ctx,
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
index e2b530b1a956..06c695615703 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c
@@ -38,7 +38,7 @@ static struct cedrus_format cedrus_formats[] = {
 		.directions	= CEDRUS_DECODE_SRC,
 	},
 	{
-		.pixelformat	= V4L2_PIX_FMT_H264_SLICE_RAW,
+		.pixelformat	= V4L2_PIX_FMT_H264_SLICE,
 		.directions	= CEDRUS_DECODE_SRC,
 	},
 	{
@@ -104,7 +104,7 @@ static void cedrus_prepare_format(struct v4l2_pix_format *pix_fmt)
 
 	switch (pix_fmt->pixelformat) {
 	case V4L2_PIX_FMT_MPEG2_SLICE:
-	case V4L2_PIX_FMT_H264_SLICE_RAW:
+	case V4L2_PIX_FMT_H264_SLICE:
 		/* Zero bytes per line for encoded source. */
 		bytesperline = 0;
 
@@ -469,7 +469,7 @@ static int cedrus_start_streaming(struct vb2_queue *vq, unsigned int count)
 		ctx->current_codec = CEDRUS_CODEC_MPEG2;
 		break;
 
-	case V4L2_PIX_FMT_H264_SLICE_RAW:
+	case V4L2_PIX_FMT_H264_SLICE:
 		ctx->current_codec = CEDRUS_CODEC_H264;
 		break;
 
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index e1404d78d6ff..6160a69c0143 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -14,7 +14,7 @@
 #include <linux/videodev2.h>
 
 /* Our pixel format isn't stable at the moment */
-#define V4L2_PIX_FMT_H264_SLICE_RAW v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices */
+#define V4L2_PIX_FMT_H264_SLICE v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices */
 
 /*
  * This is put insanely high to avoid conflicting with controls that
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 01/11] lib/sort.c: implement sort() variant taking context argument Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 02/11] media: uapi: h264: Rename pixel format Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-13  6:46   ` Hans Verkuil
  2019-08-14 12:23   ` Paul Kocialkowski
  2019-08-12 19:35 ` [PATCH v5 04/11] media: uapi: h264: Add the concept of start code Ezequiel Garcia
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel

From: Boris Brezillon <boris.brezillon@collabora.com>

Some stateless decoders don't support per-slice decoding granularity
(or at least not in a way that would make them efficient or easy to use).

Expose a menu to control the supported decoding modes. Drivers are
allowed to support only one decoding but they can support both too.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* Improve specification as suggested by Hans.
Changes in v4:
* Typos/rewording fixes
Changes in v3:
* s/per-{slice,frame} decoding/{slice,frame}-based decoding/
* Add Paul's R-b
Changes in v2:
* Allow decoding multiple slices in per-slice decoding mode
* Minor doc improvement/fixes
---
 .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
 .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
 include/media/h264-ctrls.h                    | 11 +++++
 4 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index c5f39dd50043..568390273fde 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -1747,6 +1747,11 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - __u32
       - ``size``
       -
+    * - __u32
+      - ``start_byte_offset``
+      - Where the slice payload starts in the output buffer. Useful when the
+        OUTPUT buffer contains more than one slice (some codecs need to know
+        where each slice starts in this buffer).
     * - __u32
       - ``header_bit_size``
       -
@@ -1930,7 +1935,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
       -
     * - __u16
       - ``num_slices``
-      - Number of slices needed to decode the current frame
+      - Number of slices needed to decode the current frame/field. When
+        operating in slice-based decoding mode (see
+        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
+        should always be set to one.
     * - __u16
       - ``nal_ref_idc``
       - NAL reference ID value coming from the NAL Unit header
@@ -2021,6 +2029,43 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
       - 0x00000004
       - The DPB entry is a long term reference frame
 
+``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
+    Specifies the decoding mode to use. Currently exposes slice-based and
+    frame-based decoding but new modes might be added later on.
+    This control is used to complement V4L2_PIX_FMT_H264_SLICE
+    pixel format. Applications that support V4L2_PIX_FMT_H264_SLICE
+    are required to set this control in order to specify the decoding mode
+    that is expected for the buffer.
+    Drivers may expose a single or multiple decoding modes, depending
+    on what they can support.
+
+    .. note::
+
+       This menu control is not yet part of the public kernel API and
+       it is expected to change.
+
+.. c:type:: v4l2_mpeg_video_h264_decoding_mode
+
+.. cssclass:: longtable
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
+      - 0
+      - The decoding is done at the slice granularity.
+        v4l2_ctrl_h264_decode_params->num_slices should be set to 1.
+        The OUTPUT buffer must contain a single slice.
+    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
+      - 1
+      - The decoding is done at the frame granularity.
+        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
+        slices forming a frame.
+        The OUTPUT buffer must contain all slices needed to decode the
+        frame. The OUTPUT buffer must also contain both fields.
+
 .. _v4l2-mpeg-mpeg2:
 
 ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
diff --git a/Documentation/media/uapi/v4l/pixfmt-compressed.rst b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
index 9b65473a2288..50557c85d99d 100644
--- a/Documentation/media/uapi/v4l/pixfmt-compressed.rst
+++ b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
@@ -61,7 +61,8 @@ Compressed Formats
 	stateless video decoders that implement an H264 pipeline
 	(using the :ref:`mem2mem` and :ref:`media-request-api`).
 	Metadata associated with the frame to decode are required to
-	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
+	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE``,
+        ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
 	``V4L2_CID_MPEG_VIDEO_H264_PPS``,
 	``V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX``,
 	``V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS`` and
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index cd1ae016706f..c3194299bfac 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -402,6 +402,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		"Explicit",
 		NULL,
 	};
+	static const char * const h264_decoding_mode[] = {
+		"Slice-Based",
+		"Frame-Based",
+		NULL,
+	};
 	static const char * const mpeg_mpeg2_level[] = {
 		"Low",
 		"Main",
@@ -633,6 +638,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		return h264_fp_arrangement_type;
 	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
 		return h264_fmo_map_type;
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
+		return h264_decoding_mode;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 		return mpeg_mpeg2_level;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
@@ -852,6 +859,7 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
 	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
 	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
 	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
@@ -1220,6 +1228,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
 	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
 	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
 	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index 6160a69c0143..e6c510877f67 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -26,6 +26,7 @@
 #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
 #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
 #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
+#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
 
 /* enum v4l2_ctrl_type type values */
 #define V4L2_CTRL_TYPE_H264_SPS			0x0110
@@ -33,6 +34,12 @@
 #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
 #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
 #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
+#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
+
+enum v4l2_mpeg_video_h264_decoding_mode {
+	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
+	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
+};
 
 #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
 #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
@@ -125,6 +132,10 @@ struct v4l2_h264_pred_weight_table {
 struct v4l2_ctrl_h264_slice_params {
 	/* Size in bytes, including header */
 	__u32 size;
+
+	/* Offset in bytes to the start of slice in the OUTPUT buffer. */
+	__u32 start_byte_offset;
+
 	/* Offset in bits to slice_data() from the beginning of this slice. */
 	__u32 header_bit_size;
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 04/11] media: uapi: h264: Add the concept of start code
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (2 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-14  8:11   ` Hans Verkuil
  2019-08-12 19:35 ` [PATCH v5 05/11] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Ezequiel Garcia
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

Stateless decoders have different expectations about the
start code that is prepended on H264 slices. Add a
menu control to express the supported start code types
(including no start code).

Drivers are allowed to support only one start code type,
but they can support both too.

Note that this is independent of the H264 decoding mode,
which specifies the granularity of the decoding operations.
Either in frame-based or slice-based mode, this new control
will allow to define the start code expected on H264 slices.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* Improve specification as suggested by Hans.
Changes in v4:
* New patch.
---
 .../media/uapi/v4l/ext-ctrls-codec.rst        | 33 +++++++++++++++++++
 .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  9 +++++
 include/media/h264-ctrls.h                    |  6 ++++
 4 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index 568390273fde..ec0078e299a5 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -2066,6 +2066,39 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
         The OUTPUT buffer must contain all slices needed to decode the
         frame. The OUTPUT buffer must also contain both fields.
 
+``V4L2_CID_MPEG_VIDEO_H264_STARTCODE (enum)``
+    Specifies the H264 slice start code expected for each slice.
+    This control is used to complement V4L2_PIX_FMT_H264_SLICE
+    pixel format. Applications that support V4L2_PIX_FMT_H264_SLICE
+    are required to set this control in order to specify the start code
+    that is expected for the buffer.
+    Drivers may expose a single or multiple start codes, depending
+    on what they can support.
+
+    .. note::
+
+       This menu control is not yet part of the public kernel API and
+       it is expected to change.
+
+.. c:type:: v4l2_mpeg_video_h264_startcode
+
+.. cssclass:: longtable
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_MPEG_VIDEO_H264_NO_STARTCODE``
+      - 0
+      - Selecting this value specifies that H264 slices are passed
+        to the driver without any start code.
+    * - ``V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE``
+      - 1
+      - Selecting this value specifies that H264 slices are expected
+        to be prefixed by Annex B start codes. According to :ref:`h264`
+        valid start codes can be 3-bytes 0x000001, or 4-bytes 0x00000001.
+
 .. _v4l2-mpeg-mpeg2:
 
 ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
diff --git a/Documentation/media/uapi/v4l/pixfmt-compressed.rst b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
index 50557c85d99d..5030d326f0db 100644
--- a/Documentation/media/uapi/v4l/pixfmt-compressed.rst
+++ b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
@@ -61,7 +61,8 @@ Compressed Formats
 	stateless video decoders that implement an H264 pipeline
 	(using the :ref:`mem2mem` and :ref:`media-request-api`).
 	Metadata associated with the frame to decode are required to
-	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE``,
+	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_STARTCODE``,
+        ``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE``,
         ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
 	``V4L2_CID_MPEG_VIDEO_H264_PPS``,
 	``V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX``,
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index c3194299bfac..01fddb4ec640 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -407,6 +407,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		"Frame-Based",
 		NULL,
 	};
+	static const char * const h264_start_code[] = {
+		"No Start Code",
+		"Annex B Start Code",
+		NULL,
+	};
 	static const char * const mpeg_mpeg2_level[] = {
 		"Low",
 		"Main",
@@ -640,6 +645,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		return h264_fmo_map_type;
 	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
 		return h264_decoding_mode;
+	case V4L2_CID_MPEG_VIDEO_H264_STARTCODE:
+		return h264_start_code;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 		return mpeg_mpeg2_level;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
@@ -860,6 +867,7 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
 	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
 	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
+	case V4L2_CID_MPEG_VIDEO_H264_STARTCODE:		return "H264 Start Code";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
 	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
@@ -1229,6 +1237,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
 	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
 	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
+	case V4L2_CID_MPEG_VIDEO_H264_STARTCODE:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
 	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index e6c510877f67..31555c99f64a 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -27,6 +27,7 @@
 #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
 #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
 #define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
+#define V4L2_CID_MPEG_VIDEO_H264_STARTCODE	(V4L2_CID_MPEG_BASE+1006)
 
 /* enum v4l2_ctrl_type type values */
 #define V4L2_CTRL_TYPE_H264_SPS			0x0110
@@ -41,6 +42,11 @@ enum v4l2_mpeg_video_h264_decoding_mode {
 	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
 };
 
+enum v4l2_mpeg_video_h264_start_code {
+	V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
+	V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,
+};
+
 #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
 #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
 #define V4L2_H264_SPS_CONSTRAINT_SET2_FLAG			0x04
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 05/11] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (3 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 04/11] media: uapi: h264: Add the concept of start code Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 06/11] media: cedrus: Cleanup control initialization Ezequiel Garcia
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

From: Boris Brezillon <boris.brezillon@collabora.com>

Those lists can be extracted from the dpb, let's simplify userspace
life and build that list kernel-side (generic helpers will be provided
for drivers that need this list).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* New patch.
Changes in v4:
* Add R-b
* Remove the reflist from the struct
Changes in v3:
* None
Changes in v2:
* None
---
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
 include/media/h264-ctrls.h                       | 3 ---
 2 files changed, 12 deletions(-)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index ec0078e299a5..63796b39da43 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -1942,15 +1942,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - __u16
       - ``nal_ref_idc``
       - NAL reference ID value coming from the NAL Unit header
-    * - __u8
-      - ``ref_pic_list_p0[32]``
-      - Backward reference list used by P-frames in the original bitstream order
-    * - __u8
-      - ``ref_pic_list_b0[32]``
-      - Backward reference list used by B-frames in the original bitstream order
-    * - __u8
-      - ``ref_pic_list_b1[32]``
-      - Forward reference list used by B-frames in the original bitstream order
     * - __s32
       - ``top_field_order_cnt``
       - Picture Order Count for the coded top field
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index 31555c99f64a..2de438393025 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -203,9 +203,6 @@ struct v4l2_ctrl_h264_decode_params {
 	struct v4l2_h264_dpb_entry dpb[16];
 	__u16 num_slices;
 	__u16 nal_ref_idc;
-	__u8 ref_pic_list_p0[32];
-	__u8 ref_pic_list_b0[32];
-	__u8 ref_pic_list_b1[32];
 	__s32 top_field_order_cnt;
 	__s32 bottom_field_order_cnt;
 	__u32 flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 06/11] media: cedrus: Cleanup control initialization
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (4 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 05/11] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 07/11] media: cedrus: Specify H264 startcode and decoding mode Ezequiel Garcia
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

In order to introduce other controls, the control initialization
needs to support an initial struct v4l2_ctrl_control.

While here, let's cleanup the control initialization,
removing unneeded fields.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
Changes in v5:
* None.
Changes in v4:
* New patch.
---
 drivers/staging/media/sunxi/cedrus/cedrus.c | 45 +++++++++++----------
 drivers/staging/media/sunxi/cedrus/cedrus.h |  3 +-
 2 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.c b/drivers/staging/media/sunxi/cedrus/cedrus.c
index 370937edfc14..7bdc413bf727 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.c
@@ -29,44 +29,51 @@
 
 static const struct cedrus_control cedrus_controls[] = {
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
-		.elem_size	= sizeof(struct v4l2_ctrl_mpeg2_slice_params),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
+		},
 		.codec		= CEDRUS_CODEC_MPEG2,
 		.required	= true,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
-		.elem_size	= sizeof(struct v4l2_ctrl_mpeg2_quantization),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
+		},
 		.codec		= CEDRUS_CODEC_MPEG2,
 		.required	= false,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS,
-		.elem_size	= sizeof(struct v4l2_ctrl_h264_decode_params),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS,
+		},
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS,
-		.elem_size	= sizeof(struct v4l2_ctrl_h264_slice_params),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS,
+		},
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_H264_SPS,
-		.elem_size	= sizeof(struct v4l2_ctrl_h264_sps),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_SPS,
+		},
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_H264_PPS,
-		.elem_size	= sizeof(struct v4l2_ctrl_h264_pps),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_PPS,
+		},
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
 	{
-		.id		= V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX,
-		.elem_size	= sizeof(struct v4l2_ctrl_h264_scaling_matrix),
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX,
+		},
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
@@ -106,12 +113,8 @@ static int cedrus_init_ctrls(struct cedrus_dev *dev, struct cedrus_ctx *ctx)
 		return -ENOMEM;
 
 	for (i = 0; i < CEDRUS_CONTROLS_COUNT; i++) {
-		struct v4l2_ctrl_config cfg = {};
-
-		cfg.elem_size = cedrus_controls[i].elem_size;
-		cfg.id = cedrus_controls[i].id;
-
-		ctrl = v4l2_ctrl_new_custom(hdl, &cfg, NULL);
+		ctrl = v4l2_ctrl_new_custom(hdl, &cedrus_controls[i].cfg,
+					    NULL);
 		if (hdl->error) {
 			v4l2_err(&dev->v4l2_dev,
 				 "Failed to create new custom control\n");
@@ -178,7 +181,7 @@ static int cedrus_request_validate(struct media_request *req)
 			continue;
 
 		ctrl_test = v4l2_ctrl_request_hdl_ctrl_find(hdl,
-							    cedrus_controls[i].id);
+							    cedrus_controls[i].cfg.id);
 		if (!ctrl_test) {
 			v4l2_info(&ctx->dev->v4l2_dev,
 				  "Missing required codec control\n");
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.h b/drivers/staging/media/sunxi/cedrus/cedrus.h
index 3f476d0fd981..283827656872 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.h
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.h
@@ -49,8 +49,7 @@ enum cedrus_h264_pic_type {
 };
 
 struct cedrus_control {
-	u32			id;
-	u32			elem_size;
+	struct v4l2_ctrl_config cfg;
 	enum cedrus_codec	codec;
 	unsigned char		required:1;
 };
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 07/11] media: cedrus: Specify H264 startcode and decoding mode
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (5 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 06/11] media: cedrus: Cleanup control initialization Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 08/11] media: hantro: Move copy_metadata() before doing a decode operation Ezequiel Garcia
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Ezequiel Garcia

The cedrus VPU is slice-based and expects V4L2_PIX_FMT_H264_SLICE
buffers to contain H264 slices with no start code.

Expose this to userspace with the newly added menu control.

These two controls are specified as mandatory for applications,
but we mark them as non-required on the driver side for
backwards compatibility.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
---
Changes in v5:
* Clarify commit log.
Changes in v4:
* New patch.
---
 drivers/staging/media/sunxi/cedrus/cedrus.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.c b/drivers/staging/media/sunxi/cedrus/cedrus.c
index 7bdc413bf727..10be2bd9507e 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.c
@@ -77,6 +77,26 @@ static const struct cedrus_control cedrus_controls[] = {
 		.codec		= CEDRUS_CODEC_H264,
 		.required	= true,
 	},
+	{
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE,
+			.max	= V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
+			.def	= V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
+			.menu_skip_mask = BIT(V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING),
+		},
+		.codec		= CEDRUS_CODEC_H264,
+		.required	= false,
+	},
+	{
+		.cfg = {
+			.id	= V4L2_CID_MPEG_VIDEO_H264_STARTCODE,
+			.max	= V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
+			.def	= V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
+			.menu_skip_mask = BIT(V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE),
+		},
+		.codec		= CEDRUS_CODEC_H264,
+		.required	= false,
+	},
 };
 
 #define CEDRUS_CONTROLS_COUNT	ARRAY_SIZE(cedrus_controls)
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 08/11] media: hantro: Move copy_metadata() before doing a decode operation
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (6 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 07/11] media: cedrus: Specify H264 startcode and decoding mode Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 09/11] media: hantro: Add core bits to support H264 decoding Ezequiel Garcia
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel

From: Boris Brezillon <boris.brezillon@collabora.com>

Some decoders use intra slice/frame references. The capture buffer
pointed by these references might be new and thus have invalid
timestamp which prevents the decoder logic from retrieving the
vb2_buffer object based on the output buf timestamp.
Copy all metadata (including the timestamp) before starting the decode
operation.

Suggested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* None.
Changes in v4:
* None.
---
 drivers/staging/media/hantro/hantro_drv.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 4af6ee80229e..6e2351e46750 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -111,8 +111,6 @@ static void hantro_job_finish(struct hantro_dev *vpu,
 	src->sequence = ctx->sequence_out++;
 	dst->sequence = ctx->sequence_cap++;
 
-	v4l2_m2m_buf_copy_metadata(src, dst, true);
-
 	ret = ctx->buf_finish(ctx, &dst->vb2_buf, bytesused);
 	if (ret)
 		result = VB2_BUF_STATE_ERROR;
@@ -178,8 +176,12 @@ void hantro_finish_run(struct hantro_ctx *ctx)
 static void device_run(void *priv)
 {
 	struct hantro_ctx *ctx = priv;
+	struct vb2_v4l2_buffer *src, *dst;
 	int ret;
 
+	src = hantro_get_src_buf(ctx);
+	dst = hantro_get_dst_buf(ctx);
+
 	ret = clk_bulk_enable(ctx->dev->variant->num_clocks, ctx->dev->clocks);
 	if (ret)
 		goto err_cancel_job;
@@ -187,6 +189,8 @@ static void device_run(void *priv)
 	if (ret < 0)
 		goto err_cancel_job;
 
+	v4l2_m2m_buf_copy_metadata(src, dst, true);
+
 	ctx->codec_ops->run(ctx);
 	return;
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 09/11] media: hantro: Add core bits to support H264 decoding
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (7 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 08/11] media: hantro: Move copy_metadata() before doing a decode operation Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 10/11] media: hantro: Add support for H264 decoding on G1 Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 11/11] media: hantro: Enable H264 decoding on rk3288 Ezequiel Garcia
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Hertz Wong

From: Hertz Wong <hertz.wong@rock-chips.com>

Add helpers and patch hantro_{drv,v4l2}.c to prepare addition of H264
decoding support.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes in v5:
* None.
Changes in v4:
* Rework extra_size0, exposing the size via TRY_FMT/S_FMT
  to allow buffer importation, as suggested by Tomasz.
* Drop max slice limit.
* Use a ternary operator instead of substracting POCs,
  to avoid an overflow as pointed out by Rasmus.
* Specify annex B slice start code.
* Add missing extra_size0 to sizeimage
* Swap the first 2 entries of list B1 when B0 and B1 match (mandated by
  the spec)
* Move the update_dpb() call before the prepare_table() one to make the
  POCs stored in the private table match the content of the new DPB
---
 drivers/staging/media/hantro/Makefile      |   1 +
 drivers/staging/media/hantro/hantro.h      |   9 +-
 drivers/staging/media/hantro/hantro_drv.c  |  42 ++
 drivers/staging/media/hantro/hantro_h264.c | 651 +++++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h   |  55 ++
 drivers/staging/media/hantro/hantro_v4l2.c |  10 +
 6 files changed, 767 insertions(+), 1 deletion(-)
 create mode 100644 drivers/staging/media/hantro/hantro_h264.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index f5ec597d9e08..0f0d3afb1cca 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -10,6 +10,7 @@ hantro-vpu-y += \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		rk3399_vpu_hw_vp8_dec.o \
 		hantro_jpeg.o \
+		hantro_h264.o \
 		hantro_mpeg2.o \
 		hantro_vp8.o
 
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index c4c86c32ea2d..f670bbde4159 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -30,6 +30,10 @@
 #define VP8_MB_WIDTH(w)			DIV_ROUND_UP(w, VP8_MB_DIM)
 #define VP8_MB_HEIGHT(h)		DIV_ROUND_UP(h, VP8_MB_DIM)
 
+#define H264_MB_DIM			16
+#define H264_MB_WIDTH(w)		DIV_ROUND_UP(w, H264_MB_DIM)
+#define H264_MB_HEIGHT(h)		DIV_ROUND_UP(h, H264_MB_DIM)
+
 #define MPEG2_MB_DIM			16
 #define MPEG2_MB_WIDTH(w)		DIV_ROUND_UP(w, MPEG2_MB_DIM)
 #define MPEG2_MB_HEIGHT(h)		DIV_ROUND_UP(h, MPEG2_MB_DIM)
@@ -43,9 +47,9 @@ struct hantro_codec_ops;
 
 #define HANTRO_JPEG_ENCODER	BIT(0)
 #define HANTRO_ENCODERS		0x0000ffff
-
 #define HANTRO_MPEG2_DECODER	BIT(16)
 #define HANTRO_VP8_DECODER	BIT(17)
+#define HANTRO_H264_DECODER	BIT(18)
 #define HANTRO_DECODERS		0xffff0000
 
 /**
@@ -102,12 +106,14 @@ struct hantro_variant {
  * enum hantro_codec_mode - codec operating mode.
  * @HANTRO_MODE_NONE:  No operating mode. Used for RAW video formats.
  * @HANTRO_MODE_JPEG_ENC: JPEG encoder.
+ * @HANTRO_MODE_H264_DEC: H264 decoder.
  * @HANTRO_MODE_MPEG2_DEC: MPEG-2 decoder.
  * @HANTRO_MODE_VP8_DEC: VP8 decoder.
  */
 enum hantro_codec_mode {
 	HANTRO_MODE_NONE = -1,
 	HANTRO_MODE_JPEG_ENC,
+	HANTRO_MODE_H264_DEC,
 	HANTRO_MODE_MPEG2_DEC,
 	HANTRO_MODE_VP8_DEC,
 };
@@ -246,6 +252,7 @@ struct hantro_ctx {
 
 	/* Specific for particular codec modes. */
 	union {
+		struct hantro_h264_dec_hw_ctx h264_dec;
 		struct hantro_jpeg_enc_hw_ctx jpeg_enc;
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 6e2351e46750..40055af4d862 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -314,6 +314,48 @@ static const struct hantro_ctrl controls[] = {
 		.cfg = {
 			.id = V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER,
 		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SPS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_PPS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE,
+			.max = V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
+			.def = V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
+			.menu_skip_mask = BIT(V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING),
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_STARTCODE,
+			.max = V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,
+			.def = V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,
+			.menu_skip_mask = BIT(V4L2_MPEG_VIDEO_H264_NO_STARTCODE),
+		},
+	}, {
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
new file mode 100644
index 000000000000..cc51cc0f61ad
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -0,0 +1,651 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip RK3288 VPU codec driver
+ *
+ * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
+ *	Hertz Wong <hertz.wong@rock-chips.com>
+ *	Herman Chen <herman.chen@rock-chips.com>
+ *
+ * Copyright (C) 2014 Google, Inc.
+ *	Tomasz Figa <tfiga@chromium.org>
+ */
+
+#include <linux/types.h>
+#include <linux/sort.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+
+/* Size with u32 units. */
+#define CABAC_INIT_BUFFER_SIZE		(460 * 2)
+#define POC_BUFFER_SIZE			34
+#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
+
+#define POC_CMP(p0, p1) ((p0) < (p1) ? -1 : 1)
+
+/* Data structure describing auxiliary buffer format. */
+struct hantro_h264_dec_priv_tbl {
+	u32 cabac_table[CABAC_INIT_BUFFER_SIZE];
+	u32 poc[POC_BUFFER_SIZE];
+	u8 scaling_list[SCALING_LIST_SIZE];
+};
+
+/* Constant CABAC table. */
+static const u32 h264_cabac_table[] = {
+	0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07330000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948, 0x0d29033e, 0x000b0137,
+	0x0045ef7f, 0xf3660052, 0xf94aeb6b, 0xe57fe17f, 0xe87fee5f, 0xe57feb72,
+	0xe27fef7b, 0xf473f07a, 0xf573f43f, 0xfe44f154, 0xf368fd46, 0xf85df65a,
+	0xe27fff4a, 0xfa61f95b, 0xec7ffc38, 0xfb52f94c, 0xea7df95d, 0xf557fd4d,
+	0xfb47fc3f, 0xfc44f454, 0xf93ef941, 0x083d0538, 0xfe420140, 0x003dfe4e,
+	0x01320734, 0x0a23002c, 0x0b26012d, 0x002e052c, 0x1f110133, 0x07321c13,
+	0x10210e3e, 0xf36cf164, 0xf365f35b, 0xf45ef658, 0xf054f656, 0xf953f357,
+	0xed5e0146, 0x0048fb4a, 0x123bf866, 0xf164005f, 0xfc4b0248, 0xf54bfd47,
+	0x0f2ef345, 0x003e0041, 0x1525f148, 0x09391036, 0x003e0c48, 0x18000f09,
+	0x08190d12, 0x0f090d13, 0x0a250c12, 0x061d1421, 0x0f1e042d, 0x013a003e,
+	0x073d0c26, 0x0b2d0f27, 0x0b2a0d2c, 0x102d0c29, 0x0a311e22, 0x122a0a37,
+	0x1133112e, 0x00591aed, 0x16ef1aef, 0x1ee71cec, 0x21e925e5, 0x21e928e4,
+	0x26ef21f5, 0x28f129fa, 0x26012911, 0x1efa1b03, 0x1a1625f0, 0x23fc26f8,
+	0x26fd2503, 0x26052a00, 0x23102716, 0x0e301b25, 0x153c0c44, 0x0261fd47,
+	0xfa2afb32, 0xfd36fe3e, 0x003a013f, 0xfe48ff4a, 0xf75bfb43, 0xfb1bfd27,
+	0xfe2c002e, 0xf040f844, 0xf64efa4d, 0xf656f45c, 0xf137f63c, 0xfa3efc41,
+	0xf449f84c, 0xf950f758, 0xef6ef561, 0xec54f54f, 0xfa49fc4a, 0xf356f360,
+	0xf561ed75, 0xf84efb21, 0xfc30fe35, 0xfd3ef347, 0xf64ff456, 0xf35af261,
+	0x0000fa5d, 0xfa54f84f, 0x0042ff47, 0x003efe3c, 0xfe3bfb4b, 0xfd3efc3a,
+	0xf742ff4f, 0x00470344, 0x0a2cf93e, 0x0f240e28, 0x101b0c1d, 0x012c1424,
+	0x1220052a, 0x01300a3e, 0x112e0940, 0xf468f561, 0xf060f958, 0xf855f955,
+	0xf755f358, 0x0442fd4d, 0xfd4cfa4c, 0x0a3aff4c, 0xff53f963, 0xf25f025f,
+	0x004cfb4a, 0x0046f54b, 0x01440041, 0xf249033e, 0x043eff44, 0xf34b0b37,
+	0x05400c46, 0x0f060613, 0x07100c0e, 0x120d0d0b, 0x0d0f0f10, 0x0c170d17,
+	0x0f140e1a, 0x0e2c1128, 0x112f1811, 0x15151916, 0x1f1b161d, 0x13230e32,
+	0x0a39073f, 0xfe4dfc52, 0xfd5e0945, 0xf46d24dd, 0x24de20e6, 0x25e22ce0,
+	0x22ee22f1, 0x28f121f9, 0x23fb2100, 0x2602210d, 0x17230d3a, 0x1dfd1a00,
+	0x161e1ff9, 0x23f122fd, 0x220324ff, 0x2205200b, 0x2305220c, 0x270b1e1d,
+	0x221a1d27, 0x13421f15, 0x1f1f1932, 0xef78ec70, 0xee72f555, 0xf15cf259,
+	0xe647f151, 0xf2500044, 0xf246e838, 0xe944e832, 0xf54a17f3, 0x1af328f1,
+	0x31f22c03, 0x2d062c22, 0x21361352, 0xfd4bff17, 0x0122012b, 0x0036fe37,
+	0x003d0140, 0x0044f75c, 0xf26af361, 0xf15af45a, 0xee58f649, 0xf74ff256,
+	0xf649f646, 0xf645fb42, 0xf740fb3a, 0x023b15f6, 0x18f51cf8, 0x1cff1d03,
+	0x1d092314, 0x1d240e43, 0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968,
+	0xfa35ff36, 0x07331721, 0x17021500, 0x01090031, 0xdb760539, 0xf34ef541,
+	0x013e0c31, 0xfc491132, 0x1240092b, 0x1d001a43, 0x105a0968, 0xd27fec68,
+	0x0143f34e, 0xf541013e, 0xfa56ef5f, 0xfa3d092d, 0xfd45fa51, 0xf5600637,
+	0x0743fb56, 0x0258003a, 0xfd4cf65e, 0x05360445, 0xfd510058, 0xf943fb4a,
+	0xfc4afb50, 0xf948013a, 0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948,
+	0x0d29033e, 0x002dfc4e, 0xfd60e57e, 0xe462e765, 0xe943e452, 0xec5ef053,
+	0xea6eeb5b, 0xee66f35d, 0xe37ff95c, 0xfb59f960, 0xf36cfd2e, 0xff41ff39,
+	0xf75dfd4a, 0xf75cf857, 0xe97e0536, 0x063c063b, 0x0645ff30, 0x0044fc45,
+	0xf858fe55, 0xfa4eff4b, 0xf94d0236, 0x0532fd44, 0x0132062a, 0xfc51013f,
+	0xfc460043, 0x0239fe4c, 0x0b230440, 0x013d0b23, 0x12190c18, 0x0d1d0d24,
+	0xf65df949, 0xfe490d2e, 0x0931f964, 0x09350235, 0x0535fe3d, 0x00380038,
+	0xf33ffb3c, 0xff3e0439, 0xfa450439, 0x0e270433, 0x0d440340, 0x013d093f,
+	0x07321027, 0x052c0434, 0x0b30fb3c, 0xff3b003b, 0x1621052c, 0x0e2bff4e,
+	0x003c0945, 0x0b1c0228, 0x032c0031, 0x002e022c, 0x0233002f, 0x0427023e,
+	0x062e0036, 0x0336023a, 0x043f0633, 0x06390735, 0x06340637, 0x0b2d0e24,
+	0x0835ff52, 0x0737fd4e, 0x0f2e161f, 0xff541907, 0x1ef91c03, 0x1c042000,
+	0x22ff1e06, 0x1e062009, 0x1f131a1b, 0x1a1e2514, 0x1c221146, 0x0143053b,
+	0x0943101e, 0x12201223, 0x161d181f, 0x1726122b, 0x14290b3f, 0x093b0940,
+	0xff5efe59, 0xf76cfa4c, 0xfe2c002d, 0x0034fd40, 0xfe3bfc46, 0xfc4bf852,
+	0xef66f74d, 0x0318002a, 0x00300037, 0xfa3bf947, 0xf453f557, 0xe277013a,
+	0xfd1dff24, 0x0126022b, 0xfa37003a, 0x0040fd4a, 0xf65a0046, 0xfc1d051f,
+	0x072a013b, 0xfe3afd48, 0xfd51f561, 0x003a0805, 0x0a0e0e12, 0x0d1b0228,
+	0x003afd46, 0xfa4ff855, 0x0000f36a, 0xf06af657, 0xeb72ee6e, 0xf262ea6e,
+	0xeb6aee67, 0xeb6be96c, 0xe670f660, 0xf45ffb5b, 0xf75dea5e, 0xfb560943,
+	0xfc50f655, 0xff46073c, 0x093a053d, 0x0c320f32, 0x12311136, 0x0a29072e,
+	0xff330731, 0x08340929, 0x062f0237, 0x0d290a2c, 0x06320535, 0x0d31043f,
+	0x0640fe45, 0xfe3b0646, 0x0a2c091f, 0x0c2b0335, 0x0e220a26, 0xfd340d28,
+	0x1120072c, 0x07260d32, 0x0a391a2b, 0x0e0b0b0e, 0x090b120b, 0x150917fe,
+	0x20f120f1, 0x22eb27e9, 0x2adf29e1, 0x2ee426f4, 0x151d2de8, 0x35d330e6,
+	0x41d52bed, 0x27f61e09, 0x121a141b, 0x0039f252, 0xfb4bed61, 0xdd7d1b00,
+	0x1c001ffc, 0x1b062208, 0x1e0a1816, 0x21131620, 0x1a1f1529, 0x1a2c172f,
+	0x10410e47, 0x083c063f, 0x11411518, 0x17141a17, 0x1b201c17, 0x1c181728,
+	0x18201c1d, 0x172a1339, 0x1635163d, 0x0b560c28, 0x0b330e3b, 0xfc4ff947,
+	0xfb45f746, 0xf842f644, 0xed49f445, 0xf046f143, 0xec3eed46, 0xf042ea41,
+	0xec3f09fe, 0x1af721f7, 0x27f929fe, 0x2d033109, 0x2d1b243b, 0xfa42f923,
+	0xf92af82d, 0xfb30f438, 0xfa3cfb3e, 0xf842f84c, 0xfb55fa51, 0xf64df951,
+	0xef50ee49, 0xfc4af653, 0xf747f743, 0xff3df842, 0xf242003b, 0x023b15f3,
+	0x21f227f9, 0x2efe3302, 0x3c063d11, 0x37222a3e, 0x14f10236, 0x034a14f1,
+	0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07331619, 0x22001000, 0xfe090429,
+	0xe3760241, 0xfa47f34f, 0x05340932, 0xfd460a36, 0x1a221316, 0x28003902,
+	0x29241a45, 0xd37ff165, 0xfc4cfa47, 0xf34f0534, 0x0645f35a, 0x0034082b,
+	0xfe45fb52, 0xf660023b, 0x024bfd57, 0xfd640138, 0xfd4afa55, 0x003bfd51,
+	0xf956fb5f, 0xff42ff4d, 0x0146fe56, 0xfb48003d, 0x0029003f, 0x003f003f,
+	0xf7530456, 0x0061f948, 0x0d29033e, 0x0d0f0733, 0x0250d97f, 0xee5bef60,
+	0xe651dd62, 0xe866e961, 0xe577e863, 0xeb6eee66, 0xdc7f0050, 0xfb59f95e,
+	0xfc5c0027, 0x0041f154, 0xdd7ffe49, 0xf468f75b, 0xe17f0337, 0x07380737,
+	0x083dfd35, 0x0044f94a, 0xf758f367, 0xf35bf759, 0xf25cf84c, 0xf457e96e,
+	0xe869f64e, 0xec70ef63, 0xb27fba7f, 0xce7fd27f, 0xfc42fb4e, 0xfc47f848,
+	0x023bff37, 0xf946fa4b, 0xf859de77, 0xfd4b2014, 0x1e16d47f, 0x0036fb3d,
+	0x003aff3c, 0xfd3df843, 0xe754f24a, 0xfb410534, 0x0239003d, 0xf745f546,
+	0x1237fc47, 0x003a073d, 0x09291219, 0x0920052b, 0x092f002c, 0x0033022e,
+	0x1326fc42, 0x0f260c2a, 0x09220059, 0x042d0a1c, 0x0a1f21f5, 0x34d5120f,
+	0x1c0023ea, 0x26e72200, 0x27ee20f4, 0x66a20000, 0x38f121fc, 0x1d0a25fb,
+	0x33e327f7, 0x34de45c6, 0x43c12cfb, 0x200737e3, 0x20010000, 0x1b2421e7,
+	0x22e224e4, 0x26e426e5, 0x22ee23f0, 0x22f220f8, 0x25fa2300, 0x1e0a1c12,
+	0x1a191d29, 0x004b0248, 0x084d0e23, 0x121f1123, 0x151e112d, 0x142a122d,
+	0x1b1a1036, 0x07421038, 0x0b490a43, 0xf674e970, 0xf147f93d, 0x0035fb42,
+	0xf54df750, 0xf754f657, 0xde7feb65, 0xfd27fb35, 0xf93df54b, 0xf14def5b,
+	0xe76be76f, 0xe47af54c, 0xf62cf634, 0xf639f73a, 0xf048f945, 0xfc45fb4a,
+	0xf7560242, 0xf7220120, 0x0b1f0534, 0xfe37fe43, 0x0049f859, 0x03340704,
+	0x0a081108, 0x10130325, 0xff3dfb49, 0xff46fc4e, 0x0000eb7e, 0xe97cec6e,
+	0xe67ee77c, 0xef69e579, 0xe575ef66, 0xe675e574, 0xdf7af65f, 0xf264f85f,
+	0xef6fe472, 0xfa59fe50, 0xfc52f755, 0xf851ff48, 0x05400143, 0x09380045,
+	0x01450745, 0xf945fa43, 0xf04dfe40, 0x023dfa43, 0xfd400239, 0xfd41fd42,
+	0x003e0933, 0xff42fe47, 0xfe4bff46, 0xf7480e3c, 0x1025002f, 0x12230b25,
+	0x0c290a29, 0x02300c29, 0x0d29003b, 0x03321328, 0x03421232, 0x13fa12fa,
+	0x0e001af4, 0x1ff021e7, 0x21ea25e4, 0x27e22ae2, 0x2fd62ddc, 0x31de29ef,
+	0x200945b9, 0x3fc142c0, 0x4db636d9, 0x34dd29f6, 0x240028ff, 0x1e0e1c1a,
+	0x17250c37, 0x0b4125df, 0x27dc28db, 0x26e22edf, 0x2ae228e8, 0x31e326f4,
+	0x28f626fd, 0x2efb1f14, 0x1d1e192c, 0x0c300b31, 0x1a2d1616, 0x17161b15,
+	0x21141a1c, 0x1e181b22, 0x122a1927, 0x12320c46, 0x15360e47, 0x0b531920,
+	0x15311536, 0xfb55fa51, 0xf64df951, 0xef50ee49, 0xfc4af653, 0xf747f743,
+	0xff3df842, 0xf242003b, 0x023b11f6, 0x20f32af7, 0x31fb3500, 0x4003440a,
+	0x421b2f39, 0xfb470018, 0xff24fe2a, 0xfe34f739, 0xfa3ffc41, 0xfc43f952,
+	0xfd51fd4c, 0xf948fa4e, 0xf448f244, 0xfd46fa4c, 0xfb42fb3e, 0x0039fc3d,
+	0xf73c0136, 0x023a11f6, 0x20f32af7, 0x31fb3500, 0x4003440a, 0x421b2f39,
+	0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07331d10,
+	0x19000e00, 0xf633fd3e, 0xe5631a10, 0xfc55e866, 0x05390639, 0xef490e39,
+	0x1428140a, 0x1d003600, 0x252a0c61, 0xe07fea75, 0xfe4afc55, 0xe8660539,
+	0xfa5df258, 0xfa2c0437, 0xf559f167, 0xeb741339, 0x143a0454, 0x0660013f,
+	0xfb55f36a, 0x053f064b, 0xfd5aff65, 0x0337fc4f, 0xfe4bf461, 0xf932013c,
+	0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948, 0x0d29033e, 0x0722f758,
+	0xec7fdc7f, 0xef5bf25f, 0xe754e756, 0xf459ef5b, 0xe17ff24c, 0xee67f35a,
+	0xdb7f0b50, 0x054c0254, 0x054efa37, 0x043df253, 0xdb7ffb4f, 0xf568f55b,
+	0xe27f0041, 0xfe4f0048, 0xfc5cfa38, 0x0344f847, 0xf362fc56, 0xf458fb52,
+	0xfd48fc43, 0xf848f059, 0xf745ff3b, 0x05420439, 0xfc47fe47, 0x023aff4a,
+	0xfc2cff45, 0x003ef933, 0xfc2ffa2a, 0xfd29fa35, 0x084cf74e, 0xf5530934,
+	0x0043fb5a, 0x0143f148, 0xfb4bf850, 0xeb53eb40, 0xf31fe740, 0xe35e094b,
+	0x113ff84a, 0xfb23fe1b, 0x0d5b0341, 0xf945084d, 0xf642033e, 0xfd44ec51,
+	0x001e0107, 0xfd17eb4a, 0x1042e97c, 0x11252cee, 0x32deea7f, 0x0427002a,
+	0x07220b1d, 0x081f0625, 0x072a0328, 0x08210d2b, 0x0d24042f, 0x0337023a,
+	0x063c082c, 0x0b2c0e2a, 0x07300438, 0x04340d25, 0x0931133a, 0x0a300c2d,
+	0x00451421, 0x083f23ee, 0x21e71cfd, 0x180a1b00, 0x22f234d4, 0x27e81311,
+	0x1f19241d, 0x1821220f, 0x1e141649, 0x1422131f, 0x1b2c1310, 0x0f240f24,
+	0x151c1915, 0x1e141f0c, 0x1b10182a, 0x005d0e38, 0x0f391a26, 0xe87fe873,
+	0xea52f73e, 0x0035003b, 0xf255f359, 0xf35ef55c, 0xe37feb64, 0xf239f443,
+	0xf547f64d, 0xeb55f058, 0xe968f162, 0xdb7ff652, 0xf830f83d, 0xf842f946,
+	0xf24bf64f, 0xf753f45c, 0xee6cfc4f, 0xea45f04b, 0xfe3a013a, 0xf34ef753,
+	0xfc51f363, 0xf351fa26, 0xf33efa3a, 0xfe3bf049, 0xf64cf356, 0xf753f657,
+	0x0000ea7f, 0xe77fe778, 0xe57fed72, 0xe975e776, 0xe675e871, 0xe476e178,
+	0xdb7cf65e, 0xf166f663, 0xf36ace7f, 0xfb5c1139, 0xfb56f35e, 0xf45bfe4d,
+	0x0047ff49, 0x0440f951, 0x05400f39, 0x01430044, 0xf6430144, 0x004d0240,
+	0x0044fb4e, 0x0737053b, 0x02410e36, 0x0f2c053c, 0x0246fe4c, 0xee560c46,
+	0x0540f446, 0x0b370538, 0x00450241, 0xfa4a0536, 0x0736fa4c, 0xf552fe4d,
+	0xfe4d192a, 0x11f310f7, 0x11f41beb, 0x25e229d8, 0x2ad730d1, 0x27e02ed8,
+	0x34cd2ed7, 0x34d92bed, 0x200b3dc9, 0x38d23ece, 0x51bd2dec, 0x23fe1c0f,
+	0x22012701, 0x1e111426, 0x122d0f36, 0x004f24f0, 0x25f225ef, 0x2001220f,
+	0x1d0f1819, 0x22161f10, 0x23121f1c, 0x2129241c, 0x1b2f153e, 0x121f131a,
+	0x24181817, 0x1b10181e, 0x1f1d1629, 0x162a103c, 0x0f340e3c, 0x034ef07b,
+	0x15351638, 0x193d1521, 0x1332113d, 0xfd4ef84a, 0xf748f648, 0xee4bf447,
+	0xf53ffb46, 0xef4bf248, 0xf043f835, 0xf23bf734, 0xf54409fe, 0x1ef61ffc,
+	0x21ff2107, 0x1f0c2517, 0x1f261440, 0xf747f925, 0xf82cf531, 0xf638f43b,
+	0xf83ff743, 0xfa44f64f, 0xfd4ef84a, 0xf748f648, 0xee4bf447, 0xf53ffb46,
+	0xef4bf248, 0xf043f835, 0xf23bf734, 0xf54409fe, 0x1ef61ffc, 0x21ff2107,
+	0x1f0c2517, 0x1f261440
+};
+
+/*
+ * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
+ * to get the values in matrix order. In addition, the hardware requires bytes
+ * swapped within each subsequent 4 bytes. Both arrays below include both
+ * transformations.
+ */
+static const u32 zig_zag_4x4[] = {
+	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
+};
+
+static const u32 zig_zag_8x8[] = {
+	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
+	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
+	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
+	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
+};
+
+static void
+reorder_scaling_list(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
+	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
+	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
+	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
+	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
+	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
+	u8 *dst = tbl->scaling_list;
+	const u8 *src;
+	int i, j;
+
+	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
+	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
+	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
+		     num_list_4x4 * list_len_4x4 +
+		     num_list_8x8 * list_len_8x8);
+
+	src = &scaling->scaling_list_4x4[0][0];
+	for (i = 0; i < num_list_4x4; ++i) {
+		for (j = 0; j < list_len_4x4; ++j)
+			dst[zig_zag_4x4[j]] = src[j];
+		src += list_len_4x4;
+		dst += list_len_4x4;
+	}
+
+	src = &scaling->scaling_list_8x8[0][0];
+	for (i = 0; i < num_list_8x8; ++i) {
+		for (j = 0; j < list_len_8x8; ++j)
+			dst[zig_zag_8x8[j]] = src[j];
+		src += list_len_8x8;
+		dst += list_len_8x8;
+	}
+}
+
+static void prepare_table(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
+	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	int i;
+
+	/*
+	 * Prepare auxiliary buffer.
+	 *
+	 * TODO: The CABAC table never changes, but maybe it would be better
+	 * to have it as a control, which is set by userspace once?
+	 */
+	memcpy(tbl->cabac_table, h264_cabac_table, sizeof(tbl->cabac_table));
+
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
+		tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
+		tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
+	}
+
+	tbl->poc[32] = dec_param->top_field_order_cnt;
+	tbl->poc[33] = dec_param->bottom_field_order_cnt;
+
+	reorder_scaling_list(ctx);
+}
+
+struct hantro_h264_reflist_builder {
+	const struct v4l2_h264_dpb_entry *dpb;
+	s32 pocs[HANTRO_H264_DPB_SIZE];
+	u8 unordered_reflist[HANTRO_H264_DPB_SIZE];
+	s32 curpoc;
+	u8 num_valid;
+};
+
+static s32 get_poc(enum v4l2_field field, s32 top_field_order_cnt,
+		   s32 bottom_field_order_cnt)
+{
+	switch (field) {
+	case V4L2_FIELD_TOP:
+		return top_field_order_cnt;
+
+	case V4L2_FIELD_BOTTOM:
+		return bottom_field_order_cnt;
+
+	default:
+		break;
+	}
+
+	return min(top_field_order_cnt, bottom_field_order_cnt);
+}
+
+static void
+init_reflist_builder(struct hantro_ctx *ctx,
+		     struct hantro_h264_reflist_builder *b)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	struct vb2_v4l2_buffer *buf = hantro_get_dst_buf(ctx);
+	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
+	unsigned int i;
+
+	dec_param = ctx->h264_dec.ctrls.decode;
+
+	memset(b, 0, sizeof(*b));
+	b->dpb = dpb;
+	b->curpoc = get_poc(buf->field, dec_param->top_field_order_cnt,
+			    dec_param->bottom_field_order_cnt);
+
+	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
+		int buf_idx;
+
+		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		buf_idx = vb2_find_timestamp(cap_q, dpb[i].reference_ts, 0);
+		if (buf_idx < 0)
+			continue;
+
+		buf = to_vb2_v4l2_buffer(vb2_get_buffer(cap_q, buf_idx));
+		b->pocs[i] = get_poc(buf->field, dpb[i].top_field_order_cnt,
+				     dpb[i].bottom_field_order_cnt);
+		b->unordered_reflist[b->num_valid] = i;
+		b->num_valid++;
+	}
+
+	for (i = b->num_valid; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
+		b->unordered_reflist[i] = i;
+}
+
+static int p_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/*
+	 * Short term pics in descending pic num order, long term ones in
+	 * ascending order.
+	 */
+	if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+		return b->frame_num - a->frame_num;
+
+	return a->pic_num - b->pic_num;
+}
+
+static int b0_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	s32 poca, pocb;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/* Long term pics in ascending pic num order. */
+	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+		return a->pic_num - b->pic_num;
+
+	poca = builder->pocs[idxa];
+	pocb = builder->pocs[idxb];
+
+	/*
+	 * Short term pics with POC < cur POC first in POC descending order
+	 * followed by short term pics with POC > cur POC in POC ascending
+	 * order.
+	 */
+	if (poca < builder->curpoc != pocb < builder->curpoc)
+		return POC_CMP(poca, pocb);
+	else if (poca < builder->curpoc)
+		return POC_CMP(pocb, poca);
+
+	return POC_CMP(poca, pocb);
+}
+
+static int b1_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	s32 poca, pocb;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/* Long term pics in ascending pic num order. */
+	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+		return a->pic_num - b->pic_num;
+
+	poca = builder->pocs[idxa];
+	pocb = builder->pocs[idxb];
+
+	/*
+	 * Short term pics with POC > cur POC first in POC ascending order
+	 * followed by short term pics with POC > cur POC in POC descending
+	 * order.
+	 */
+	if (poca < builder->curpoc != pocb < builder->curpoc)
+		return POC_CMP(pocb, poca);
+	else if (poca < builder->curpoc)
+		return POC_CMP(pocb, poca);
+
+	return POC_CMP(poca, pocb);
+}
+
+static void
+build_p_ref_list(const struct hantro_h264_reflist_builder *builder,
+		 u8 *reflist)
+{
+	memcpy(reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(reflist, builder->num_valid, sizeof(*reflist),
+	       p_ref_list_cmp, NULL, builder);
+}
+
+static void
+build_b_ref_lists(const struct hantro_h264_reflist_builder *builder,
+		  u8 *b0_reflist, u8 *b1_reflist)
+{
+	memcpy(b0_reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(b0_reflist, builder->num_valid, sizeof(*b0_reflist),
+	       b0_ref_list_cmp, NULL, builder);
+
+	memcpy(b1_reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(b1_reflist, builder->num_valid, sizeof(*b1_reflist),
+	       b1_ref_list_cmp, NULL, builder);
+
+	if (builder->num_valid > 1 &&
+	    !memcmp(b1_reflist, b0_reflist, builder->num_valid))
+		swap(b1_reflist[0], b1_reflist[1]);
+}
+
+static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
+			    const struct v4l2_h264_dpb_entry *b)
+{
+	return a->top_field_order_cnt == b->top_field_order_cnt &&
+	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+}
+
+static void update_dpb(struct hantro_ctx *ctx)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	unsigned int i, j;
+
+	dec_param = ctx->h264_dec.ctrls.decode;
+
+	/* Disable all entries by default. */
+	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
+		ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+
+	/* Try to match new DPB entries with existing ones by their POCs. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		/*
+		 * To cut off some comparisons, iterate only on target DPB
+		 * entries which are not used yet.
+		 */
+		for_each_clear_bit(j, used, ARRAY_SIZE(ctx->h264_dec.dpb)) {
+			struct v4l2_h264_dpb_entry *cdpb;
+
+			cdpb = &ctx->h264_dec.dpb[j];
+			if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
+			    !dpb_entry_match(cdpb, ndpb))
+				continue;
+
+			*cdpb = *ndpb;
+			set_bit(j, used);
+			break;
+		}
+
+		if (j == ARRAY_SIZE(ctx->h264_dec.dpb))
+			set_bit(i, new);
+	}
+
+	/* For entries that could not be matched, use remaining free slots. */
+	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+		struct v4l2_h264_dpb_entry *cdpb;
+
+		/*
+		 * Both arrays are of the same sizes, so there is no way
+		 * we can end up with no space in target array, unless
+		 * something is buggy.
+		 */
+		j = find_first_zero_bit(used, ARRAY_SIZE(ctx->h264_dec.dpb));
+		if (WARN_ON(j >= ARRAY_SIZE(ctx->h264_dec.dpb)))
+			return;
+
+		cdpb = &ctx->h264_dec.dpb[j];
+		*cdpb = *ndpb;
+		set_bit(j, used);
+	}
+}
+
+struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
+					   unsigned int dpb_idx)
+{
+	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
+	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	struct vb2_buffer *buf;
+	int buf_idx = -1;
+
+	if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+		buf_idx = vb2_find_timestamp(cap_q,
+					     dpb[dpb_idx].reference_ts, 0);
+
+	if (buf_idx >= 0) {
+		buf = vb2_get_buffer(cap_q, buf_idx);
+	} else {
+		struct vb2_v4l2_buffer *dst_buf;
+
+		/*
+		 * If a DPB entry is unused or invalid, address of current
+		 * destination buffer is returned.
+		 */
+		dst_buf = hantro_get_dst_buf(ctx);
+		buf = &dst_buf->vb2_buf;
+	}
+
+	return buf;
+}
+
+int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
+{
+	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
+	struct hantro_h264_dec_ctrls *ctrls = &h264_ctx->ctrls;
+	struct hantro_h264_reflist_builder reflist_builder;
+
+	hantro_prepare_run(ctx);
+
+	ctrls->scaling =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX);
+	if (WARN_ON(!ctrls->scaling))
+		return -EINVAL;
+
+	ctrls->decode =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS);
+	if (WARN_ON(!ctrls->decode))
+		return -EINVAL;
+
+	ctrls->slices =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS);
+	if (WARN_ON(!ctrls->slices))
+		return -EINVAL;
+
+	ctrls->sps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_SPS);
+	if (WARN_ON(!ctrls->sps))
+		return -EINVAL;
+
+	ctrls->pps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_PPS);
+	if (WARN_ON(!ctrls->pps))
+		return -EINVAL;
+
+	/* Update the DPB with new refs. */
+	update_dpb(ctx);
+
+	/* Prepare data in memory. */
+	prepare_table(ctx);
+
+	/* Build the P/B{0,1} ref lists. */
+	init_reflist_builder(ctx, &reflist_builder);
+	build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
+	build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
+			  h264_ctx->reflists.b1);
+	return 0;
+}
+
+void hantro_h264_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_h264_dec_hw_ctx *h264_dec = &ctx->h264_dec;
+	struct hantro_aux_buf *priv = &h264_dec->priv;
+
+	dma_free_coherent(vpu->dev, priv->size, priv->cpu, priv->dma);
+}
+
+int hantro_h264_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_h264_dec_hw_ctx *h264_dec = &ctx->h264_dec;
+	struct hantro_aux_buf *priv = &h264_dec->priv;
+	struct hantro_h264_dec_priv_tbl *tbl;
+	struct v4l2_pix_format_mplane pix_mp;
+
+	priv->cpu = dma_alloc_coherent(vpu->dev, sizeof(*tbl), &priv->dma,
+				       GFP_KERNEL);
+	if (!priv->cpu)
+		return -ENOMEM;
+
+	priv->size = sizeof(*tbl);
+	tbl = priv->cpu;
+	memcpy(tbl->cabac_table, h264_cabac_table, sizeof(*tbl->cabac_table));
+
+	v4l2_fill_pixfmt_mp(&pix_mp, ctx->dst_fmt.pixelformat,
+			    ctx->dst_fmt.width, ctx->dst_fmt.height);
+	h264_dec->pic_size = pix_mp.plane_fmt[0].sizeimage;
+
+	return 0;
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 2b8029674a75..75f1ce45a21f 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -11,6 +11,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/v4l2-controls.h>
+#include <media/h264-ctrls.h>
 #include <media/mpeg2-ctrls.h>
 #include <media/vp8-ctrls.h>
 #include <media/videobuf2-core.h>
@@ -42,6 +43,54 @@ struct hantro_jpeg_enc_hw_ctx {
 	struct hantro_aux_buf bounce_buffer;
 };
 
+/* Max. number of entries in the DPB (HW limitation). */
+#define HANTRO_H264_DPB_SIZE		16
+
+/**
+ * struct hantro_h264_dec_ctrls
+ * @decode:	Decode params
+ * @scaling:	Scaling info
+ * @slice:	Slice params
+ * @sps:	SPS info
+ * @pps:	PPS info
+ */
+struct hantro_h264_dec_ctrls {
+	const struct v4l2_ctrl_h264_decode_params *decode;
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling;
+	const struct v4l2_ctrl_h264_slice_params *slices;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
+};
+
+/**
+ * struct hantro_h264_dec_reflists
+ * @p:		P reflist
+ * @b0:		B0 reflist
+ * @b1:		B1 reflist
+ */
+struct hantro_h264_dec_reflists {
+	u8 p[HANTRO_H264_DPB_SIZE];
+	u8 b0[HANTRO_H264_DPB_SIZE];
+	u8 b1[HANTRO_H264_DPB_SIZE];
+};
+
+/**
+ * struct hantro_h264_dec_hw_ctx
+ * @priv:	Private auxiliary buffer for hardware.
+ * @dpb:	DPB
+ * @reflists:	P/B0/B1 reflists
+ * @ctrls:	V4L2 controls attached to a run
+ * @pic_size:	Size in bytes of decoded picture, this is needed
+ *		to pass the location of motion vectors.
+ */
+struct hantro_h264_dec_hw_ctx {
+	struct hantro_aux_buf priv;
+	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
+	struct hantro_h264_dec_reflists reflists;
+	struct hantro_h264_dec_ctrls ctrls;
+	size_t pic_size;
+};
+
 /**
  * struct hantro_mpeg2_dec_hw_ctx
  * @qtable:		Quantization table
@@ -109,6 +158,12 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
 int hantro_jpeg_enc_init(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 
+struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
+					   unsigned int dpb_idx);
+int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
+int hantro_h264_dec_init(struct hantro_ctx *ctx);
+void hantro_h264_dec_exit(struct hantro_ctx *ctx);
+
 void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
 void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index cd4eaa256e8b..3dae52abb96c 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -239,6 +239,15 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f,
 		/* Fill remaining fields */
 		v4l2_fill_pixfmt_mp(pix_mp, fmt->fourcc, pix_mp->width,
 				    pix_mp->height);
+		/*
+		 * The H264 decoder needs extra space on the output buffers
+		 * to store motion vectors. This is needed for reference
+		 * frames.
+		 */
+		if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
+			pix_mp->plane_fmt[0].sizeimage +=
+				128 * DIV_ROUND_UP(pix_mp->width, 16) *
+				      DIV_ROUND_UP(pix_mp->height, 16);
 	} else if (!pix_mp->plane_fmt[0].sizeimage) {
 		/*
 		 * For coded formats the application can specify
@@ -345,6 +354,7 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
 		break;
 	case V4L2_PIX_FMT_MPEG2_SLICE:
 	case V4L2_PIX_FMT_VP8_FRAME:
+	case V4L2_PIX_FMT_H264_SLICE:
 		ctx->fh.m2m_ctx->out_q_ctx.q.requires_requests = true;
 		break;
 	default:
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 10/11] media: hantro: Add support for H264 decoding on G1
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (8 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 09/11] media: hantro: Add core bits to support H264 decoding Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  2019-08-12 19:35 ` [PATCH v5 11/11] media: hantro: Enable H264 decoding on rk3288 Ezequiel Garcia
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Hertz Wong

From: Hertz Wong <hertz.wong@rock-chips.com>

Add the G1 specific bits to support H264 decoding.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
---
Changes from v5:
* None.
Changes from v4:
* Update to use pic_size from the H264 private context struct.
* Remove unused variable.
---
 drivers/staging/media/hantro/Makefile         |   1 +
 .../staging/media/hantro/hantro_g1_h264_dec.c | 292 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |   1 +
 3 files changed, 294 insertions(+)
 create mode 100644 drivers/staging/media/hantro/hantro_g1_h264_dec.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 0f0d3afb1cca..5d6b0383d280 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -4,6 +4,7 @@ hantro-vpu-y += \
 		hantro_drv.o \
 		hantro_v4l2.o \
 		hantro_h1_jpeg_enc.o \
+		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
new file mode 100644
index 000000000000..7ab534936843
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -0,0 +1,292 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip RK3288 VPU codec driver
+ *
+ * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
+ *	Hertz Wong <hertz.wong@rock-chips.com>
+ *	Herman Chen <herman.chen@rock-chips.com>
+ *
+ * Copyright (C) 2014 Google, Inc.
+ *	Tomasz Figa <tfiga@chromium.org>
+ */
+
+#include <linux/types.h>
+#include <linux/sort.h>
+
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro_g1_regs.h"
+#include "hantro_hw.h"
+#include "hantro_v4l2.h"
+
+static void set_params(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
+	const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_h264_pps *pps = ctrls->pps;
+	struct vb2_v4l2_buffer *src_buf = hantro_get_src_buf(ctx);
+	struct hantro_dev *vpu = ctx->dev;
+	u32 reg;
+
+	/* Decoder control register 0. */
+	reg = G1_REG_DEC_CTRL0_DEC_AXI_WR_ID(0x0);
+	if (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)
+		reg |= G1_REG_DEC_CTRL0_SEQ_MBAFF_E;
+	reg |= G1_REG_DEC_CTRL0_PICORD_COUNT_E;
+	if (dec_param->nal_ref_idc)
+		reg |= G1_REG_DEC_CTRL0_WRITE_MVS_E;
+
+	if (!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY) &&
+	    (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD ||
+	     slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
+		reg |= G1_REG_DEC_CTRL0_PIC_INTERLACE_E;
+	if (slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)
+		reg |= G1_REG_DEC_CTRL0_PIC_FIELDMODE_E;
+	if (!(slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD))
+		reg |= G1_REG_DEC_CTRL0_PIC_TOPFIELD_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
+
+	/* Decoder control register 1. */
+	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
+	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
+	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
+
+	/* Decoder control register 2. */
+	reg = G1_REG_DEC_CTRL2_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
+	      G1_REG_DEC_CTRL2_CH_QP_OFFSET2(pps->second_chroma_qp_index_offset);
+
+	/* always use the matrix sent from userspace */
+	reg |= G1_REG_DEC_CTRL2_TYPE1_QUANT_E;
+
+	if (slices[0].flags &  V4L2_H264_SLICE_FLAG_FIELD_PIC)
+		reg |= G1_REG_DEC_CTRL2_FIELDPIC_FLAG_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL2);
+
+	/* Decoder control register 3. */
+	reg = G1_REG_DEC_CTRL3_START_CODE_E |
+	      G1_REG_DEC_CTRL3_INIT_QP(pps->pic_init_qp_minus26 + 26) |
+	      G1_REG_DEC_CTRL3_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL3);
+
+	/* Decoder control register 4. */
+	reg = G1_REG_DEC_CTRL4_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
+	      G1_REG_DEC_CTRL4_FRAMENUM(slices[0].frame_num) |
+	      G1_REG_DEC_CTRL4_WEIGHT_BIPR_IDC(pps->weighted_bipred_idc);
+	if (pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE)
+		reg |= G1_REG_DEC_CTRL4_CABAC_E;
+	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
+		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
+	if (sps->chroma_format_idc == 0)
+		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
+	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
+		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL4);
+
+	/* Decoder control register 5. */
+	reg = G1_REG_DEC_CTRL5_REFPIC_MK_LEN(slices[0].dec_ref_pic_marking_bit_size) |
+	      G1_REG_DEC_CTRL5_IDR_PIC_ID(slices[0].idr_pic_id);
+	if (pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED)
+		reg |= G1_REG_DEC_CTRL5_CONST_INTRA_E;
+	if (pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT)
+		reg |= G1_REG_DEC_CTRL5_FILT_CTRL_PRES;
+	if (pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT)
+		reg |= G1_REG_DEC_CTRL5_RDPIC_CNT_PRES;
+	if (pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE)
+		reg |= G1_REG_DEC_CTRL5_8X8TRANS_FLAG_E;
+	if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC)
+		reg |= G1_REG_DEC_CTRL5_IDR_PIC_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL5);
+
+	/* Decoder control register 6. */
+	reg = G1_REG_DEC_CTRL6_PPS_ID(slices[0].pic_parameter_set_id) |
+	      G1_REG_DEC_CTRL6_REFIDX0_ACTIVE(pps->num_ref_idx_l0_default_active_minus1 + 1) |
+	      G1_REG_DEC_CTRL6_REFIDX1_ACTIVE(pps->num_ref_idx_l1_default_active_minus1 + 1) |
+	      G1_REG_DEC_CTRL6_POC_LENGTH(slices[0].pic_order_cnt_bit_size);
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL6);
+
+	/* Error concealment register. */
+	vdpu_write_relaxed(vpu, 0, G1_REG_ERR_CONC);
+
+	/* Prediction filter tap register. */
+	vdpu_write_relaxed(vpu,
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_0(1) |
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_1(-5 & 0x3ff) |
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_2(20),
+			   G1_REG_PRED_FLT);
+
+	/* Reference picture buffer control register. */
+	vdpu_write_relaxed(vpu, 0, G1_REG_REF_BUF_CTRL);
+
+	/* Reference picture buffer control register 2. */
+	vdpu_write_relaxed(vpu, G1_REG_REF_BUF_CTRL2_APF_THRESHOLD(8),
+			   G1_REG_REF_BUF_CTRL2);
+}
+
+static void set_ref(struct hantro_ctx *ctx)
+{
+	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	const u8 *b0_reflist, *b1_reflist, *p_reflist;
+	struct hantro_dev *vpu = ctx->dev;
+	u32 dpb_longterm = 0;
+	u32 dpb_valid = 0;
+	int reg_num;
+	u32 reg;
+	int i;
+
+	/*
+	 * Set up bit maps of valid and long term DPBs.
+	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
+	 */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+			dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+	}
+	vdpu_write_relaxed(vpu, dpb_valid << 16, G1_REG_VALID_REF);
+	vdpu_write_relaxed(vpu, dpb_longterm << 16, G1_REG_LT_REF);
+
+	/*
+	 * Set up reference frame picture numbers.
+	 *
+	 * Each G1_REG_REF_PIC(x) register contains numbers of two
+	 * subsequential reference pictures.
+	 */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; i += 2) {
+		reg = 0;
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].pic_num);
+		else
+			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].frame_num);
+
+		if (dpb[i + 1].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].pic_num);
+		else
+			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].frame_num);
+
+		vdpu_write_relaxed(vpu, reg, G1_REG_REF_PIC(i / 2));
+	}
+
+	b0_reflist = ctx->h264_dec.reflists.b0;
+	b1_reflist = ctx->h264_dec.reflists.b1;
+	p_reflist = ctx->h264_dec.reflists.p;
+
+	/*
+	 * Each G1_REG_BD_REF_PIC(x) register contains three entries
+	 * of each forward and backward picture list.
+	 */
+	reg_num = 0;
+	for (i = 0; i < 15; i += 3) {
+		reg = G1_REG_BD_REF_PIC_BINIT_RLIST_F0(b0_reflist[i]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_F1(b0_reflist[i + 1]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_F2(b0_reflist[i + 2]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B0(b1_reflist[i]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B1(b1_reflist[i + 1]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B2(b1_reflist[i + 2]);
+		vdpu_write_relaxed(vpu, reg, G1_REG_BD_REF_PIC(reg_num++));
+	}
+
+	/*
+	 * G1_REG_BD_P_REF_PIC register contains last entries (index 15)
+	 * of forward and backward reference picture lists and first 4 entries
+	 * of P forward picture list.
+	 */
+	reg = G1_REG_BD_P_REF_PIC_BINIT_RLIST_F15(b0_reflist[15]) |
+	      G1_REG_BD_P_REF_PIC_BINIT_RLIST_B15(b1_reflist[15]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F0(p_reflist[0]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F1(p_reflist[1]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F2(p_reflist[2]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F3(p_reflist[3]);
+	vdpu_write_relaxed(vpu, reg, G1_REG_BD_P_REF_PIC);
+
+	/*
+	 * Each G1_REG_FWD_PIC(x) register contains six consecutive
+	 * entries of P forward picture list, starting from index 4.
+	 */
+	reg_num = 0;
+	for (i = 4; i < HANTRO_H264_DPB_SIZE; i += 6) {
+		reg = G1_REG_FWD_PIC_PINIT_RLIST_F0(p_reflist[i]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F1(p_reflist[i + 1]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F2(p_reflist[i + 2]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F3(p_reflist[i + 3]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F4(p_reflist[i + 4]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F5(p_reflist[i + 5]);
+		vdpu_write_relaxed(vpu, reg, G1_REG_FWD_PIC(reg_num++));
+	}
+
+	/* Set up addresses of DPB buffers. */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
+		struct vb2_buffer *buf =  hantro_h264_get_ref_buf(ctx, i);
+
+		vdpu_write_relaxed(vpu, vb2_dma_contig_plane_dma_addr(buf, 0),
+				   G1_REG_ADDR_REF(i));
+	}
+}
+
+static void set_buffers(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct hantro_dev *vpu = ctx->dev;
+	dma_addr_t src_dma, dst_dma;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/* Source (stream) buffer. */
+	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, src_dma, G1_REG_ADDR_STR);
+
+	/* Destination (decoded frame) buffer. */
+	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
+
+	/* Higher profiles require DMV buffer appended to reference frames. */
+	if (ctrls->sps->profile_idc > 66) {
+		size_t pic_size = ctx->h264_dec.pic_size;
+		size_t mv_offset = round_up(pic_size, 8);
+
+		if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+			mv_offset += 32 * H264_MB_WIDTH(ctx->dst_fmt.width);
+
+		vdpu_write_relaxed(vpu, dst_dma + mv_offset,
+				   G1_REG_ADDR_DIR_MV);
+	}
+
+	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
+	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
+}
+
+void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	/* Prepare the H264 decoder context. */
+	if (hantro_h264_dec_prepare_run(ctx))
+		return;
+
+	/* Configure hardware registers. */
+	set_params(ctx);
+	set_ref(ctx);
+	set_buffers(ctx);
+
+	hantro_finish_run(ctx);
+
+	/* Start decoding! */
+	vdpu_write_relaxed(vpu,
+			   G1_REG_CONFIG_DEC_AXI_RD_ID(0xffu) |
+			   G1_REG_CONFIG_DEC_TIMEOUT_E |
+			   G1_REG_CONFIG_DEC_OUT_ENDIAN |
+			   G1_REG_CONFIG_DEC_STRENDIAN_E |
+			   G1_REG_CONFIG_DEC_MAX_BURST(16) |
+			   G1_REG_CONFIG_DEC_OUTSWAP32_E |
+			   G1_REG_CONFIG_DEC_INSWAP32_E |
+			   G1_REG_CONFIG_DEC_STRSWAP32_E |
+			   G1_REG_CONFIG_DEC_CLK_GATE_E,
+			   G1_REG_CONFIG);
+	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 75f1ce45a21f..2fab655bf098 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -161,6 +161,7 @@ void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 					   unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
+void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 11/11] media: hantro: Enable H264 decoding on rk3288
  2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
                   ` (9 preceding siblings ...)
  2019-08-12 19:35 ` [PATCH v5 10/11] media: hantro: Add support for H264 decoding on G1 Ezequiel Garcia
@ 2019-08-12 19:35 ` Ezequiel Garcia
  10 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-12 19:35 UTC (permalink / raw)
  To: linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel,
	Hertz Wong

From: Hertz Wong <hertz.wong@rock-chips.com>

Now that the generic bits have been added, we can activate H264 decoding
on rk3288.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Changes in v5:
* None.
Changes in v4:
* None.
---
 drivers/staging/media/hantro/rk3288_vpu_hw.c | 21 +++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
index f1b573a006ae..6bfcc47d1e58 100644
--- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
@@ -61,6 +61,19 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
 		.fourcc = V4L2_PIX_FMT_NV12,
 		.codec_mode = HANTRO_MODE_NONE,
 	},
+	{
+		.fourcc = V4L2_PIX_FMT_H264_SLICE,
+		.codec_mode = HANTRO_MODE_H264_DEC,
+		.max_depth = 2,
+		.frmsize = {
+			.min_width = 48,
+			.max_width = 3840,
+			.step_width = H264_MB_DIM,
+			.min_height = 48,
+			.max_height = 2160,
+			.step_height = H264_MB_DIM,
+		},
+	},
 	{
 		.fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
 		.codec_mode = HANTRO_MODE_MPEG2_DEC,
@@ -162,6 +175,12 @@ static const struct hantro_codec_ops rk3288_vpu_codec_ops[] = {
 		.init = hantro_jpeg_enc_init,
 		.exit = hantro_jpeg_enc_exit,
 	},
+	[HANTRO_MODE_H264_DEC] = {
+		.run = hantro_g1_h264_dec_run,
+		.reset = rk3288_vpu_dec_reset,
+		.init = hantro_h264_dec_init,
+		.exit = hantro_h264_dec_exit,
+	},
 	[HANTRO_MODE_MPEG2_DEC] = {
 		.run = hantro_g1_mpeg2_dec_run,
 		.reset = rk3288_vpu_dec_reset,
@@ -197,7 +216,7 @@ const struct hantro_variant rk3288_vpu_variant = {
 	.dec_fmts = rk3288_vpu_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(rk3288_vpu_dec_fmts),
 	.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
-		 HANTRO_VP8_DECODER,
+		 HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
 	.codec_ops = rk3288_vpu_codec_ops,
 	.irqs = rk3288_irqs,
 	.num_irqs = ARRAY_SIZE(rk3288_irqs),
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode
  2019-08-12 19:35 ` [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode Ezequiel Garcia
@ 2019-08-13  6:46   ` Hans Verkuil
  2019-08-14 12:23   ` Paul Kocialkowski
  1 sibling, 0 replies; 18+ messages in thread
From: Hans Verkuil @ 2019-08-13  6:46 UTC (permalink / raw)
  To: Ezequiel Garcia, linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel

On 8/12/19 9:35 PM, Ezequiel Garcia wrote:
> From: Boris Brezillon <boris.brezillon@collabora.com>
> 
> Some stateless decoders don't support per-slice decoding granularity
> (or at least not in a way that would make them efficient or easy to use).
> 
> Expose a menu to control the supported decoding modes. Drivers are
> allowed to support only one decoding but they can support both too.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> ---
> Changes in v5:
> * Improve specification as suggested by Hans.
> Changes in v4:
> * Typos/rewording fixes
> Changes in v3:
> * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
> * Add Paul's R-b
> Changes in v2:
> * Allow decoding multiple slices in per-slice decoding mode
> * Minor doc improvement/fixes
> ---
>  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
>  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
>  include/media/h264-ctrls.h                    | 11 +++++
>  4 files changed, 68 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index c5f39dd50043..568390273fde 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1747,6 +1747,11 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - __u32
>        - ``size``
>        -
> +    * - __u32
> +      - ``start_byte_offset``
> +      - Where the slice payload starts in the output buffer. Useful when the
> +        OUTPUT buffer contains more than one slice (some codecs need to know
> +        where each slice starts in this buffer).
>      * - __u32
>        - ``header_bit_size``
>        -
> @@ -1930,7 +1935,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        -
>      * - __u16
>        - ``num_slices``
> -      - Number of slices needed to decode the current frame
> +      - Number of slices needed to decode the current frame/field. When
> +        operating in slice-based decoding mode (see
> +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
> +        should always be set to one.
>      * - __u16
>        - ``nal_ref_idc``
>        - NAL reference ID value coming from the NAL Unit header
> @@ -2021,6 +2029,43 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        - 0x00000004
>        - The DPB entry is a long term reference frame
>  
> +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
> +    Specifies the decoding mode to use. Currently exposes slice-based and
> +    frame-based decoding but new modes might be added later on.
> +    This control is used to complement V4L2_PIX_FMT_H264_SLICE
> +    pixel format. Applications that support V4L2_PIX_FMT_H264_SLICE
> +    are required to set this control in order to specify the decoding mode
> +    that is expected for the buffer.
> +    Drivers may expose a single or multiple decoding modes, depending
> +    on what they can support.
> +
> +    .. note::
> +
> +       This menu control is not yet part of the public kernel API and
> +       it is expected to change.
> +
> +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
> +
> +.. cssclass:: longtable
> +
> +.. flat-table::
> +    :header-rows:  0
> +    :stub-columns: 0
> +    :widths:       1 1 2
> +
> +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
> +      - 0
> +      - The decoding is done at the slice granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices should be set to 1.
> +        The OUTPUT buffer must contain a single slice.
> +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
> +      - 1
> +      - The decoding is done at the frame granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
> +        slices forming a frame.
> +        The OUTPUT buffer must contain all slices needed to decode the
> +        frame. The OUTPUT buffer must also contain both fields.
> +
>  .. _v4l2-mpeg-mpeg2:
>  
>  ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
> diff --git a/Documentation/media/uapi/v4l/pixfmt-compressed.rst b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> index 9b65473a2288..50557c85d99d 100644
> --- a/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> +++ b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> @@ -61,7 +61,8 @@ Compressed Formats
>  	stateless video decoders that implement an H264 pipeline
>  	(using the :ref:`mem2mem` and :ref:`media-request-api`).
>  	Metadata associated with the frame to decode are required to
> -	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
> +	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE``,

No, that's not right. H264_DECODING_MODE isn't metadata that has to be
passed via a request. It's a pixelformat modifier and has to be set only
once to describe the format that's going to be passed to the HW.

That should be made clear here. Same goes for the next patch.

Regards,

	Hans

> +        ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
>  	``V4L2_CID_MPEG_VIDEO_H264_PPS``,
>  	``V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX``,
>  	``V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS`` and
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> index cd1ae016706f..c3194299bfac 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> @@ -402,6 +402,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		"Explicit",
>  		NULL,
>  	};
> +	static const char * const h264_decoding_mode[] = {
> +		"Slice-Based",
> +		"Frame-Based",
> +		NULL,
> +	};
>  	static const char * const mpeg_mpeg2_level[] = {
>  		"Low",
>  		"Main",
> @@ -633,6 +638,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		return h264_fp_arrangement_type;
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
>  		return h264_fmo_map_type;
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
> +		return h264_decoding_mode;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  		return mpeg_mpeg2_level;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
> @@ -852,6 +859,7 @@ const char *v4l2_ctrl_get_name(u32 id)
>  	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
>  	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
>  	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
> @@ -1220,6 +1228,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
>  	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
>  	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index 6160a69c0143..e6c510877f67 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -26,6 +26,7 @@
>  #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
>  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
>  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> +#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
>  
>  /* enum v4l2_ctrl_type type values */
>  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> @@ -33,6 +34,12 @@
>  #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
>  #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
>  #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
> +#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
> +
> +enum v4l2_mpeg_video_h264_decoding_mode {
> +	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
> +	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> +};
>  
>  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
>  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
> @@ -125,6 +132,10 @@ struct v4l2_h264_pred_weight_table {
>  struct v4l2_ctrl_h264_slice_params {
>  	/* Size in bytes, including header */
>  	__u32 size;
> +
> +	/* Offset in bytes to the start of slice in the OUTPUT buffer. */
> +	__u32 start_byte_offset;
> +
>  	/* Offset in bits to slice_data() from the beginning of this slice. */
>  	__u32 header_bit_size;
>  
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 04/11] media: uapi: h264: Add the concept of start code
  2019-08-12 19:35 ` [PATCH v5 04/11] media: uapi: h264: Add the concept of start code Ezequiel Garcia
@ 2019-08-14  8:11   ` Hans Verkuil
  2019-08-14 11:49     ` Paul Kocialkowski
  0 siblings, 1 reply; 18+ messages in thread
From: Hans Verkuil @ 2019-08-14  8:11 UTC (permalink / raw)
  To: Ezequiel Garcia, linux-media
  Cc: kernel, Nicolas Dufresne, Tomasz Figa, linux-rockchip,
	Heiko Stuebner, Jonas Karlman, Philipp Zabel, Boris Brezillon,
	Paul Kocialkowski, Alexandre Courbot, fbuergisser, linux-kernel

On 8/12/19 9:35 PM, Ezequiel Garcia wrote:
> Stateless decoders have different expectations about the
> start code that is prepended on H264 slices. Add a
> menu control to express the supported start code types
> (including no start code).
> 
> Drivers are allowed to support only one start code type,
> but they can support both too.
> 
> Note that this is independent of the H264 decoding mode,
> which specifies the granularity of the decoding operations.
> Either in frame-based or slice-based mode, this new control
> will allow to define the start code expected on H264 slices.
> 
> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> ---
> Changes in v5:
> * Improve specification as suggested by Hans.
> Changes in v4:
> * New patch.
> ---
>  .../media/uapi/v4l/ext-ctrls-codec.rst        | 33 +++++++++++++++++++
>  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 +++++
>  include/media/h264-ctrls.h                    |  6 ++++
>  4 files changed, 50 insertions(+), 1 deletion(-)
> 

<snip>

> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index e6c510877f67..31555c99f64a 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -27,6 +27,7 @@
>  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
>  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
>  #define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
> +#define V4L2_CID_MPEG_VIDEO_H264_STARTCODE	(V4L2_CID_MPEG_BASE+1006)

I almost forgot: can you change this to _START_CODE? Since it is two words?

Thanks!

	Hans

>  
>  /* enum v4l2_ctrl_type type values */
>  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> @@ -41,6 +42,11 @@ enum v4l2_mpeg_video_h264_decoding_mode {
>  	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
>  };
>  
> +enum v4l2_mpeg_video_h264_start_code {
> +	V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
> +	V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,
> +};
> +
>  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
>  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
>  #define V4L2_H264_SPS_CONSTRAINT_SET2_FLAG			0x04
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 04/11] media: uapi: h264: Add the concept of start code
  2019-08-14  8:11   ` Hans Verkuil
@ 2019-08-14 11:49     ` Paul Kocialkowski
  2019-08-14 14:12       ` Ezequiel Garcia
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Kocialkowski @ 2019-08-14 11:49 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Ezequiel Garcia, linux-media, kernel, Nicolas Dufresne,
	Tomasz Figa, linux-rockchip, Heiko Stuebner, Jonas Karlman,
	Philipp Zabel, Boris Brezillon, Alexandre Courbot, fbuergisser,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2957 bytes --]

Hi,

On Wed 14 Aug 19, 10:11, Hans Verkuil wrote:
> On 8/12/19 9:35 PM, Ezequiel Garcia wrote:
> > Stateless decoders have different expectations about the
> > start code that is prepended on H264 slices. Add a
> > menu control to express the supported start code types
> > (including no start code).
> > 
> > Drivers are allowed to support only one start code type,
> > but they can support both too.
> > 
> > Note that this is independent of the H264 decoding mode,
> > which specifies the granularity of the decoding operations.
> > Either in frame-based or slice-based mode, this new control
> > will allow to define the start code expected on H264 slices.
> > 
> > Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> > Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> > ---
> > Changes in v5:
> > * Improve specification as suggested by Hans.
> > Changes in v4:
> > * New patch.
> > ---
> >  .../media/uapi/v4l/ext-ctrls-codec.rst        | 33 +++++++++++++++++++
> >  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
> >  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 +++++
> >  include/media/h264-ctrls.h                    |  6 ++++
> >  4 files changed, 50 insertions(+), 1 deletion(-)
> > 
> 
> <snip>
> 
> > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > index e6c510877f67..31555c99f64a 100644
> > --- a/include/media/h264-ctrls.h
> > +++ b/include/media/h264-ctrls.h
> > @@ -27,6 +27,7 @@
> >  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
> >  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> >  #define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
> > +#define V4L2_CID_MPEG_VIDEO_H264_STARTCODE	(V4L2_CID_MPEG_BASE+1006)
> 
> I almost forgot: can you change this to _START_CODE? Since it is two words?

Agreed, I like it better this way too.

> Thanks!
> 
> 	Hans
> 
> >  
> >  /* enum v4l2_ctrl_type type values */
> >  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> > @@ -41,6 +42,11 @@ enum v4l2_mpeg_video_h264_decoding_mode {
> >  	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> >  };
> >  
> > +enum v4l2_mpeg_video_h264_start_code {
> > +	V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
> > +	V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,

Could we apply the same START_CODE renaming here too?

I was also thinking that it would be slightly more readable put like this,
with START_CODE as a prefix since it's common to both options and the name of
the enum:

- V4L2_MPEG_VIDEO_H264_START_CODE_NONE
- V4L2_MPEG_VIDEO_H264_START_CODE_ANNEX_B

What do you think?

Cheers,

Paul

> > +};
> > +
> >  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
> >  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
> >  #define V4L2_H264_SPS_CONSTRAINT_SET2_FLAG			0x04
> > 
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode
  2019-08-12 19:35 ` [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode Ezequiel Garcia
  2019-08-13  6:46   ` Hans Verkuil
@ 2019-08-14 12:23   ` Paul Kocialkowski
  2019-08-14 14:08     ` Ezequiel Garcia
  1 sibling, 1 reply; 18+ messages in thread
From: Paul Kocialkowski @ 2019-08-14 12:23 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: linux-media, kernel, Nicolas Dufresne, Tomasz Figa,
	linux-rockchip, Heiko Stuebner, Jonas Karlman, Philipp Zabel,
	Boris Brezillon, Alexandre Courbot, fbuergisser, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 11379 bytes --]

Hi,

On Mon 12 Aug 19, 16:35, Ezequiel Garcia wrote:
> From: Boris Brezillon <boris.brezillon@collabora.com>
> 
> Some stateless decoders don't support per-slice decoding granularity
> (or at least not in a way that would make them efficient or easy to use).
> 
> Expose a menu to control the supported decoding modes. Drivers are
> allowed to support only one decoding but they can support both too.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> ---
> Changes in v5:
> * Improve specification as suggested by Hans.
> Changes in v4:
> * Typos/rewording fixes
> Changes in v3:
> * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
> * Add Paul's R-b
> Changes in v2:
> * Allow decoding multiple slices in per-slice decoding mode
> * Minor doc improvement/fixes
> ---
>  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
>  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
>  include/media/h264-ctrls.h                    | 11 +++++
>  4 files changed, 68 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index c5f39dd50043..568390273fde 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1747,6 +1747,11 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - __u32
>        - ``size``
>        -
> +    * - __u32
> +      - ``start_byte_offset``
> +      - Where the slice payload starts in the output buffer. Useful when the
> +        OUTPUT buffer contains more than one slice (some codecs need to know
> +        where each slice starts in this buffer).

I think there is a possibility for misunderstanding here: does the
"slice payload" include the start code when it (annex-b) is selected?

I'd suspect that the hardware needs to know where the start code begings rather
than where the NAL unit starts when there is a start code. So I'd suggest
specifying it as the offset to the beginning of the start code (if present) or
to the slice NAL unit start otherwise.

The clarification should probably also concern header_bit_size, to make it clear
where it is counted from. I think it makes sense to count it starting from
the start_byte_offset position.

>      * - __u32
>        - ``header_bit_size``
>        -
> @@ -1930,7 +1935,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        -
>      * - __u16
>        - ``num_slices``
> -      - Number of slices needed to decode the current frame
> +      - Number of slices needed to decode the current frame/field. When
> +        operating in slice-based decoding mode (see
> +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
> +        should always be set to one.

I am fine with the current proposal for now, but I believe that we should keep
discussing whether we want per-slice mode to strictly enforce having a
single-slice, outside of this series.

One reason for that would be to gather multiple slices (up to what amounts to
a frame) for efficient use of OUTPUT buffers and to avoid allocating a huge
number of small ones (which is certainly quite inefficient due to CMA
alignment).

Apparently, passing multiple slice_params controls is an issue so per-frame mode
currently has to rely on the hardware being able to parse the slice header on
its own. The issue exists in the V4L2 API, where we need to know in advance
the maximum number of slices we want to use to make the control an array and
copy all the controls with each request, even if only a single-one is used.

Maybe one way to solve that would be to use multiple requests with the same
OUTPUT buffer, one per slice, so that we only have to pass a single slice_params
control per-request. And the first slice of the frame would get the other
controls too, while the others wouldn't (assuming we can trust that controls
won't change in-between and I'm not sure how true that is). Worst case, all
the controls have to be attached with each request, which is maybe still better
than passing a big number of unused slice_params controls each time.

Again, I believe the current proposal is good to go for now as it allows the
hantro driver to get-in staging, which is the priority.

But the issue I'm discussing should certainly be fixed before unstaging the API.

>      * - __u16
>        - ``nal_ref_idc``
>        - NAL reference ID value coming from the NAL Unit header
> @@ -2021,6 +2029,43 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        - 0x00000004
>        - The DPB entry is a long term reference frame
>  
> +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
> +    Specifies the decoding mode to use. Currently exposes slice-based and
> +    frame-based decoding but new modes might be added later on.
> +    This control is used to complement V4L2_PIX_FMT_H264_SLICE
> +    pixel format. Applications that support V4L2_PIX_FMT_H264_SLICE
> +    are required to set this control in order to specify the decoding mode
> +    that is expected for the buffer.
> +    Drivers may expose a single or multiple decoding modes, depending
> +    on what they can support.
> +
> +    .. note::
> +
> +       This menu control is not yet part of the public kernel API and
> +       it is expected to change.
> +
> +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
> +
> +.. cssclass:: longtable
> +
> +.. flat-table::
> +    :header-rows:  0
> +    :stub-columns: 0
> +    :widths:       1 1 2
> +
> +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
> +      - 0
> +      - The decoding is done at the slice granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices should be set to 1.
> +        The OUTPUT buffer must contain a single slice.
> +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
> +      - 1
> +      - The decoding is done at the frame granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
> +        slices forming a frame.
> +        The OUTPUT buffer must contain all slices needed to decode the
> +        frame. The OUTPUT buffer must also contain both fields.
> +

Nitpick/suggestion here: could we use the name of the enum as a prefix for the
modes, like:
- V4L2_MPEG_VIDEO_H264_DECODING_MODE_PER_SLICE
- V4L2_MPEG_VIDEO_H264_DECODING_MODE_PER_FRAME

or
- V4L2_MPEG_VIDEO_H264_DECODING_MODE_SLICE_BASED
- V4L2_MPEG_VIDEO_H264_DECODING_MODE_FRAME_BASED

I personally find it more readable this way, since the same prefix is kept.

Cheers,

Paul

>  .. _v4l2-mpeg-mpeg2:
>  
>  ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
> diff --git a/Documentation/media/uapi/v4l/pixfmt-compressed.rst b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> index 9b65473a2288..50557c85d99d 100644
> --- a/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> +++ b/Documentation/media/uapi/v4l/pixfmt-compressed.rst
> @@ -61,7 +61,8 @@ Compressed Formats
>  	stateless video decoders that implement an H264 pipeline
>  	(using the :ref:`mem2mem` and :ref:`media-request-api`).
>  	Metadata associated with the frame to decode are required to
> -	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
> +	be passed through the ``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE``,
> +        ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
>  	``V4L2_CID_MPEG_VIDEO_H264_PPS``,
>  	``V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX``,
>  	``V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS`` and
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> index cd1ae016706f..c3194299bfac 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> @@ -402,6 +402,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		"Explicit",
>  		NULL,
>  	};
> +	static const char * const h264_decoding_mode[] = {
> +		"Slice-Based",
> +		"Frame-Based",
> +		NULL,
> +	};
>  	static const char * const mpeg_mpeg2_level[] = {
>  		"Low",
>  		"Main",
> @@ -633,6 +638,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		return h264_fp_arrangement_type;
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
>  		return h264_fmo_map_type;
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
> +		return h264_decoding_mode;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  		return mpeg_mpeg2_level;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
> @@ -852,6 +859,7 @@ const char *v4l2_ctrl_get_name(u32 id)
>  	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
>  	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
>  	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
> @@ -1220,6 +1228,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
>  	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
>  	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index 6160a69c0143..e6c510877f67 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -26,6 +26,7 @@
>  #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
>  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
>  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> +#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
>  
>  /* enum v4l2_ctrl_type type values */
>  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> @@ -33,6 +34,12 @@
>  #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
>  #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
>  #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
> +#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
> +
> +enum v4l2_mpeg_video_h264_decoding_mode {
> +	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
> +	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> +};
>  
>  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
>  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
> @@ -125,6 +132,10 @@ struct v4l2_h264_pred_weight_table {
>  struct v4l2_ctrl_h264_slice_params {
>  	/* Size in bytes, including header */
>  	__u32 size;
> +
> +	/* Offset in bytes to the start of slice in the OUTPUT buffer. */
> +	__u32 start_byte_offset;
> +
>  	/* Offset in bits to slice_data() from the beginning of this slice. */
>  	__u32 header_bit_size;
>  
> -- 
> 2.22.0
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode
  2019-08-14 12:23   ` Paul Kocialkowski
@ 2019-08-14 14:08     ` Ezequiel Garcia
  0 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-14 14:08 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: linux-media, kernel, Nicolas Dufresne, Tomasz Figa,
	linux-rockchip, Heiko Stuebner, Jonas Karlman, Philipp Zabel,
	Boris Brezillon, Alexandre Courbot, fbuergisser, linux-kernel

On Wed, 2019-08-14 at 14:23 +0200, Paul Kocialkowski wrote:
> Hi,
> 
> On Mon 12 Aug 19, 16:35, Ezequiel Garcia wrote:
> > From: Boris Brezillon <boris.brezillon@collabora.com>
> > 
> > Some stateless decoders don't support per-slice decoding granularity
> > (or at least not in a way that would make them efficient or easy to use).
> > 
> > Expose a menu to control the supported decoding modes. Drivers are
> > allowed to support only one decoding but they can support both too.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> > ---
> > Changes in v5:
> > * Improve specification as suggested by Hans.
> > Changes in v4:
> > * Typos/rewording fixes
> > Changes in v3:
> > * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
> > * Add Paul's R-b
> > Changes in v2:
> > * Allow decoding multiple slices in per-slice decoding mode
> > * Minor doc improvement/fixes
> > ---
> >  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
> >  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
> >  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
> >  include/media/h264-ctrls.h                    | 11 +++++
> >  4 files changed, 68 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > index c5f39dd50043..568390273fde 100644
> > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > @@ -1747,6 +1747,11 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >      * - __u32
> >        - ``size``
> >        -
> > +    * - __u32
> > +      - ``start_byte_offset``
> > +      - Where the slice payload starts in the output buffer. Useful when the
> > +        OUTPUT buffer contains more than one slice (some codecs need to know
> > +        where each slice starts in this buffer).
> 
> I think there is a possibility for misunderstanding here: does the
> "slice payload" include the start code when it (annex-b) is selected?
> 
> I'd suspect that the hardware needs to know where the start code begings rather
> than where the NAL unit starts when there is a start code. So I'd suggest
> specifying it as the offset to the beginning of the start code (if present) or
> to the slice NAL unit start otherwise.
> 

Yes, start_byte_offset is the offset in bytes from the beginning
of the OUTPUT buffer to the start of this slice (which may start
with a start code as per the start code control).

I guess the field could be named simply "offset", but at the same
time, this is bikeshedding.

> The clarification should probably also concern header_bit_size, to make it clear
> where it is counted from. I think it makes sense to count it starting from
> the start_byte_offset position.
> 

The header_bit_size is currently counted from the start of slice.
It seems this needs to be better documented, but not necessarily
as part of this patch.

> >      * - __u32
> >        - ``header_bit_size``
> >        -
> > @@ -1930,7 +1935,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >        -
> >      * - __u16
> >        - ``num_slices``
> > -      - Number of slices needed to decode the current frame
> > +      - Number of slices needed to decode the current frame/field. When
> > +        operating in slice-based decoding mode (see
> > +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
> > +        should always be set to one.
> 
> I am fine with the current proposal for now, but I believe that we should keep
> discussing whether we want per-slice mode to strictly enforce having a
> single-slice, outside of this series.
> 

Sounds good.

> One reason for that would be to gather multiple slices (up to what amounts to
> a frame) for efficient use of OUTPUT buffers and to avoid allocating a huge
> number of small ones (which is certainly quite inefficient due to CMA
> alignment).
> 
> Apparently, passing multiple slice_params controls is an issue so per-frame mode
> currently has to rely on the hardware being able to parse the slice header on
> its own. The issue exists in the V4L2 API, where we need to know in advance
> the maximum number of slices we want to use to make the control an array and
> copy all the controls with each request, even if only a single-one is used.
> 
> Maybe one way to solve that would be to use multiple requests with the same
> OUTPUT buffer, one per slice, so that we only have to pass a single slice_params
> control per-request. And the first slice of the frame would get the other
> controls too, while the others wouldn't (assuming we can trust that controls
> won't change in-between and I'm not sure how true that is). Worst case, all
> the controls have to be attached with each request, which is maybe still better
> than passing a big number of unused slice_params controls each time.
> 
> Again, I believe the current proposal is good to go for now as it allows the
> hantro driver to get-in staging, which is the priority.
> 
> But the issue I'm discussing should certainly be fixed before unstaging the API.
> 

Totally, I think the next step is to add multi-slice support to the cedrus
driver. It will help consolidate the API.

> >      * - __u16
> >        - ``nal_ref_idc``
> >        - NAL reference ID value coming from the NAL Unit header
> > @@ -2021,6 +2029,43 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >        - 0x00000004
> >        - The DPB entry is a long term reference frame
> >  
> > +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
> > +    Specifies the decoding mode to use. Currently exposes slice-based and
> > +    frame-based decoding but new modes might be added later on.
> > +    This control is used to complement V4L2_PIX_FMT_H264_SLICE
> > +    pixel format. Applications that support V4L2_PIX_FMT_H264_SLICE
> > +    are required to set this control in order to specify the decoding mode
> > +    that is expected for the buffer.
> > +    Drivers may expose a single or multiple decoding modes, depending
> > +    on what they can support.
> > +
> > +    .. note::
> > +
> > +       This menu control is not yet part of the public kernel API and
> > +       it is expected to change.
> > +
> > +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
> > +
> > +.. cssclass:: longtable
> > +
> > +.. flat-table::
> > +    :header-rows:  0
> > +    :stub-columns: 0
> > +    :widths:       1 1 2
> > +
> > +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
> > +      - 0
> > +      - The decoding is done at the slice granularity.
> > +        v4l2_ctrl_h264_decode_params->num_slices should be set to 1.
> > +        The OUTPUT buffer must contain a single slice.
> > +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
> > +      - 1
> > +      - The decoding is done at the frame granularity.
> > +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
> > +        slices forming a frame.
> > +        The OUTPUT buffer must contain all slices needed to decode the
> > +        frame. The OUTPUT buffer must also contain both fields.
> > +
> 
> Nitpick/suggestion here: could we use the name of the enum as a prefix for the
> modes, like:
> - V4L2_MPEG_VIDEO_H264_DECODING_MODE_PER_SLICE
> - V4L2_MPEG_VIDEO_H264_DECODING_MODE_PER_FRAME
> 
> or
> - V4L2_MPEG_VIDEO_H264_DECODING_MODE_SLICE_BASED
> - V4L2_MPEG_VIDEO_H264_DECODING_MODE_FRAME_BASED
> 
> I personally find it more readable this way, since the same prefix is kept.
> 

Yes, that looks better.

Thanks for reviewing,
Ezequiel


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 04/11] media: uapi: h264: Add the concept of start code
  2019-08-14 11:49     ` Paul Kocialkowski
@ 2019-08-14 14:12       ` Ezequiel Garcia
  0 siblings, 0 replies; 18+ messages in thread
From: Ezequiel Garcia @ 2019-08-14 14:12 UTC (permalink / raw)
  To: Paul Kocialkowski, Hans Verkuil
  Cc: linux-media, kernel, Nicolas Dufresne, Tomasz Figa,
	linux-rockchip, Heiko Stuebner, Jonas Karlman, Philipp Zabel,
	Boris Brezillon, Alexandre Courbot, fbuergisser, linux-kernel

On Wed, 2019-08-14 at 13:49 +0200, Paul Kocialkowski wrote:
> Hi,
> 
> On Wed 14 Aug 19, 10:11, Hans Verkuil wrote:
> > On 8/12/19 9:35 PM, Ezequiel Garcia wrote:
> > > Stateless decoders have different expectations about the
> > > start code that is prepended on H264 slices. Add a
> > > menu control to express the supported start code types
> > > (including no start code).
> > > 
> > > Drivers are allowed to support only one start code type,
> > > but they can support both too.
> > > 
> > > Note that this is independent of the H264 decoding mode,
> > > which specifies the granularity of the decoding operations.
> > > Either in frame-based or slice-based mode, this new control
> > > will allow to define the start code expected on H264 slices.
> > > 
> > > Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> > > Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> > > ---
> > > Changes in v5:
> > > * Improve specification as suggested by Hans.
> > > Changes in v4:
> > > * New patch.
> > > ---
> > >  .../media/uapi/v4l/ext-ctrls-codec.rst        | 33 +++++++++++++++++++
> > >  .../media/uapi/v4l/pixfmt-compressed.rst      |  3 +-
> > >  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 +++++
> > >  include/media/h264-ctrls.h                    |  6 ++++
> > >  4 files changed, 50 insertions(+), 1 deletion(-)
> > > 
> > 
> > <snip>
> > 
> > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > index e6c510877f67..31555c99f64a 100644
> > > --- a/include/media/h264-ctrls.h
> > > +++ b/include/media/h264-ctrls.h
> > > @@ -27,6 +27,7 @@
> > >  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
> > >  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> > >  #define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
> > > +#define V4L2_CID_MPEG_VIDEO_H264_STARTCODE	(V4L2_CID_MPEG_BASE+1006)
> > 
> > I almost forgot: can you change this to _START_CODE? Since it is two words?
> 
> Agreed, I like it better this way too.
> 

The reason to have STARTCODE instead of START_CODE was to have some
consistency with other controls (namely, V4L2_CID_MPEG_VIDEO_HEVC_WITHOUT_STARTCODE).

Consistency is really important in a API,
but at the same time, I agree START_CODE looks better.

> > Thanks!
> > 
> > 	Hans
> > 
> > >  
> > >  /* enum v4l2_ctrl_type type values */
> > >  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> > > @@ -41,6 +42,11 @@ enum v4l2_mpeg_video_h264_decoding_mode {
> > >  	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> > >  };
> > >  
> > > +enum v4l2_mpeg_video_h264_start_code {
> > > +	V4L2_MPEG_VIDEO_H264_NO_STARTCODE,
> > > +	V4L2_MPEG_VIDEO_H264_ANNEX_B_STARTCODE,
> 
> Could we apply the same START_CODE renaming here too?
> 
> I was also thinking that it would be slightly more readable put like this,
> with START_CODE as a prefix since it's common to both options and the name of
> the enum:
> 
> - V4L2_MPEG_VIDEO_H264_START_CODE_NONE
> - V4L2_MPEG_VIDEO_H264_START_CODE_ANNEX_B
> 

Yes, that looks much better.

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-08-14 14:12 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-12 19:35 [PATCH v5 00/11] media: hantro: Add support for H264 decoding Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 01/11] lib/sort.c: implement sort() variant taking context argument Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 02/11] media: uapi: h264: Rename pixel format Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 03/11] media: uapi: h264: Add the concept of decoding mode Ezequiel Garcia
2019-08-13  6:46   ` Hans Verkuil
2019-08-14 12:23   ` Paul Kocialkowski
2019-08-14 14:08     ` Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 04/11] media: uapi: h264: Add the concept of start code Ezequiel Garcia
2019-08-14  8:11   ` Hans Verkuil
2019-08-14 11:49     ` Paul Kocialkowski
2019-08-14 14:12       ` Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 05/11] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 06/11] media: cedrus: Cleanup control initialization Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 07/11] media: cedrus: Specify H264 startcode and decoding mode Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 08/11] media: hantro: Move copy_metadata() before doing a decode operation Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 09/11] media: hantro: Add core bits to support H264 decoding Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 10/11] media: hantro: Add support for H264 decoding on G1 Ezequiel Garcia
2019-08-12 19:35 ` [PATCH v5 11/11] media: hantro: Enable H264 decoding on rk3288 Ezequiel Garcia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).