linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 00/12] media: hantro: H264 fixes and improvements
@ 2019-09-01 12:42 Jonas Karlman
  2019-09-01 12:45 ` [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288 Jonas Karlman
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:42 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

This series contains fixes and improvements for the hantro H264 decoder.

Patch 1-6 fixes issues and limitations observed when preparing support
for field encoded content.

Patch 7 introduce new DPB entry flags that is used to signal how a reference
frame is referenced. This information is needed to correctly build a
reference list for field encoded content.

Patch 8 adds bits to handle field encoded content, this is a rough patch
and should be reworked with proper code style and formatting.
Please get back with feedback on how to improve this.

The following samples from [1] are now playable with patch 1-8
- H264_1080i-25-interlace_Kaesescheibchen.mkv
- H264_10_1080i_50_AC3-Astra19.2_ProSieben_HD.ts
- big_buck_bunny_1080p_H264_AAC_25fps_7200K.mp4
- h264_tivo_sample.ts

The rest of the patches refactors G1 H264 code to more closely match
the code generated by my rockchip-vpu-regtool at [2] and then adds
support for H264 decoding on RK3399/RK3328 using the VPU2 block.
This code is early work and needs proper code style and formatting,
I just wanted to share the early work and get some initial feedback.

This series has been tested using ffmpeg v4l2 request hwaccel at [3] [4]

[1] http://kwiboo.libreelec.tv/test/samples/
[2] https://github.com/Kwiboo/rockchip-vpu-regtool
[3] https://github.com/Kwiboo/FFmpeg/commits/v4l2-request-hwaccel-4.0.4
[4] https://github.com/Kwiboo/FFmpeg/compare/4.0.4-Leia-18.4...45df99d31062e068073cf899dce559e334c9127f

Regards,
Jonas

Jonas Karlman (12):
  media: hantro: Fix H264 max frmsize supported on RK3288
  media: hantro: Do not reorder H264 scaling list
  media: hantro: Fix H264 motion vector buffer offset
  media: hantro: Reduce H264 extra space for motion vectors
  media: hantro: Remove now unused H264 pic_size
  media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly
  media: uapi: h264: Add DPB entry field reference flags
  media: hantro: Fix H264 decoding of field encoded content
  media: hantro: Refactor G1 H264 code
  media: hantro: Add support for H264 decoding on RK3399
  media: hantro: Enable H264 decoding on RK3399
  media: hantro: Enable H264 decoding on RK3328

 .../media/uapi/v4l/ext-ctrls-codec.rst        |  12 +
 drivers/staging/media/hantro/Makefile         |   1 +
 .../staging/media/hantro/hantro_g1_h264_dec.c | 685 +++++++++++-------
 drivers/staging/media/hantro/hantro_h264.c    | 209 +++---
 drivers/staging/media/hantro/hantro_hw.h      |  10 +-
 drivers/staging/media/hantro/hantro_v4l2.c    |   6 +-
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |   4 +-
 drivers/staging/media/hantro/rk3399_vpu_hw.c  |  24 +-
 .../media/hantro/rk3399_vpu_hw_h264_dec.c     | 486 +++++++++++++
 include/media/h264-ctrls.h                    |   4 +
 10 files changed, 1091 insertions(+), 350 deletions(-)
 create mode 100644 drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-02 14:00     ` Philipp Zabel
  2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

Scaling list supplied from userspace using ffmpeg and libva-v4l2-request
is already in matrix order and can be used without applying the inverse
scanning process.

The HW also only support 8x8 scaling list for the Y component, indices 0
and 3 in the scaling list supplied from userspace.

Remove reordering and write the scaling matrix in an order expected by
the VPU, also only allocate memory for the two 8x8 lists used.

Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/hantro_h264.c | 64 +++++++---------------
 1 file changed, 20 insertions(+), 44 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index 0d758e0c0f99..e2d01145ac4f 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -20,7 +20,7 @@
 /* Size with u32 units. */
 #define CABAC_INIT_BUFFER_SIZE		(460 * 2)
 #define POC_BUFFER_SIZE			34
-#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
+#define SCALING_LIST_SIZE		(6 * 16 + 2 * 64)
 
 #define POC_CMP(p0, p1) ((p0) < (p1) ? -1 : 1)
 
@@ -194,57 +194,33 @@ static const u32 h264_cabac_table[] = {
 	0x1f0c2517, 0x1f261440
 };
 
-/*
- * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
- * to get the values in matrix order. In addition, the hardware requires bytes
- * swapped within each subsequent 4 bytes. Both arrays below include both
- * transformations.
- */
-static const u32 zig_zag_4x4[] = {
-	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
-};
-
-static const u32 zig_zag_8x8[] = {
-	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
-	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
-	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
-	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
-};
-
 static void
 reorder_scaling_list(struct hantro_ctx *ctx)
 {
 	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
 	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
-	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
-	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
-	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
-	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
 	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
-	u8 *dst = tbl->scaling_list;
-	const u8 *src;
-	int i, j;
-
-	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
-	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
-	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
-		     num_list_4x4 * list_len_4x4 +
-		     num_list_8x8 * list_len_8x8);
-
-	src = &scaling->scaling_list_4x4[0][0];
-	for (i = 0; i < num_list_4x4; ++i) {
-		for (j = 0; j < list_len_4x4; ++j)
-			dst[zig_zag_4x4[j]] = src[j];
-		src += list_len_4x4;
-		dst += list_len_4x4;
+	u32 *dst = (u32 *)tbl->scaling_list;
+	u32 i, j, tmp;
+
+	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_4x4); i++) {
+		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_4x4[0]) / 4; j++) {
+			tmp = (scaling->scaling_list_4x4[i][4 * j + 0] << 24) |
+			      (scaling->scaling_list_4x4[i][4 * j + 1] << 16) |
+			      (scaling->scaling_list_4x4[i][4 * j + 2] << 8) |
+			      (scaling->scaling_list_4x4[i][4 * j + 3]);
+			*dst++ = tmp;
+		}
 	}
 
-	src = &scaling->scaling_list_8x8[0][0];
-	for (i = 0; i < num_list_8x8; ++i) {
-		for (j = 0; j < list_len_8x8; ++j)
-			dst[zig_zag_8x8[j]] = src[j];
-		src += list_len_8x8;
-		dst += list_len_8x8;
+	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_8x8); i += 3) {
+		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_8x8[0]) / 4; j++) {
+			tmp = (scaling->scaling_list_8x8[i][4 * j + 0] << 24) |
+			      (scaling->scaling_list_8x8[i][4 * j + 1] << 16) |
+			      (scaling->scaling_list_8x8[i][4 * j + 2] << 8) |
+			      (scaling->scaling_list_8x8[i][4 * j + 3]);
+			*dst++ = tmp;
+		}
 	}
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288
  2019-09-01 12:42 [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Jonas Karlman
@ 2019-09-01 12:45 ` Jonas Karlman
  2019-09-04 13:07   ` Ezequiel Garcia
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
  2019-09-02 13:02 ` [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Ezequiel Garcia
  2 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

TRM specify supported image size 48x48 to 4096x2304 at step size 16 pixels,
change frmsize max_width/max_height to match TRM.

Fixes: 760327930e10 ("media: hantro: Enable H264 decoding on rk3288")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/rk3288_vpu_hw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
index 6bfcc47d1e58..ebb017b8a334 100644
--- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
@@ -67,10 +67,10 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
 		.max_depth = 2,
 		.frmsize = {
 			.min_width = 48,
-			.max_width = 3840,
+			.max_width = 4096,
 			.step_width = H264_MB_DIM,
 			.min_height = 48,
-			.max_height = 2160,
+			.max_height = 2304,
 			.step_height = H264_MB_DIM,
 		},
 	},
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
  2019-09-01 12:45   ` [PATCH 02/12] media: hantro: Do not reorder H264 scaling list Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-03 10:58     ` Philipp Zabel
                       ` (2 more replies)
  2019-09-01 12:45   ` [PATCH 05/12] media: hantro: Remove now unused H264 pic_size Jonas Karlman
                     ` (8 subsequent siblings)
  10 siblings, 3 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
and is laid out in memory as follow:

+-------------------+
| Y-plane   256 MBs |
+-------------------+
| UV-plane  128 MBs |
+-------------------+
| MV buffer  64 MBs |
+-------------------+

The motion vector buffer offset is currently correct for 4:2:0 because
the extra space for motion vectors is overallocated with an extra 64 MBs.

Wrong offset for both destination and motion vector buffer are used
for the bottom field of field encoded content, wrong offset is
also used for 4:0:0 (monochrome) content.

Fix this by always setting the motion vector address to the expected
384 MBs offset for 4:2:0 and 256 MBs offset for 4:0:0 content.

Also use correct destination and motion vector buffer offset
for the bottom field of field encoded content.

While at it also extend the check for 4:0:0 (monochrome) to include an
additional check for High Profile (100).

Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 .../staging/media/hantro/hantro_g1_h264_dec.c | 33 +++++++++++--------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 7ab534936843..159bd67e0a36 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -19,6 +19,9 @@
 #include "hantro_hw.h"
 #include "hantro_v4l2.h"
 
+#define MV_OFFSET_420	384
+#define MV_OFFSET_400	256
+
 static void set_params(struct hantro_ctx *ctx)
 {
 	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
@@ -49,8 +52,8 @@ static void set_params(struct hantro_ctx *ctx)
 	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
 
 	/* Decoder control register 1. */
-	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
-	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
+	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
+	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
 	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
 	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
 
@@ -79,7 +82,7 @@ static void set_params(struct hantro_ctx *ctx)
 		reg |= G1_REG_DEC_CTRL4_CABAC_E;
 	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
 		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
-	if (sps->chroma_format_idc == 0)
+	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
 		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
 	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
 		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
@@ -233,6 +236,7 @@ static void set_buffers(struct hantro_ctx *ctx)
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
 	struct hantro_dev *vpu = ctx->dev;
 	dma_addr_t src_dma, dst_dma;
+	unsigned int offset = MV_OFFSET_420;
 
 	src_buf = hantro_get_src_buf(ctx);
 	dst_buf = hantro_get_dst_buf(ctx);
@@ -243,19 +247,20 @@ static void set_buffers(struct hantro_ctx *ctx)
 
 	/* Destination (decoded frame) buffer. */
 	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+		dst_dma += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
 	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
 
-	/* Higher profiles require DMV buffer appended to reference frames. */
-	if (ctrls->sps->profile_idc > 66) {
-		size_t pic_size = ctx->h264_dec.pic_size;
-		size_t mv_offset = round_up(pic_size, 8);
-
-		if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
-			mv_offset += 32 * H264_MB_WIDTH(ctx->dst_fmt.width);
-
-		vdpu_write_relaxed(vpu, dst_dma + mv_offset,
-				   G1_REG_ADDR_DIR_MV);
-	}
+	/* Motion vector buffer is located after the decoded frame. */
+	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	if (ctrls->sps->profile_idc >= 100 && ctrls->sps->chroma_format_idc == 0)
+		offset = MV_OFFSET_400;
+	dst_dma += offset * H264_MB_WIDTH(ctx->dst_fmt.width) *
+		   H264_MB_HEIGHT(ctx->dst_fmt.height);
+	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+		dst_dma += 32 * H264_MB_WIDTH(ctx->dst_fmt.width) *
+			   H264_MB_HEIGHT(ctx->dst_fmt.height);
+	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DIR_MV);
 
 	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
 	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 04/12] media: hantro: Reduce H264 extra space for motion vectors
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (2 preceding siblings ...)
  2019-09-01 12:45   ` [PATCH 05/12] media: hantro: Remove now unused H264 pic_size Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-01 12:45   ` [PATCH 06/12] media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly Jonas Karlman
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
with additional 32 bytes on multi-core variants.

Memory layout is as follow:

+-------------------+
| Y-plane   256 MBs |
+-------------------+
| UV-plane  128 MBs |
+-------------------+
| MV buffer  64 MBs |
+-------------------+
| MC sync  32 bytes |
+-------------------+

Reduce the extra space allocated now that motion vector buffer offset no
longer is based on the extra space.

Only use extra space for 64 MBs of motion vector buffer and 32 bytes for
multi-core sync.

Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/hantro_v4l2.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 3dae52abb96c..3a360a6a17e2 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -242,12 +242,12 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f,
 		/*
 		 * The H264 decoder needs extra space on the output buffers
 		 * to store motion vectors. This is needed for reference
-		 * frames.
+		 * frames. 32 extra bytes is used for multi-core sync.
 		 */
 		if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
 			pix_mp->plane_fmt[0].sizeimage +=
-				128 * DIV_ROUND_UP(pix_mp->width, 16) *
-				      DIV_ROUND_UP(pix_mp->height, 16);
+				64 * H264_MB_WIDTH(pix_mp->width) *
+				     H264_MB_WIDTH(pix_mp->height) + 32;
 	} else if (!pix_mp->plane_fmt[0].sizeimage) {
 		/*
 		 * For coded formats the application can specify
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 05/12] media: hantro: Remove now unused H264 pic_size
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
  2019-09-01 12:45   ` [PATCH 02/12] media: hantro: Do not reorder H264 scaling list Jonas Karlman
  2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-01 12:45   ` [PATCH 04/12] media: hantro: Reduce H264 extra space for motion vectors Jonas Karlman
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

pic_size in hantro_h264_dec_hw_ctx struct is no longer used,
lets remove it.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/hantro_h264.c | 5 -----
 drivers/staging/media/hantro/hantro_hw.h   | 3 ---
 2 files changed, 8 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index e2d01145ac4f..a77cc28e180a 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -603,7 +603,6 @@ int hantro_h264_dec_init(struct hantro_ctx *ctx)
 	struct hantro_h264_dec_hw_ctx *h264_dec = &ctx->h264_dec;
 	struct hantro_aux_buf *priv = &h264_dec->priv;
 	struct hantro_h264_dec_priv_tbl *tbl;
-	struct v4l2_pix_format_mplane pix_mp;
 
 	priv->cpu = dma_alloc_coherent(vpu->dev, sizeof(*tbl), &priv->dma,
 				       GFP_KERNEL);
@@ -614,9 +613,5 @@ int hantro_h264_dec_init(struct hantro_ctx *ctx)
 	tbl = priv->cpu;
 	memcpy(tbl->cabac_table, h264_cabac_table, sizeof(tbl->cabac_table));
 
-	v4l2_fill_pixfmt_mp(&pix_mp, ctx->dst_fmt.pixelformat,
-			    ctx->dst_fmt.width, ctx->dst_fmt.height);
-	h264_dec->pic_size = pix_mp.plane_fmt[0].sizeimage;
-
 	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 2fab655bf098..8adad8ac9b1d 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -80,15 +80,12 @@ struct hantro_h264_dec_reflists {
  * @dpb:	DPB
  * @reflists:	P/B0/B1 reflists
  * @ctrls:	V4L2 controls attached to a run
- * @pic_size:	Size in bytes of decoded picture, this is needed
- *		to pass the location of motion vectors.
  */
 struct hantro_h264_dec_hw_ctx {
 	struct hantro_aux_buf priv;
 	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
 	struct hantro_h264_dec_reflists reflists;
 	struct hantro_h264_dec_ctrls ctrls;
-	size_t pic_size;
 };
 
 /**
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 06/12] media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (3 preceding siblings ...)
  2019-09-01 12:45   ` [PATCH 04/12] media: hantro: Reduce H264 extra space for motion vectors Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-01 12:45   ` [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content Jonas Karlman
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

The FIELDPIC_FLAG_E bit should be set when field_pic_flag exists in stream,
it is currently set based on field_pic_flag of current frame.
The PIC_FIELDMODE_E bit is correctly set based on the field_pic_flag.

Fix this by setting the FIELDPIC_FLAG_E bit when frame_mbs_only is not set.

Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/hantro_g1_h264_dec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 159bd67e0a36..16f21d258f6a 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -64,7 +64,7 @@ static void set_params(struct hantro_ctx *ctx)
 	/* always use the matrix sent from userspace */
 	reg |= G1_REG_DEC_CTRL2_TYPE1_QUANT_E;
 
-	if (slices[0].flags &  V4L2_H264_SLICE_FLAG_FIELD_PIC)
+	if (!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY))
 		reg |= G1_REG_DEC_CTRL2_FIELDPIC_FLAG_E;
 	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL2);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (4 preceding siblings ...)
  2019-09-01 12:45   ` [PATCH 06/12] media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-03 13:21     ` Philipp Zabel
  2019-09-01 12:45   ` [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags Jonas Karlman
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

This need code cleanup and formatting

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 .../staging/media/hantro/hantro_g1_h264_dec.c |  26 ++--
 drivers/staging/media/hantro/hantro_h264.c    | 126 ++++++++++++------
 drivers/staging/media/hantro/hantro_hw.h      |   4 +
 3 files changed, 100 insertions(+), 56 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 16f21d258f6a..bc628ef73b29 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -130,28 +130,20 @@ static void set_params(struct hantro_ctx *ctx)
 
 static void set_ref(struct hantro_ctx *ctx)
 {
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	const struct v4l2_ctrl_h264_slice_params *slice;
 	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
 	const u8 *b0_reflist, *b1_reflist, *p_reflist;
 	struct hantro_dev *vpu = ctx->dev;
-	u32 dpb_longterm = 0;
-	u32 dpb_valid = 0;
 	int reg_num;
 	u32 reg;
 	int i;
 
-	/*
-	 * Set up bit maps of valid and long term DPBs.
-	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
-	 */
-	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
-		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
-			dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+	dec_param = ctx->h264_dec.ctrls.decode;
+	slice = ctx->h264_dec.ctrls.slices;
 
-		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
-			dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
-	}
-	vdpu_write_relaxed(vpu, dpb_valid << 16, G1_REG_VALID_REF);
-	vdpu_write_relaxed(vpu, dpb_longterm << 16, G1_REG_LT_REF);
+	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_valid, G1_REG_VALID_REF);
+	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_longterm, G1_REG_LT_REF);
 
 	/*
 	 * Set up reference frame picture numbers.
@@ -223,10 +215,8 @@ static void set_ref(struct hantro_ctx *ctx)
 
 	/* Set up addresses of DPB buffers. */
 	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
-		struct vb2_buffer *buf =  hantro_h264_get_ref_buf(ctx, i);
-
-		vdpu_write_relaxed(vpu, vb2_dma_contig_plane_dma_addr(buf, 0),
-				   G1_REG_ADDR_REF(i));
+		dma_addr_t addr = hantro_h264_get_ref_dma_addr(ctx, i);
+		vdpu_write_relaxed(vpu, addr, G1_REG_ADDR_REF(i));
 	}
 }
 
diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index a77cc28e180a..85c86d728b1a 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -228,17 +228,65 @@ static void prepare_table(struct hantro_ctx *ctx)
 {
 	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
 	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
 	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
 	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	u32 dpb_longterm = 0;
+	u32 dpb_valid = 0;
 	int i;
 
+	/*
+	 * Set up bit maps of valid and long term DPBs.
+	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
+	 */
+	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || (slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
+		for (i = 0; i < HANTRO_H264_DPB_SIZE * 2; ++i) {
+			// check for correct reference use
+			u32 flag = (i & 0x1) ? V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM : V4L2_H264_DPB_ENTRY_FLAG_REF_TOP;
+			if (dpb[i / 2].flags & flag)
+				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
+
+			if (dpb[i / 2].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
+		}
+
+		ctx->h264_dec.dpb_valid = dpb_valid;
+		ctx->h264_dec.dpb_longterm = dpb_longterm;
+	} else {
+		for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
+			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+
+			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+		}
+
+		ctx->h264_dec.dpb_valid = dpb_valid << 16;
+		ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
+	}
+
 	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
-		tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
-		tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
+			tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
+			tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
+		} else {
+			tbl->poc[i * 2] = 0;
+			tbl->poc[i * 2 + 1] = 0;
+		}
 	}
 
-	tbl->poc[32] = dec_param->top_field_order_cnt;
-	tbl->poc[33] = dec_param->bottom_field_order_cnt;
+	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || !(slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
+		if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
+			tbl->poc[32] = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
+					dec_param->bottom_field_order_cnt :
+					dec_param->top_field_order_cnt;
+		else
+			tbl->poc[32] = min(dec_param->top_field_order_cnt, dec_param->bottom_field_order_cnt);
+		tbl->poc[33] = 0;
+	} else {
+		tbl->poc[32] = dec_param->top_field_order_cnt;
+		tbl->poc[33] = dec_param->bottom_field_order_cnt;
+	}
 
 	reorder_scaling_list(ctx);
 }
@@ -251,51 +299,36 @@ struct hantro_h264_reflist_builder {
 	u8 num_valid;
 };
 
-static s32 get_poc(enum v4l2_field field, s32 top_field_order_cnt,
-		   s32 bottom_field_order_cnt)
-{
-	switch (field) {
-	case V4L2_FIELD_TOP:
-		return top_field_order_cnt;
-	case V4L2_FIELD_BOTTOM:
-		return bottom_field_order_cnt;
-	default:
-		break;
-	}
-
-	return min(top_field_order_cnt, bottom_field_order_cnt);
-}
-
 static void
 init_reflist_builder(struct hantro_ctx *ctx,
 		     struct hantro_h264_reflist_builder *b)
 {
 	const struct v4l2_ctrl_h264_decode_params *dec_param;
-	struct vb2_v4l2_buffer *buf = hantro_get_dst_buf(ctx);
+	const struct v4l2_ctrl_h264_slice_params *slices;
 	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
-	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
 	unsigned int i;
 
 	dec_param = ctx->h264_dec.ctrls.decode;
+	slices = ctx->h264_dec.ctrls.slices;
 
 	memset(b, 0, sizeof(*b));
 	b->dpb = dpb;
-	b->curpoc = get_poc(buf->field, dec_param->top_field_order_cnt,
-			    dec_param->bottom_field_order_cnt);
+	b->curpoc = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
+		    dec_param->bottom_field_order_cnt :
+		    dec_param->top_field_order_cnt;
 
 	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
-		int buf_idx;
-
-		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+		u32 ref_flag = dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME;
+		if (!ref_flag)
 			continue;
 
-		buf_idx = vb2_find_timestamp(cap_q, dpb[i].reference_ts, 0);
-		if (buf_idx < 0)
-			continue;
+		if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME)
+			b->pocs[i] = min(dpb[i].bottom_field_order_cnt, dpb[i].top_field_order_cnt);
+		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM)
+			b->pocs[i] = dpb[i].bottom_field_order_cnt;
+		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_TOP)
+			b->pocs[i] = dpb[i].top_field_order_cnt;
 
-		buf = to_vb2_v4l2_buffer(vb2_get_buffer(cap_q, buf_idx));
-		b->pocs[i] = get_poc(buf->field, dpb[i].top_field_order_cnt,
-				     dpb[i].bottom_field_order_cnt);
 		b->unordered_reflist[b->num_valid] = i;
 		b->num_valid++;
 	}
@@ -448,8 +481,7 @@ build_b_ref_lists(const struct hantro_h264_reflist_builder *builder,
 static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
 			    const struct v4l2_h264_dpb_entry *b)
 {
-	return a->top_field_order_cnt == b->top_field_order_cnt &&
-	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+	return a->reference_ts == b->reference_ts;
 }
 
 static void update_dpb(struct hantro_ctx *ctx)
@@ -463,13 +495,13 @@ static void update_dpb(struct hantro_ctx *ctx)
 
 	/* Disable all entries by default. */
 	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
-		ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+		ctx->h264_dec.dpb[i].flags = 0;
 
 	/* Try to match new DPB entries with existing ones by their POCs. */
 	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
 		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
 
-		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
 			continue;
 
 		/*
@@ -480,8 +512,7 @@ static void update_dpb(struct hantro_ctx *ctx)
 			struct v4l2_h264_dpb_entry *cdpb;
 
 			cdpb = &ctx->h264_dec.dpb[j];
-			if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
-			    !dpb_entry_match(cdpb, ndpb))
+			if (!dpb_entry_match(cdpb, ndpb))
 				continue;
 
 			*cdpb = *ndpb;
@@ -541,6 +572,25 @@ struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 	return buf;
 }
 
+dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
+					unsigned int dpb_idx)
+{
+	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	const struct v4l2_ctrl_h264_decode_params *dec_param = ctx->h264_dec.ctrls.decode;
+	const struct v4l2_ctrl_h264_slice_params *slices = ctx->h264_dec.ctrls.slices;
+
+	struct vb2_buffer *buf = hantro_h264_get_ref_buf(ctx, dpb_idx);
+	s32 cur_poc = slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD ?
+		      dec_param->bottom_field_order_cnt :
+		      dec_param->top_field_order_cnt;
+	u32 flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE ? 0x2 : 0;
+	flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
+		 abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
+		 0x1 : 0;
+
+	return vb2_dma_contig_plane_dma_addr(buf, 0) | flags;
+}
+
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
 {
 	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 8adad8ac9b1d..d58f2a36ca40 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -86,6 +86,8 @@ struct hantro_h264_dec_hw_ctx {
 	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
 	struct hantro_h264_dec_reflists reflists;
 	struct hantro_h264_dec_ctrls ctrls;
+	u32 dpb_longterm;
+	u32 dpb_valid;
 };
 
 /**
@@ -157,6 +159,8 @@ void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 
 struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 					   unsigned int dpb_idx);
+dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
+					unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
 void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (5 preceding siblings ...)
  2019-09-01 12:45   ` [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2020-07-10  4:21     ` Ezequiel Garcia
  2019-09-01 12:45   ` [RFC 09/12] media: hantro: Refactor G1 H264 code Jonas Karlman
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

Add DPB entry flags to help indicate when a reference frame is a field picture
and how the DPB entry is referenced, top or bottom field or full frame.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
 include/media/h264-ctrls.h                       |  4 ++++
 2 files changed, 16 insertions(+)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index bc5dd8e76567..eb6c32668ad7 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
       - 0x00000004
       - The DPB entry is a long term reference frame
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
+      - 0x00000008
+      - The DPB entry is a field picture
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
+      - 0x00000010
+      - The DPB entry is a top field reference
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
+      - 0x00000020
+      - The DPB entry is a bottom field reference
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
+      - 0x00000030
+      - The DPB entry is a reference frame
 
 ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
     Specifies the decoding mode to use. Currently exposes slice-based and
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index e877bf1d537c..76020ebd1e6c 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
 #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
 #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
 #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
+#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
 
 struct v4l2_h264_dpb_entry {
 	__u64 reference_ts;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 09/12] media: hantro: Refactor G1 H264 code
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (6 preceding siblings ...)
  2019-09-01 12:45   ` [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-01 12:45   ` [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399 Jonas Karlman
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

Use generated code from my rockchip-vpu-regtool

This need code cleanup and formatting

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 .../staging/media/hantro/hantro_g1_h264_dec.c | 661 +++++++++++-------
 drivers/staging/media/hantro/hantro_h264.c    |  14 +
 drivers/staging/media/hantro/hantro_hw.h      |   2 +
 3 files changed, 439 insertions(+), 238 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index bc628ef73b29..4b82b9fd5252 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Rockchip RK3288 VPU codec driver
+ * Hantro VPU codec driver
  *
  * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
  *	Hertz Wong <hertz.wong@rock-chips.com>
@@ -15,273 +15,458 @@
 
 #include <media/v4l2-mem2mem.h>
 
-#include "hantro_g1_regs.h"
 #include "hantro_hw.h"
 #include "hantro_v4l2.h"
 
 #define MV_OFFSET_420	384
 #define MV_OFFSET_400	256
 
-static void set_params(struct hantro_ctx *ctx)
-{
-	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
-	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
-	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
-	const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
-	const struct v4l2_ctrl_h264_pps *pps = ctrls->pps;
-	struct vb2_v4l2_buffer *src_buf = hantro_get_src_buf(ctx);
-	struct hantro_dev *vpu = ctx->dev;
-	u32 reg;
-
-	/* Decoder control register 0. */
-	reg = G1_REG_DEC_CTRL0_DEC_AXI_WR_ID(0x0);
-	if (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)
-		reg |= G1_REG_DEC_CTRL0_SEQ_MBAFF_E;
-	reg |= G1_REG_DEC_CTRL0_PICORD_COUNT_E;
-	if (dec_param->nal_ref_idc)
-		reg |= G1_REG_DEC_CTRL0_WRITE_MVS_E;
-
-	if (!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY) &&
-	    (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD ||
-	     slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
-		reg |= G1_REG_DEC_CTRL0_PIC_INTERLACE_E;
-	if (slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)
-		reg |= G1_REG_DEC_CTRL0_PIC_FIELDMODE_E;
-	if (!(slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD))
-		reg |= G1_REG_DEC_CTRL0_PIC_TOPFIELD_E;
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
-
-	/* Decoder control register 1. */
-	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
-	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
-	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
-
-	/* Decoder control register 2. */
-	reg = G1_REG_DEC_CTRL2_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
-	      G1_REG_DEC_CTRL2_CH_QP_OFFSET2(pps->second_chroma_qp_index_offset);
-
-	/* always use the matrix sent from userspace */
-	reg |= G1_REG_DEC_CTRL2_TYPE1_QUANT_E;
-
-	if (!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY))
-		reg |= G1_REG_DEC_CTRL2_FIELDPIC_FLAG_E;
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL2);
-
-	/* Decoder control register 3. */
-	reg = G1_REG_DEC_CTRL3_START_CODE_E |
-	      G1_REG_DEC_CTRL3_INIT_QP(pps->pic_init_qp_minus26 + 26) |
-	      G1_REG_DEC_CTRL3_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL3);
-
-	/* Decoder control register 4. */
-	reg = G1_REG_DEC_CTRL4_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
-	      G1_REG_DEC_CTRL4_FRAMENUM(slices[0].frame_num) |
-	      G1_REG_DEC_CTRL4_WEIGHT_BIPR_IDC(pps->weighted_bipred_idc);
-	if (pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE)
-		reg |= G1_REG_DEC_CTRL4_CABAC_E;
-	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
-		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
-	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
-		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
-	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
-		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL4);
-
-	/* Decoder control register 5. */
-	reg = G1_REG_DEC_CTRL5_REFPIC_MK_LEN(slices[0].dec_ref_pic_marking_bit_size) |
-	      G1_REG_DEC_CTRL5_IDR_PIC_ID(slices[0].idr_pic_id);
-	if (pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED)
-		reg |= G1_REG_DEC_CTRL5_CONST_INTRA_E;
-	if (pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT)
-		reg |= G1_REG_DEC_CTRL5_FILT_CTRL_PRES;
-	if (pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT)
-		reg |= G1_REG_DEC_CTRL5_RDPIC_CNT_PRES;
-	if (pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE)
-		reg |= G1_REG_DEC_CTRL5_8X8TRANS_FLAG_E;
-	if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC)
-		reg |= G1_REG_DEC_CTRL5_IDR_PIC_E;
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL5);
-
-	/* Decoder control register 6. */
-	reg = G1_REG_DEC_CTRL6_PPS_ID(slices[0].pic_parameter_set_id) |
-	      G1_REG_DEC_CTRL6_REFIDX0_ACTIVE(pps->num_ref_idx_l0_default_active_minus1 + 1) |
-	      G1_REG_DEC_CTRL6_REFIDX1_ACTIVE(pps->num_ref_idx_l1_default_active_minus1 + 1) |
-	      G1_REG_DEC_CTRL6_POC_LENGTH(slices[0].pic_order_cnt_bit_size);
-	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL6);
-
-	/* Error concealment register. */
-	vdpu_write_relaxed(vpu, 0, G1_REG_ERR_CONC);
-
-	/* Prediction filter tap register. */
-	vdpu_write_relaxed(vpu,
-			   G1_REG_PRED_FLT_PRED_BC_TAP_0_0(1) |
-			   G1_REG_PRED_FLT_PRED_BC_TAP_0_1(-5 & 0x3ff) |
-			   G1_REG_PRED_FLT_PRED_BC_TAP_0_2(20),
-			   G1_REG_PRED_FLT);
-
-	/* Reference picture buffer control register. */
-	vdpu_write_relaxed(vpu, 0, G1_REG_REF_BUF_CTRL);
-
-	/* Reference picture buffer control register 2. */
-	vdpu_write_relaxed(vpu, G1_REG_REF_BUF_CTRL2_APF_THRESHOLD(8),
-			   G1_REG_REF_BUF_CTRL2);
-}
+#define G1_SWREG(nr)			((nr) * 4)
+
+#define G1_REG_RLC_VLC_BASE		G1_SWREG(12)
+#define G1_REG_DEC_OUT_BASE		G1_SWREG(13)
+#define G1_REG_REFER0_BASE		G1_SWREG(14)
+#define G1_REG_REFER1_BASE		G1_SWREG(15)
+#define G1_REG_REFER2_BASE		G1_SWREG(16)
+#define G1_REG_REFER3_BASE		G1_SWREG(17)
+#define G1_REG_REFER4_BASE		G1_SWREG(18)
+#define G1_REG_REFER5_BASE		G1_SWREG(19)
+#define G1_REG_REFER6_BASE		G1_SWREG(20)
+#define G1_REG_REFER7_BASE		G1_SWREG(21)
+#define G1_REG_REFER8_BASE		G1_SWREG(22)
+#define G1_REG_REFER9_BASE		G1_SWREG(23)
+#define G1_REG_REFER10_BASE		G1_SWREG(24)
+#define G1_REG_REFER11_BASE		G1_SWREG(25)
+#define G1_REG_REFER12_BASE		G1_SWREG(26)
+#define G1_REG_REFER13_BASE		G1_SWREG(27)
+#define G1_REG_REFER14_BASE		G1_SWREG(28)
+#define G1_REG_REFER15_BASE		G1_SWREG(29)
+#define G1_REG_QTABLE_BASE		G1_SWREG(40)
+#define G1_REG_DIR_MV_BASE		G1_SWREG(41)
+#define G1_REG_DEC_E(v)			((v) ? BIT(0) : 0)
+
+#define G1_REG_DEC_AXI_RD_ID(v)		(((v) << 24) & GENMASK(31, 24))
+#define G1_REG_DEC_TIMEOUT_E(v)		((v) ? BIT(23) : 0)
+#define G1_REG_DEC_STRSWAP32_E(v)	((v) ? BIT(22) : 0)
+#define G1_REG_DEC_STRENDIAN_E(v)	((v) ? BIT(21) : 0)
+#define G1_REG_DEC_INSWAP32_E(v)	((v) ? BIT(20) : 0)
+#define G1_REG_DEC_OUTSWAP32_E(v)	((v) ? BIT(19) : 0)
+#define G1_REG_DEC_DATA_DISC_E(v)	((v) ? BIT(18) : 0)
+#define G1_REG_DEC_LATENCY(v)		(((v) << 11) & GENMASK(16, 11))
+#define G1_REG_DEC_CLK_GATE_E(v)	((v) ? BIT(10) : 0)
+#define G1_REG_DEC_IN_ENDIAN(v)		((v) ? BIT(9) : 0)
+#define G1_REG_DEC_OUT_ENDIAN(v)	((v) ? BIT(8) : 0)
+#define G1_REG_DEC_ADV_PRE_DIS(v)	((v) ? BIT(6) : 0)
+#define G1_REG_DEC_SCMD_DIS(v)		((v) ? BIT(5) : 0)
+#define G1_REG_DEC_MAX_BURST(v)		(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_DEC_MODE(v)		(((v) << 28) & GENMASK(31, 28))
+#define G1_REG_RLC_MODE_E(v)		((v) ? BIT(27) : 0)
+#define G1_REG_PIC_INTERLACE_E(v)	((v) ? BIT(23) : 0)
+#define G1_REG_PIC_FIELDMODE_E(v)	((v) ? BIT(22) : 0)
+#define G1_REG_PIC_TOPFIELD_E(v)	((v) ? BIT(19) : 0)
+#define G1_REG_FILTERING_DIS(v)		((v) ? BIT(14) : 0)
+#define G1_REG_PIC_FIXED_QUANT(v)	((v) ? BIT(13) : 0)
+#define G1_REG_WRITE_MVS_E(v)		((v) ? BIT(12) : 0)
+#define G1_REG_SEQ_MBAFF_E(v)		((v) ? BIT(10) : 0)
+#define G1_REG_PICORD_COUNT_E(v)	((v) ? BIT(9) : 0)
+#define G1_REG_DEC_AXI_WR_ID(v)		(((v) << 0) & GENMASK(7, 0))
+
+#define G1_REG_PIC_MB_WIDTH(v)		(((v) << 23) & GENMASK(31, 23))
+#define G1_REG_PIC_MB_HEIGHT_P(v)	(((v) << 11) & GENMASK(18, 11))
+#define G1_REG_REF_FRAMES(v)		(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_STRM_START_BIT(v)	(((v) << 26) & GENMASK(31, 26))
+#define G1_REG_TYPE1_QUANT_E(v)		((v) ? BIT(24) : 0)
+#define G1_REG_CH_QP_OFFSET(v)		(((v) << 19) & GENMASK(23, 19))
+#define G1_REG_CH_QP_OFFSET2(v)		(((v) << 14) & GENMASK(18, 14))
+#define G1_REG_FIELDPIC_FLAG_E(v)	((v) ? BIT(0) : 0)
+
+#define G1_REG_START_CODE_E(v)		((v) ? BIT(31) : 0)
+#define G1_REG_INIT_QP(v)		(((v) << 25) & GENMASK(30, 25))
+#define G1_REG_CH_8PIX_ILEAV_E(v)	((v) ? BIT(24) : 0)
+#define G1_REG_STREAM_LEN(v)		(((v) << 0) & GENMASK(23, 0))
+
+#define G1_REG_CABAC_E(v)		((v) ? BIT(31) : 0)
+#define G1_REG_BLACKWHITE_E(v)		((v) ? BIT(30) : 0)
+#define G1_REG_DIR_8X8_INFER_E(v)	((v) ? BIT(29) : 0)
+#define G1_REG_WEIGHT_PRED_E(v)		((v) ? BIT(28) : 0)
+#define G1_REG_WEIGHT_BIPR_IDC(v)	(((v) << 26) & GENMASK(27, 26))
+#define G1_REG_FRAMENUM_LEN(v)		(((v) << 16) & GENMASK(20, 16))
+#define G1_REG_FRAMENUM(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_CONST_INTRA_E(v)		((v) ? BIT(31) : 0)
+#define G1_REG_FILT_CTRL_PRES(v)	((v) ? BIT(30) : 0)
+#define G1_REG_RDPIC_CNT_PRES(v)	((v) ? BIT(29) : 0)
+#define G1_REG_8X8TRANS_FLAG_E(v)	((v) ? BIT(28) : 0)
+#define G1_REG_REFPIC_MK_LEN(v)		(((v) << 17) & GENMASK(27, 17))
+#define G1_REG_IDR_PIC_E(v)		((v) ? BIT(16) : 0)
+#define G1_REG_IDR_PIC_ID(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_PPS_ID(v)		(((v) << 24) & GENMASK(31, 24))
+#define G1_REG_REFIDX1_ACTIVE(v)	(((v) << 19) & GENMASK(23, 19))
+#define G1_REG_REFIDX0_ACTIVE(v)	(((v) << 14) & GENMASK(18, 14))
+#define G1_REG_POC_LENGTH(v)		(((v) << 0) & GENMASK(7, 0))
+
+#define G1_REG_PINIT_RLIST_F9(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_PINIT_RLIST_F8(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_PINIT_RLIST_F7(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_PINIT_RLIST_F6(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_PINIT_RLIST_F5(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_PINIT_RLIST_F4(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_PINIT_RLIST_F15(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_PINIT_RLIST_F14(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_PINIT_RLIST_F13(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_PINIT_RLIST_F12(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_PINIT_RLIST_F11(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_PINIT_RLIST_F10(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_REFER1_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER0_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER3_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER2_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER5_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER4_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER7_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER6_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER9_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER8_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER11_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER10_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER13_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER12_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER15_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define G1_REG_REFER14_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define G1_REG_REFER_LTERM_E(v)		(((v) << 0) & GENMASK(31, 0))
+
+#define G1_REG_REFER_VALID_E(v)		(((v) << 0) & GENMASK(31, 0))
+
+#define G1_REG_BINIT_RLIST_B2(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_BINIT_RLIST_F2(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_BINIT_RLIST_B1(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_BINIT_RLIST_F1(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B0(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F0(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_BINIT_RLIST_B5(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_BINIT_RLIST_F5(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_BINIT_RLIST_B4(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_BINIT_RLIST_F4(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B3(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F3(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_BINIT_RLIST_B8(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_BINIT_RLIST_F8(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_BINIT_RLIST_B7(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_BINIT_RLIST_F7(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B6(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F6(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_BINIT_RLIST_B11(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_BINIT_RLIST_F11(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_BINIT_RLIST_B10(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_BINIT_RLIST_F10(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B9(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F9(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_BINIT_RLIST_B14(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_BINIT_RLIST_F14(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_BINIT_RLIST_B13(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_BINIT_RLIST_F13(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B12(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F12(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_PINIT_RLIST_F3(v)	(((v) << 25) & GENMASK(29, 25))
+#define G1_REG_PINIT_RLIST_F2(v)	(((v) << 20) & GENMASK(24, 20))
+#define G1_REG_PINIT_RLIST_F1(v)	(((v) << 15) & GENMASK(19, 15))
+#define G1_REG_PINIT_RLIST_F0(v)	(((v) << 10) & GENMASK(14, 10))
+#define G1_REG_BINIT_RLIST_B15(v)	(((v) << 5) & GENMASK(9, 5))
+#define G1_REG_BINIT_RLIST_F15(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define G1_REG_STARTMB_X(v)		(((v) << 23) & GENMASK(31, 23))
+#define G1_REG_STARTMB_Y(v)		(((v) << 15) & GENMASK(22, 15))
+
+#define G1_REG_PRED_BC_TAP_0_0(v)	(((v) << 22) & GENMASK(31, 22))
+#define G1_REG_PRED_BC_TAP_0_1(v)	(((v) << 12) & GENMASK(21, 12))
+#define G1_REG_PRED_BC_TAP_0_2(v)	(((v) << 2) & GENMASK(11, 2))
+
+#define G1_REG_REFBU_E(v)		((v) ? BIT(31) : 0)
+
+#define G1_REG_APF_THRESHOLD(v)		(((v) << 0) & GENMASK(13, 0))
+>>>>>>> b22734fb5e2c... Ymedia: hantro: Refactor G1 H264 code
 
-static void set_ref(struct hantro_ctx *ctx)
+void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 {
+	struct hantro_dev *vpu = ctx->dev;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	const struct hantro_h264_dec_ctrls *ctrls;
 	const struct v4l2_ctrl_h264_decode_params *dec_param;
-	const struct v4l2_ctrl_h264_slice_params *slice;
-	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	const struct v4l2_ctrl_h264_slice_params *slices;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
 	const u8 *b0_reflist, *b1_reflist, *p_reflist;
-	struct hantro_dev *vpu = ctx->dev;
-	int reg_num;
+	dma_addr_t addr;
 	u32 reg;
-	int i;
-
-	dec_param = ctx->h264_dec.ctrls.decode;
-	slice = ctx->h264_dec.ctrls.slices;
-
-	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_valid, G1_REG_VALID_REF);
-	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_longterm, G1_REG_LT_REF);
-
-	/*
-	 * Set up reference frame picture numbers.
-	 *
-	 * Each G1_REG_REF_PIC(x) register contains numbers of two
-	 * subsequential reference pictures.
-	 */
-	for (i = 0; i < HANTRO_H264_DPB_SIZE; i += 2) {
-		reg = 0;
-		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
-			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].pic_num);
-		else
-			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].frame_num);
-
-		if (dpb[i + 1].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
-			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].pic_num);
-		else
-			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].frame_num);
-
-		vdpu_write_relaxed(vpu, reg, G1_REG_REF_PIC(i / 2));
-	}
+	unsigned int offset = MV_OFFSET_420;
+
+	/* Prepare the H264 decoder context. */
+	if (hantro_h264_dec_prepare_run(ctx))
+		return;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	ctrls = &ctx->h264_dec.ctrls;
+	dec_param = ctrls->decode;
+	slices = ctrls->slices;
+	sps = ctrls->sps;
+	pps = ctrls->pps;
 
 	b0_reflist = ctx->h264_dec.reflists.b0;
 	b1_reflist = ctx->h264_dec.reflists.b1;
 	p_reflist = ctx->h264_dec.reflists.p;
 
-	/*
-	 * Each G1_REG_BD_REF_PIC(x) register contains three entries
-	 * of each forward and backward picture list.
-	 */
-	reg_num = 0;
-	for (i = 0; i < 15; i += 3) {
-		reg = G1_REG_BD_REF_PIC_BINIT_RLIST_F0(b0_reflist[i]) |
-		      G1_REG_BD_REF_PIC_BINIT_RLIST_F1(b0_reflist[i + 1]) |
-		      G1_REG_BD_REF_PIC_BINIT_RLIST_F2(b0_reflist[i + 2]) |
-		      G1_REG_BD_REF_PIC_BINIT_RLIST_B0(b1_reflist[i]) |
-		      G1_REG_BD_REF_PIC_BINIT_RLIST_B1(b1_reflist[i + 1]) |
-		      G1_REG_BD_REF_PIC_BINIT_RLIST_B2(b1_reflist[i + 2]);
-		vdpu_write_relaxed(vpu, reg, G1_REG_BD_REF_PIC(reg_num++));
-	}
-
-	/*
-	 * G1_REG_BD_P_REF_PIC register contains last entries (index 15)
-	 * of forward and backward reference picture lists and first 4 entries
-	 * of P forward picture list.
-	 */
-	reg = G1_REG_BD_P_REF_PIC_BINIT_RLIST_F15(b0_reflist[15]) |
-	      G1_REG_BD_P_REF_PIC_BINIT_RLIST_B15(b1_reflist[15]) |
-	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F0(p_reflist[0]) |
-	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F1(p_reflist[1]) |
-	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F2(p_reflist[2]) |
-	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F3(p_reflist[3]);
-	vdpu_write_relaxed(vpu, reg, G1_REG_BD_P_REF_PIC);
-
-	/*
-	 * Each G1_REG_FWD_PIC(x) register contains six consecutive
-	 * entries of P forward picture list, starting from index 4.
-	 */
-	reg_num = 0;
-	for (i = 4; i < HANTRO_H264_DPB_SIZE; i += 6) {
-		reg = G1_REG_FWD_PIC_PINIT_RLIST_F0(p_reflist[i]) |
-		      G1_REG_FWD_PIC_PINIT_RLIST_F1(p_reflist[i + 1]) |
-		      G1_REG_FWD_PIC_PINIT_RLIST_F2(p_reflist[i + 2]) |
-		      G1_REG_FWD_PIC_PINIT_RLIST_F3(p_reflist[i + 3]) |
-		      G1_REG_FWD_PIC_PINIT_RLIST_F4(p_reflist[i + 4]) |
-		      G1_REG_FWD_PIC_PINIT_RLIST_F5(p_reflist[i + 5]);
-		vdpu_write_relaxed(vpu, reg, G1_REG_FWD_PIC(reg_num++));
-	}
-
-	/* Set up addresses of DPB buffers. */
-	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
-		dma_addr_t addr = hantro_h264_get_ref_dma_addr(ctx, i);
-		vdpu_write_relaxed(vpu, addr, G1_REG_ADDR_REF(i));
-	}
-}
+	reg = G1_REG_DEC_AXI_RD_ID(0xff) |
+	      G1_REG_DEC_TIMEOUT_E(1) |
+	      G1_REG_DEC_STRSWAP32_E(1) |
+	      G1_REG_DEC_STRENDIAN_E(1) |
+	      G1_REG_DEC_INSWAP32_E(1) |
+	      G1_REG_DEC_OUTSWAP32_E(1) |
+	      G1_REG_DEC_DATA_DISC_E(0) |
+	      G1_REG_DEC_LATENCY(0) |
+	      G1_REG_DEC_CLK_GATE_E(1) |
+	      G1_REG_DEC_IN_ENDIAN(0) |
+	      G1_REG_DEC_OUT_ENDIAN(1) |
+	      G1_REG_DEC_ADV_PRE_DIS(0) |
+	      G1_REG_DEC_SCMD_DIS(0) |
+	      G1_REG_DEC_MAX_BURST(16);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(2));
+
+	reg = G1_REG_DEC_MODE(0) |
+	      G1_REG_RLC_MODE_E(0) |
+	      G1_REG_PIC_INTERLACE_E(!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY) && (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD || slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)) |
+	      G1_REG_PIC_FIELDMODE_E(slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) |
+	      G1_REG_PIC_TOPFIELD_E(!(slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)) |
+	      G1_REG_FILTERING_DIS(0) |
+	      G1_REG_PIC_FIXED_QUANT(0) |
+	      G1_REG_WRITE_MVS_E(dec_param->nal_ref_idc) |
+	      G1_REG_SEQ_MBAFF_E(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD) |
+	      G1_REG_PICORD_COUNT_E(1) |
+	      G1_REG_DEC_AXI_WR_ID(0);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(3));
+
+	reg = G1_REG_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
+	      G1_REG_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
+	      G1_REG_REF_FRAMES(sps->max_num_ref_frames);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(4));
+
+	reg = G1_REG_STRM_START_BIT(0) |
+	      G1_REG_TYPE1_QUANT_E(1) |
+	      G1_REG_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
+	      G1_REG_CH_QP_OFFSET2(pps->second_chroma_qp_index_offset) |
+	      G1_REG_FIELDPIC_FLAG_E(!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(5));
+
+	reg = G1_REG_START_CODE_E(1) |
+	      G1_REG_INIT_QP(pps->pic_init_qp_minus26 + 26) |
+	      G1_REG_CH_8PIX_ILEAV_E(0) |
+	      G1_REG_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(6));
+
+	reg = G1_REG_CABAC_E(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE) |
+	      G1_REG_BLACKWHITE_E(sps->profile_idc >= 100 && sps->chroma_format_idc == 0) |
+	      G1_REG_DIR_8X8_INFER_E(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE) |
+	      G1_REG_WEIGHT_PRED_E(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED) |
+	      G1_REG_WEIGHT_BIPR_IDC(pps->weighted_bipred_idc) |
+	      G1_REG_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
+	      G1_REG_FRAMENUM(slices[0].frame_num);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(7));
+
+	reg = G1_REG_CONST_INTRA_E(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED) |
+	      G1_REG_FILT_CTRL_PRES(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT) |
+	      G1_REG_RDPIC_CNT_PRES(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT) |
+	      G1_REG_8X8TRANS_FLAG_E(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE) |
+	      G1_REG_REFPIC_MK_LEN(slices[0].dec_ref_pic_marking_bit_size) |
+	      G1_REG_IDR_PIC_E(dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) |
+	      G1_REG_IDR_PIC_ID(slices[0].idr_pic_id);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(8));
+
+	reg = G1_REG_PPS_ID(slices[0].pic_parameter_set_id) |
+	      G1_REG_REFIDX1_ACTIVE(pps->num_ref_idx_l1_default_active_minus1 + 1) |
+	      G1_REG_REFIDX0_ACTIVE(pps->num_ref_idx_l0_default_active_minus1 + 1) |
+	      G1_REG_POC_LENGTH(slices[0].pic_order_cnt_bit_size);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(9));
+
+	reg = G1_REG_PINIT_RLIST_F9(p_reflist[9]) |
+	      G1_REG_PINIT_RLIST_F8(p_reflist[8]) |
+	      G1_REG_PINIT_RLIST_F7(p_reflist[7]) |
+	      G1_REG_PINIT_RLIST_F6(p_reflist[6]) |
+	      G1_REG_PINIT_RLIST_F5(p_reflist[5]) |
+	      G1_REG_PINIT_RLIST_F4(p_reflist[4]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(10));
+
+	reg = G1_REG_PINIT_RLIST_F15(p_reflist[15]) |
+	      G1_REG_PINIT_RLIST_F14(p_reflist[14]) |
+	      G1_REG_PINIT_RLIST_F13(p_reflist[13]) |
+	      G1_REG_PINIT_RLIST_F12(p_reflist[12]) |
+	      G1_REG_PINIT_RLIST_F11(p_reflist[11]) |
+	      G1_REG_PINIT_RLIST_F10(p_reflist[10]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(11));
+
+	reg = G1_REG_REFER1_NBR(hantro_h264_get_ref_nbr(ctx, 1)) |
+	      G1_REG_REFER0_NBR(hantro_h264_get_ref_nbr(ctx, 0));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(30));
+
+	reg = G1_REG_REFER3_NBR(hantro_h264_get_ref_nbr(ctx, 3)) |
+	      G1_REG_REFER2_NBR(hantro_h264_get_ref_nbr(ctx, 2));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(31));
+
+	reg = G1_REG_REFER5_NBR(hantro_h264_get_ref_nbr(ctx, 5)) |
+	      G1_REG_REFER4_NBR(hantro_h264_get_ref_nbr(ctx, 4));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(32));
+
+	reg = G1_REG_REFER7_NBR(hantro_h264_get_ref_nbr(ctx, 7)) |
+	      G1_REG_REFER6_NBR(hantro_h264_get_ref_nbr(ctx, 6));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(33));
+
+	reg = G1_REG_REFER9_NBR(hantro_h264_get_ref_nbr(ctx, 9)) |
+	      G1_REG_REFER8_NBR(hantro_h264_get_ref_nbr(ctx, 8));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(34));
+
+	reg = G1_REG_REFER11_NBR(hantro_h264_get_ref_nbr(ctx, 11)) |
+	      G1_REG_REFER10_NBR(hantro_h264_get_ref_nbr(ctx, 10));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(35));
+
+	reg = G1_REG_REFER13_NBR(hantro_h264_get_ref_nbr(ctx, 13)) |
+	      G1_REG_REFER12_NBR(hantro_h264_get_ref_nbr(ctx, 12));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(36));
+
+	reg = G1_REG_REFER15_NBR(hantro_h264_get_ref_nbr(ctx, 15)) |
+	      G1_REG_REFER14_NBR(hantro_h264_get_ref_nbr(ctx, 14));
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(37));
+
+	reg = G1_REG_REFER_LTERM_E(ctx->h264_dec.dpb_longterm);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(38));
+
+	reg = G1_REG_REFER_VALID_E(ctx->h264_dec.dpb_valid);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(39));
+
+	reg = G1_REG_BINIT_RLIST_B2(b1_reflist[2]) |
+	      G1_REG_BINIT_RLIST_F2(b0_reflist[2]) |
+	      G1_REG_BINIT_RLIST_B1(b1_reflist[1]) |
+	      G1_REG_BINIT_RLIST_F1(b0_reflist[1]) |
+	      G1_REG_BINIT_RLIST_B0(b1_reflist[0]) |
+	      G1_REG_BINIT_RLIST_F0(b0_reflist[0]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(42));
+
+	reg = G1_REG_BINIT_RLIST_B5(b1_reflist[5]) |
+	      G1_REG_BINIT_RLIST_F5(b0_reflist[5]) |
+	      G1_REG_BINIT_RLIST_B4(b1_reflist[4]) |
+	      G1_REG_BINIT_RLIST_F4(b0_reflist[4]) |
+	      G1_REG_BINIT_RLIST_B3(b1_reflist[3]) |
+	      G1_REG_BINIT_RLIST_F3(b0_reflist[3]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(43));
+
+	reg = G1_REG_BINIT_RLIST_B8(b1_reflist[8]) |
+	      G1_REG_BINIT_RLIST_F8(b0_reflist[8]) |
+	      G1_REG_BINIT_RLIST_B7(b1_reflist[7]) |
+	      G1_REG_BINIT_RLIST_F7(b0_reflist[7]) |
+	      G1_REG_BINIT_RLIST_B6(b1_reflist[6]) |
+	      G1_REG_BINIT_RLIST_F6(b0_reflist[6]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(44));
+
+	reg = G1_REG_BINIT_RLIST_B11(b1_reflist[11]) |
+	      G1_REG_BINIT_RLIST_F11(b0_reflist[11]) |
+	      G1_REG_BINIT_RLIST_B10(b1_reflist[10]) |
+	      G1_REG_BINIT_RLIST_F10(b0_reflist[10]) |
+	      G1_REG_BINIT_RLIST_B9(b1_reflist[9]) |
+	      G1_REG_BINIT_RLIST_F9(b0_reflist[9]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(45));
+
+	reg = G1_REG_BINIT_RLIST_B14(b1_reflist[14]) |
+	      G1_REG_BINIT_RLIST_F14(b0_reflist[14]) |
+	      G1_REG_BINIT_RLIST_B13(b1_reflist[13]) |
+	      G1_REG_BINIT_RLIST_F13(b0_reflist[13]) |
+	      G1_REG_BINIT_RLIST_B12(b1_reflist[12]) |
+	      G1_REG_BINIT_RLIST_F12(b0_reflist[12]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(46));
+
+	reg = G1_REG_PINIT_RLIST_F3(p_reflist[3]) |
+	      G1_REG_PINIT_RLIST_F2(p_reflist[2]) |
+	      G1_REG_PINIT_RLIST_F1(p_reflist[1]) |
+	      G1_REG_PINIT_RLIST_F0(p_reflist[0]) |
+	      G1_REG_BINIT_RLIST_B15(b1_reflist[15]) |
+	      G1_REG_BINIT_RLIST_F15(b0_reflist[15]);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(47));
+
+	reg = G1_REG_STARTMB_X(0) |
+	      G1_REG_STARTMB_Y(0);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(48));
+
+	reg = G1_REG_PRED_BC_TAP_0_0(1) |
+	      G1_REG_PRED_BC_TAP_0_1((u32)-5) |
+	      G1_REG_PRED_BC_TAP_0_2(20);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(49));
+
+	reg = G1_REG_REFBU_E(0);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(51));
+
+	reg = G1_REG_APF_THRESHOLD(8);
+	vdpu_write_relaxed(vpu, reg, G1_SWREG(55));
 
-static void set_buffers(struct hantro_ctx *ctx)
-{
-	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
-	struct vb2_v4l2_buffer *src_buf, *dst_buf;
-	struct hantro_dev *vpu = ctx->dev;
-	dma_addr_t src_dma, dst_dma;
-	unsigned int offset = MV_OFFSET_420;
-
-	src_buf = hantro_get_src_buf(ctx);
-	dst_buf = hantro_get_dst_buf(ctx);
+	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
+	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_QTABLE_BASE);
 
 	/* Source (stream) buffer. */
-	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-	vdpu_write_relaxed(vpu, src_dma, G1_REG_ADDR_STR);
+	addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, addr, G1_REG_RLC_VLC_BASE);
 
 	/* Destination (decoded frame) buffer. */
-	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
 	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
-		dst_dma += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
-	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
+		addr += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
+	vdpu_write_relaxed(vpu, addr, G1_REG_DEC_OUT_BASE);
 
 	/* Motion vector buffer is located after the decoded frame. */
-	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
-	if (ctrls->sps->profile_idc >= 100 && ctrls->sps->chroma_format_idc == 0)
+	addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
 		offset = MV_OFFSET_400;
-	dst_dma += offset * H264_MB_WIDTH(ctx->dst_fmt.width) *
+	addr += offset * H264_MB_WIDTH(ctx->dst_fmt.width) *
 		   H264_MB_HEIGHT(ctx->dst_fmt.height);
 	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
-		dst_dma += 32 * H264_MB_WIDTH(ctx->dst_fmt.width) *
+		addr += 32 * H264_MB_WIDTH(ctx->dst_fmt.width) *
 			   H264_MB_HEIGHT(ctx->dst_fmt.height);
-	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DIR_MV);
-
-	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
-	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
-}
-
-void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
-{
-	struct hantro_dev *vpu = ctx->dev;
-
-	/* Prepare the H264 decoder context. */
-	if (hantro_h264_dec_prepare_run(ctx))
-		return;
-
-	/* Configure hardware registers. */
-	set_params(ctx);
-	set_ref(ctx);
-	set_buffers(ctx);
+	vdpu_write_relaxed(vpu, addr, G1_REG_DIR_MV_BASE);
+
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 0), G1_REG_REFER0_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 1), G1_REG_REFER1_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 2), G1_REG_REFER2_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 3), G1_REG_REFER3_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 4), G1_REG_REFER4_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 5), G1_REG_REFER5_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 6), G1_REG_REFER6_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 7), G1_REG_REFER7_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 8), G1_REG_REFER8_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 9), G1_REG_REFER9_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 10), G1_REG_REFER10_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 11), G1_REG_REFER11_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 12), G1_REG_REFER12_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 13), G1_REG_REFER13_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 14), G1_REG_REFER14_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 15), G1_REG_REFER15_BASE);
 
 	hantro_finish_run(ctx);
 
 	/* Start decoding! */
-	vdpu_write_relaxed(vpu,
-			   G1_REG_CONFIG_DEC_AXI_RD_ID(0xffu) |
-			   G1_REG_CONFIG_DEC_TIMEOUT_E |
-			   G1_REG_CONFIG_DEC_OUT_ENDIAN |
-			   G1_REG_CONFIG_DEC_STRENDIAN_E |
-			   G1_REG_CONFIG_DEC_MAX_BURST(16) |
-			   G1_REG_CONFIG_DEC_OUTSWAP32_E |
-			   G1_REG_CONFIG_DEC_INSWAP32_E |
-			   G1_REG_CONFIG_DEC_STRSWAP32_E |
-			   G1_REG_CONFIG_DEC_CLK_GATE_E,
-			   G1_REG_CONFIG);
-	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+	reg = G1_REG_DEC_E(1);
+	vdpu_write(vpu, reg, G1_SWREG(1));
 }
diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
index 85c86d728b1a..03b37e00b6fd 100644
--- a/drivers/staging/media/hantro/hantro_h264.c
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -591,6 +591,20 @@ dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
 	return vb2_dma_contig_plane_dma_addr(buf, 0) | flags;
 }
 
+u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx,
+			    unsigned int dpb_idx)
+{
+	const struct v4l2_h264_dpb_entry *dpb = &ctx->h264_dec.dpb[dpb_idx];
+
+	if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+		return 0;
+
+	if (dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+		return dpb->pic_num;
+
+	return dpb->frame_num;
+}
+
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
 {
 	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index d58f2a36ca40..51de2fee2233 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -161,6 +161,8 @@ struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 					   unsigned int dpb_idx);
 dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
 					unsigned int dpb_idx);
+u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx,
+			    unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
 void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 11/12] media: hantro: Enable H264 decoding on RK3399
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (8 preceding siblings ...)
  2019-09-01 12:45   ` [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399 Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-01 12:45   ` [RFC 12/12] media: hantro: Enable H264 decoding on RK3328 Jonas Karlman
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

The RK3399 SoC has two VPU blocks capable of H264 decoding, VPU2 and RKVDEC,
this enables support for H264 decoding using the VPU2 block.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/rk3399_vpu_hw.c | 21 +++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw.c b/drivers/staging/media/hantro/rk3399_vpu_hw.c
index 14d14bc6b12b..47ca51b75a0d 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw.c
@@ -60,6 +60,19 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
 		.fourcc = V4L2_PIX_FMT_NV12,
 		.codec_mode = HANTRO_MODE_NONE,
 	},
+	{
+		.fourcc = V4L2_PIX_FMT_H264_SLICE,
+		.codec_mode = HANTRO_MODE_H264_DEC,
+		.max_depth = 2,
+		.frmsize = {
+			.min_width = 48,
+			.max_width = 4096,
+			.step_width = H264_MB_DIM,
+			.min_height = 48,
+			.max_height = 2304,
+			.step_height = H264_MB_DIM,
+		},
+	},
 	{
 		.fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
 		.codec_mode = HANTRO_MODE_MPEG2_DEC,
@@ -161,6 +174,12 @@ static const struct hantro_codec_ops rk3399_vpu_codec_ops[] = {
 		.init = hantro_jpeg_enc_init,
 		.exit = hantro_jpeg_enc_exit,
 	},
+	[HANTRO_MODE_H264_DEC] = {
+		.run = rk3399_vpu_h264_dec_run,
+		.reset = rk3399_vpu_dec_reset,
+		.init = hantro_h264_dec_init,
+		.exit = hantro_h264_dec_exit,
+	},
 	[HANTRO_MODE_MPEG2_DEC] = {
 		.run = rk3399_vpu_mpeg2_dec_run,
 		.reset = rk3399_vpu_dec_reset,
@@ -196,7 +215,7 @@ const struct hantro_variant rk3399_vpu_variant = {
 	.dec_fmts = rk3399_vpu_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
 	.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
-		 HANTRO_VP8_DECODER,
+		 HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
 	.codec_ops = rk3399_vpu_codec_ops,
 	.irqs = rk3399_irqs,
 	.num_irqs = ARRAY_SIZE(rk3399_irqs),
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (7 preceding siblings ...)
  2019-09-01 12:45   ` [RFC 09/12] media: hantro: Refactor G1 H264 code Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  2019-09-02 11:46     ` Hans Verkuil
  2019-09-01 12:45   ` [RFC 11/12] media: hantro: Enable " Jonas Karlman
  2019-09-01 12:45   ` [RFC 12/12] media: hantro: Enable H264 decoding on RK3328 Jonas Karlman
  10 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

Rockchip RK3399 SoC has the same Hantro G1 IP block
as RK3288, but the registers are entirely different.

In a similar fashion as MPEG-2 and VP8 decoding,
it's simpler to just add a separate implementation.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/Makefile         |   1 +
 .../staging/media/hantro/hantro_g1_h264_dec.c |   1 -
 drivers/staging/media/hantro/hantro_hw.h      |   1 +
 .../media/hantro/rk3399_vpu_hw_h264_dec.c     | 486 ++++++++++++++++++
 4 files changed, 488 insertions(+), 1 deletion(-)
 create mode 100644 drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 5d6b0383d280..8d33b0e8aa6c 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -8,6 +8,7 @@ hantro-vpu-y += \
 		hantro_g1_mpeg2_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
+		rk3399_vpu_hw_h264_dec.o \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		rk3399_vpu_hw_vp8_dec.o \
 		hantro_jpeg.o \
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 4b82b9fd5252..ec2736fb473d 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -202,7 +202,6 @@
 #define G1_REG_REFBU_E(v)		((v) ? BIT(31) : 0)
 
 #define G1_REG_APF_THRESHOLD(v)		(((v) << 0) & GENMASK(13, 0))
->>>>>>> b22734fb5e2c... Ymedia: hantro: Refactor G1 H264 code
 
 void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 {
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 51de2fee2233..00161a4f22ac 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -165,6 +165,7 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx,
 			    unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
 void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
+void rk3399_vpu_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c
new file mode 100644
index 000000000000..8e480a68ca3d
--- /dev/null
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c
@@ -0,0 +1,486 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU codec driver
+ *
+ * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
+ *	Hertz Wong <hertz.wong@rock-chips.com>
+ *	Herman Chen <herman.chen@rock-chips.com>
+ *
+ * Copyright (C) 2014 Google, Inc.
+ *	Tomasz Figa <tfiga@chromium.org>
+ */
+
+#include <linux/types.h>
+#include <linux/sort.h>
+
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro_hw.h"
+#include "hantro_v4l2.h"
+
+#define MV_OFFSET_420	384
+#define MV_OFFSET_400	256
+
+#define VDPU_SWREG(nr)			((nr) * 4)
+
+#define VDPU_REG_DEC_OUT_BASE		VDPU_SWREG(63)
+#define VDPU_REG_RLC_VLC_BASE		VDPU_SWREG(64)
+#define VDPU_REG_QTABLE_BASE		VDPU_SWREG(61)
+#define VDPU_REG_DIR_MV_BASE		VDPU_SWREG(62)
+#define VDPU_REG_REFER0_BASE		VDPU_SWREG(84)
+#define VDPU_REG_REFER1_BASE		VDPU_SWREG(85)
+#define VDPU_REG_REFER2_BASE		VDPU_SWREG(86)
+#define VDPU_REG_REFER3_BASE		VDPU_SWREG(87)
+#define VDPU_REG_REFER4_BASE		VDPU_SWREG(88)
+#define VDPU_REG_REFER5_BASE		VDPU_SWREG(89)
+#define VDPU_REG_REFER6_BASE		VDPU_SWREG(90)
+#define VDPU_REG_REFER7_BASE		VDPU_SWREG(91)
+#define VDPU_REG_REFER8_BASE		VDPU_SWREG(92)
+#define VDPU_REG_REFER9_BASE		VDPU_SWREG(93)
+#define VDPU_REG_REFER10_BASE		VDPU_SWREG(94)
+#define VDPU_REG_REFER11_BASE		VDPU_SWREG(95)
+#define VDPU_REG_REFER12_BASE		VDPU_SWREG(96)
+#define VDPU_REG_REFER13_BASE		VDPU_SWREG(97)
+#define VDPU_REG_REFER14_BASE		VDPU_SWREG(98)
+#define VDPU_REG_REFER15_BASE		VDPU_SWREG(99)
+#define VDPU_REG_DEC_E(v)		((v) ? BIT(0) : 0)
+
+#define VDPU_REG_DEC_ADV_PRE_DIS(v)	((v) ? BIT(11) : 0)
+#define VDPU_REG_DEC_SCMD_DIS(v)	((v) ? BIT(10) : 0)
+#define VDPU_REG_FILTERING_DIS(v)	((v) ? BIT(8) : 0)
+#define VDPU_REG_PIC_FIXED_QUANT(v)	((v) ? BIT(7) : 0)
+#define VDPU_REG_DEC_LATENCY(v)		(((v) << 1) & GENMASK(6, 1))
+
+#define VDPU_REG_INIT_QP(v)		(((v) << 25) & GENMASK(30, 25))
+#define VDPU_REG_STREAM_LEN(v)		(((v) << 0) & GENMASK(23, 0))
+
+#define VDPU_REG_APF_THRESHOLD(v)	(((v) << 17) & GENMASK(30, 17))
+#define VDPU_REG_STARTMB_X(v)		(((v) << 8) & GENMASK(16, 8))
+#define VDPU_REG_STARTMB_Y(v)		(((v) << 0) & GENMASK(7, 0))
+
+#define VDPU_REG_DEC_MODE(v)		(((v) << 0) & GENMASK(3, 0))
+
+#define VDPU_REG_DEC_STRENDIAN_E(v)	((v) ? BIT(5) : 0)
+#define VDPU_REG_DEC_STRSWAP32_E(v)	((v) ? BIT(4) : 0)
+#define VDPU_REG_DEC_OUTSWAP32_E(v)	((v) ? BIT(3) : 0)
+#define VDPU_REG_DEC_INSWAP32_E(v)	((v) ? BIT(2) : 0)
+#define VDPU_REG_DEC_OUT_ENDIAN(v)	((v) ? BIT(1) : 0)
+#define VDPU_REG_DEC_IN_ENDIAN(v)	((v) ? BIT(0) : 0)
+
+#define VDPU_REG_DEC_DATA_DISC_E(v)	((v) ? BIT(22) : 0)
+#define VDPU_REG_DEC_MAX_BURST(v)	(((v) << 16) & GENMASK(20, 16))
+#define VDPU_REG_DEC_AXI_WR_ID(v)	(((v) << 8) & GENMASK(15, 8))
+#define VDPU_REG_DEC_AXI_RD_ID(v)	(((v) << 0) & GENMASK(7, 0))
+
+#define VDPU_REG_START_CODE_E(v)	((v) ? BIT(22) : 0)
+#define VDPU_REG_CH_8PIX_ILEAV_E(v)	((v) ? BIT(21) : 0)
+#define VDPU_REG_RLC_MODE_E(v)		((v) ? BIT(20) : 0)
+#define VDPU_REG_PIC_INTERLACE_E(v)	((v) ? BIT(17) : 0)
+#define VDPU_REG_PIC_FIELDMODE_E(v)	((v) ? BIT(16) : 0)
+#define VDPU_REG_PIC_TOPFIELD_E(v)	((v) ? BIT(13) : 0)
+#define VDPU_REG_WRITE_MVS_E(v)		((v) ? BIT(10) : 0)
+#define VDPU_REG_SEQ_MBAFF_E(v)		((v) ? BIT(7) : 0)
+#define VDPU_REG_PICORD_COUNT_E(v)	((v) ? BIT(6) : 0)
+#define VDPU_REG_DEC_TIMEOUT_E(v)	((v) ? BIT(5) : 0)
+#define VDPU_REG_DEC_CLK_GATE_E(v)	((v) ? BIT(4) : 0)
+
+#define VDPU_REG_PRED_BC_TAP_0_0(v)	(((v) << 22) & GENMASK(31, 22))
+#define VDPU_REG_PRED_BC_TAP_0_1(v)	(((v) << 12) & GENMASK(21, 12))
+#define VDPU_REG_PRED_BC_TAP_0_2(v)	(((v) << 2) & GENMASK(11, 2))
+
+#define VDPU_REG_REFBU_E(v)		((v) ? BIT(31) : 0)
+
+#define VDPU_REG_PINIT_RLIST_F9(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_PINIT_RLIST_F8(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_PINIT_RLIST_F7(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_PINIT_RLIST_F6(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_PINIT_RLIST_F5(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_PINIT_RLIST_F4(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_PINIT_RLIST_F15(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_PINIT_RLIST_F14(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_PINIT_RLIST_F13(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_PINIT_RLIST_F12(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_PINIT_RLIST_F11(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_PINIT_RLIST_F10(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_REFER1_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER0_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER3_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER2_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER5_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER4_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER7_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER6_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER9_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER8_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER11_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER10_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER13_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER12_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFER15_NBR(v)		(((v) << 16) & GENMASK(31, 16))
+#define VDPU_REG_REFER14_NBR(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_BINIT_RLIST_F5(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_BINIT_RLIST_F4(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_BINIT_RLIST_F3(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_F2(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_F1(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_F0(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_BINIT_RLIST_F11(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_BINIT_RLIST_F10(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_BINIT_RLIST_F9(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_F8(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_F7(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_F6(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_BINIT_RLIST_F15(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_F14(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_F13(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_F12(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_BINIT_RLIST_B5(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_BINIT_RLIST_B4(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_BINIT_RLIST_B3(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_B2(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_B1(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_B0(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_BINIT_RLIST_B11(v)	(((v) << 25) & GENMASK(29, 25))
+#define VDPU_REG_BINIT_RLIST_B10(v)	(((v) << 20) & GENMASK(24, 20))
+#define VDPU_REG_BINIT_RLIST_B9(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_B8(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_B7(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_B6(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_BINIT_RLIST_B15(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_BINIT_RLIST_B14(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_BINIT_RLIST_B13(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_BINIT_RLIST_B12(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_PINIT_RLIST_F3(v)	(((v) << 15) & GENMASK(19, 15))
+#define VDPU_REG_PINIT_RLIST_F2(v)	(((v) << 10) & GENMASK(14, 10))
+#define VDPU_REG_PINIT_RLIST_F1(v)	(((v) << 5) & GENMASK(9, 5))
+#define VDPU_REG_PINIT_RLIST_F0(v)	(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_REFER_LTERM_E(v)	(((v) << 0) & GENMASK(31, 0))
+
+#define VDPU_REG_REFER_VALID_E(v)	(((v) << 0) & GENMASK(31, 0))
+
+#define VDPU_REG_STRM_START_BIT(v)	(((v) << 0) & GENMASK(5, 0))
+
+#define VDPU_REG_CH_QP_OFFSET2(v)	(((v) << 22) & GENMASK(26, 22))
+#define VDPU_REG_CH_QP_OFFSET(v)	(((v) << 17) & GENMASK(21, 17))
+#define VDPU_REG_PIC_MB_HEIGHT_P(v)	(((v) << 9) & GENMASK(16, 9))
+#define VDPU_REG_PIC_MB_WIDTH(v)	(((v) << 0) & GENMASK(8, 0))
+
+#define VDPU_REG_WEIGHT_BIPR_IDC(v)	(((v) << 16) & GENMASK(17, 16))
+#define VDPU_REG_REF_FRAMES(v)		(((v) << 0) & GENMASK(4, 0))
+
+#define VDPU_REG_FILT_CTRL_PRES(v)	((v) ? BIT(31) : 0)
+#define VDPU_REG_RDPIC_CNT_PRES(v)	((v) ? BIT(30) : 0)
+#define VDPU_REG_FRAMENUM_LEN(v)	(((v) << 16) & GENMASK(20, 16))
+#define VDPU_REG_FRAMENUM(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_REFPIC_MK_LEN(v)	(((v) << 16) & GENMASK(26, 16))
+#define VDPU_REG_IDR_PIC_ID(v)		(((v) << 0) & GENMASK(15, 0))
+
+#define VDPU_REG_PPS_ID(v)		(((v) << 24) & GENMASK(31, 24))
+#define VDPU_REG_REFIDX1_ACTIVE(v)	(((v) << 19) & GENMASK(23, 19))
+#define VDPU_REG_REFIDX0_ACTIVE(v)	(((v) << 14) & GENMASK(18, 14))
+#define VDPU_REG_POC_LENGTH(v)		(((v) << 0) & GENMASK(7, 0))
+
+#define VDPU_REG_IDR_PIC_E(v)		((v) ? BIT(8) : 0)
+#define VDPU_REG_DIR_8X8_INFER_E(v)	((v) ? BIT(7) : 0)
+#define VDPU_REG_BLACKWHITE_E(v)	((v) ? BIT(6) : 0)
+#define VDPU_REG_CABAC_E(v)		((v) ? BIT(5) : 0)
+#define VDPU_REG_WEIGHT_PRED_E(v)	((v) ? BIT(4) : 0)
+#define VDPU_REG_CONST_INTRA_E(v)	((v) ? BIT(3) : 0)
+#define VDPU_REG_8X8TRANS_FLAG_E(v)	((v) ? BIT(2) : 0)
+#define VDPU_REG_TYPE1_QUANT_E(v)	((v) ? BIT(1) : 0)
+#define VDPU_REG_FIELDPIC_FLAG_E(v)	((v) ? BIT(0) : 0)
+
+void rk3399_vpu_h264_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	const struct hantro_h264_dec_ctrls *ctrls;
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	const struct v4l2_ctrl_h264_slice_params *slices;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
+	const u8 *b0_reflist, *b1_reflist, *p_reflist;
+	dma_addr_t addr;
+	u32 reg;
+	unsigned int offset = MV_OFFSET_420;
+
+	/* Prepare the H264 decoder context. */
+	if (hantro_h264_dec_prepare_run(ctx))
+		return;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	ctrls = &ctx->h264_dec.ctrls;
+	dec_param = ctrls->decode;
+	slices = ctrls->slices;
+	sps = ctrls->sps;
+	pps = ctrls->pps;
+
+	b0_reflist = ctx->h264_dec.reflists.b0;
+	b1_reflist = ctx->h264_dec.reflists.b1;
+	p_reflist = ctx->h264_dec.reflists.p;
+
+	reg = VDPU_REG_DEC_ADV_PRE_DIS(0) |
+	      VDPU_REG_DEC_SCMD_DIS(0) |
+	      VDPU_REG_FILTERING_DIS(0) |
+	      VDPU_REG_PIC_FIXED_QUANT(0) |
+	      VDPU_REG_DEC_LATENCY(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(50));
+
+	reg = VDPU_REG_INIT_QP(pps->pic_init_qp_minus26 + 26) |
+	      VDPU_REG_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(51));
+
+	reg = VDPU_REG_APF_THRESHOLD(8) |
+	      VDPU_REG_STARTMB_X(0) |
+	      VDPU_REG_STARTMB_Y(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(52));
+
+	reg = VDPU_REG_DEC_MODE(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(53));
+
+	reg = VDPU_REG_DEC_STRENDIAN_E(1) |
+	      VDPU_REG_DEC_STRSWAP32_E(1) |
+	      VDPU_REG_DEC_OUTSWAP32_E(1) |
+	      VDPU_REG_DEC_INSWAP32_E(1) |
+	      VDPU_REG_DEC_OUT_ENDIAN(1) |
+	      VDPU_REG_DEC_IN_ENDIAN(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(54));
+
+	reg = VDPU_REG_DEC_DATA_DISC_E(0) |
+	      VDPU_REG_DEC_MAX_BURST(16) |
+	      VDPU_REG_DEC_AXI_WR_ID(0) |
+	      VDPU_REG_DEC_AXI_RD_ID(0xff);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(56));
+
+	reg = VDPU_REG_START_CODE_E(1) |
+	      VDPU_REG_CH_8PIX_ILEAV_E(0) |
+	      VDPU_REG_RLC_MODE_E(0) |
+	      VDPU_REG_PIC_INTERLACE_E(!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY) && (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD || slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)) |
+	      VDPU_REG_PIC_FIELDMODE_E(slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) |
+	      VDPU_REG_PIC_TOPFIELD_E(!(slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)) |
+	      VDPU_REG_WRITE_MVS_E(dec_param->nal_ref_idc) |
+	      VDPU_REG_SEQ_MBAFF_E(sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD) |
+	      VDPU_REG_PICORD_COUNT_E(1) |
+	      VDPU_REG_DEC_TIMEOUT_E(1) |
+	      VDPU_REG_DEC_CLK_GATE_E(1);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(57));
+
+	reg = VDPU_REG_PRED_BC_TAP_0_0(1) |
+	      VDPU_REG_PRED_BC_TAP_0_1((u32)-5) |
+	      VDPU_REG_PRED_BC_TAP_0_2(20);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(59));
+
+	reg = VDPU_REG_REFBU_E(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(65));
+
+	reg = VDPU_REG_PINIT_RLIST_F9(p_reflist[9]) |
+	      VDPU_REG_PINIT_RLIST_F8(p_reflist[8]) |
+	      VDPU_REG_PINIT_RLIST_F7(p_reflist[7]) |
+	      VDPU_REG_PINIT_RLIST_F6(p_reflist[6]) |
+	      VDPU_REG_PINIT_RLIST_F5(p_reflist[5]) |
+	      VDPU_REG_PINIT_RLIST_F4(p_reflist[4]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(74));
+
+	reg = VDPU_REG_PINIT_RLIST_F15(p_reflist[15]) |
+	      VDPU_REG_PINIT_RLIST_F14(p_reflist[14]) |
+	      VDPU_REG_PINIT_RLIST_F13(p_reflist[13]) |
+	      VDPU_REG_PINIT_RLIST_F12(p_reflist[12]) |
+	      VDPU_REG_PINIT_RLIST_F11(p_reflist[11]) |
+	      VDPU_REG_PINIT_RLIST_F10(p_reflist[10]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(75));
+
+	reg = VDPU_REG_REFER1_NBR(hantro_h264_get_ref_nbr(ctx, 1)) |
+	      VDPU_REG_REFER0_NBR(hantro_h264_get_ref_nbr(ctx, 0));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(76));
+
+	reg = VDPU_REG_REFER3_NBR(hantro_h264_get_ref_nbr(ctx, 3)) |
+	      VDPU_REG_REFER2_NBR(hantro_h264_get_ref_nbr(ctx, 2));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(77));
+
+	reg = VDPU_REG_REFER5_NBR(hantro_h264_get_ref_nbr(ctx, 5)) |
+	      VDPU_REG_REFER4_NBR(hantro_h264_get_ref_nbr(ctx, 4));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(78));
+
+	reg = VDPU_REG_REFER7_NBR(hantro_h264_get_ref_nbr(ctx, 7)) |
+	      VDPU_REG_REFER6_NBR(hantro_h264_get_ref_nbr(ctx, 6));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(79));
+
+	reg = VDPU_REG_REFER9_NBR(hantro_h264_get_ref_nbr(ctx, 9)) |
+	      VDPU_REG_REFER8_NBR(hantro_h264_get_ref_nbr(ctx, 8));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(80));
+
+	reg = VDPU_REG_REFER11_NBR(hantro_h264_get_ref_nbr(ctx, 11)) |
+	      VDPU_REG_REFER10_NBR(hantro_h264_get_ref_nbr(ctx, 10));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(81));
+
+	reg = VDPU_REG_REFER13_NBR(hantro_h264_get_ref_nbr(ctx, 13)) |
+	      VDPU_REG_REFER12_NBR(hantro_h264_get_ref_nbr(ctx, 12));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(82));
+
+	reg = VDPU_REG_REFER15_NBR(hantro_h264_get_ref_nbr(ctx, 15)) |
+	      VDPU_REG_REFER14_NBR(hantro_h264_get_ref_nbr(ctx, 14));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(83));
+
+	reg = VDPU_REG_BINIT_RLIST_F5(b0_reflist[5]) |
+	      VDPU_REG_BINIT_RLIST_F4(b0_reflist[4]) |
+	      VDPU_REG_BINIT_RLIST_F3(b0_reflist[3]) |
+	      VDPU_REG_BINIT_RLIST_F2(b0_reflist[2]) |
+	      VDPU_REG_BINIT_RLIST_F1(b0_reflist[1]) |
+	      VDPU_REG_BINIT_RLIST_F0(b0_reflist[0]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(100));
+
+	reg = VDPU_REG_BINIT_RLIST_F11(b0_reflist[11]) |
+	      VDPU_REG_BINIT_RLIST_F10(b0_reflist[10]) |
+	      VDPU_REG_BINIT_RLIST_F9(b0_reflist[9]) |
+	      VDPU_REG_BINIT_RLIST_F8(b0_reflist[8]) |
+	      VDPU_REG_BINIT_RLIST_F7(b0_reflist[7]) |
+	      VDPU_REG_BINIT_RLIST_F6(b0_reflist[6]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(101));
+
+	reg = VDPU_REG_BINIT_RLIST_F15(b0_reflist[15]) |
+	      VDPU_REG_BINIT_RLIST_F14(b0_reflist[14]) |
+	      VDPU_REG_BINIT_RLIST_F13(b0_reflist[13]) |
+	      VDPU_REG_BINIT_RLIST_F12(b0_reflist[12]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(102));
+
+	reg = VDPU_REG_BINIT_RLIST_B5(b1_reflist[5]) |
+	      VDPU_REG_BINIT_RLIST_B4(b1_reflist[4]) |
+	      VDPU_REG_BINIT_RLIST_B3(b1_reflist[3]) |
+	      VDPU_REG_BINIT_RLIST_B2(b1_reflist[2]) |
+	      VDPU_REG_BINIT_RLIST_B1(b1_reflist[1]) |
+	      VDPU_REG_BINIT_RLIST_B0(b1_reflist[0]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(103));
+
+	reg = VDPU_REG_BINIT_RLIST_B11(b1_reflist[11]) |
+	      VDPU_REG_BINIT_RLIST_B10(b1_reflist[10]) |
+	      VDPU_REG_BINIT_RLIST_B9(b1_reflist[9]) |
+	      VDPU_REG_BINIT_RLIST_B8(b1_reflist[8]) |
+	      VDPU_REG_BINIT_RLIST_B7(b1_reflist[7]) |
+	      VDPU_REG_BINIT_RLIST_B6(b1_reflist[6]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(104));
+
+	reg = VDPU_REG_BINIT_RLIST_B15(b1_reflist[15]) |
+	      VDPU_REG_BINIT_RLIST_B14(b1_reflist[14]) |
+	      VDPU_REG_BINIT_RLIST_B13(b1_reflist[13]) |
+	      VDPU_REG_BINIT_RLIST_B12(b1_reflist[12]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(105));
+
+	reg = VDPU_REG_PINIT_RLIST_F3(p_reflist[3]) |
+	      VDPU_REG_PINIT_RLIST_F2(p_reflist[2]) |
+	      VDPU_REG_PINIT_RLIST_F1(p_reflist[1]) |
+	      VDPU_REG_PINIT_RLIST_F0(p_reflist[0]);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(106));
+
+	reg = VDPU_REG_REFER_LTERM_E(ctx->h264_dec.dpb_longterm);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(107));
+
+	reg = VDPU_REG_REFER_VALID_E(ctx->h264_dec.dpb_valid);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(108));
+
+	reg = VDPU_REG_STRM_START_BIT(0);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(109));
+
+	reg = VDPU_REG_CH_QP_OFFSET2(pps->second_chroma_qp_index_offset) |
+	      VDPU_REG_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
+	      VDPU_REG_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
+	      VDPU_REG_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(110));
+
+	reg = VDPU_REG_WEIGHT_BIPR_IDC(pps->weighted_bipred_idc) |
+	      VDPU_REG_REF_FRAMES(sps->max_num_ref_frames);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(111));
+
+	reg = VDPU_REG_FILT_CTRL_PRES(pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT) |
+	      VDPU_REG_RDPIC_CNT_PRES(pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT) |
+	      VDPU_REG_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
+	      VDPU_REG_FRAMENUM(slices[0].frame_num);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(112));
+
+	reg = VDPU_REG_REFPIC_MK_LEN(slices[0].dec_ref_pic_marking_bit_size) |
+	      VDPU_REG_IDR_PIC_ID(slices[0].idr_pic_id);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(113));
+
+	reg = VDPU_REG_PPS_ID(slices[0].pic_parameter_set_id) |
+	      VDPU_REG_REFIDX1_ACTIVE(pps->num_ref_idx_l1_default_active_minus1 + 1) |
+	      VDPU_REG_REFIDX0_ACTIVE(pps->num_ref_idx_l0_default_active_minus1 + 1) |
+	      VDPU_REG_POC_LENGTH(slices[0].pic_order_cnt_bit_size);
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(114));
+
+	reg = VDPU_REG_IDR_PIC_E(dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) |
+	      VDPU_REG_DIR_8X8_INFER_E(sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE) |
+	      VDPU_REG_BLACKWHITE_E(sps->profile_idc >= 100 && sps->chroma_format_idc == 0) |
+	      VDPU_REG_CABAC_E(pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE) |
+	      VDPU_REG_WEIGHT_PRED_E(pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED) |
+	      VDPU_REG_CONST_INTRA_E(pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED) |
+	      VDPU_REG_8X8TRANS_FLAG_E(pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE) |
+	      VDPU_REG_TYPE1_QUANT_E(1) |
+	      VDPU_REG_FIELDPIC_FLAG_E(!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY));
+	vdpu_write_relaxed(vpu, reg, VDPU_SWREG(115));
+
+	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
+	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, VDPU_REG_QTABLE_BASE);
+
+	/* Source (stream) buffer. */
+	addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, addr, VDPU_REG_RLC_VLC_BASE);
+
+	/* Destination (decoded frame) buffer. */
+	addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+		addr += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
+	vdpu_write_relaxed(vpu, addr, VDPU_REG_DEC_OUT_BASE);
+
+	/* Motion vector buffer is located after the decoded frame. */
+	addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
+		offset = MV_OFFSET_400;
+	addr += offset * H264_MB_WIDTH(ctx->dst_fmt.width) *
+		   H264_MB_HEIGHT(ctx->dst_fmt.height);
+	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+		addr += 32 * H264_MB_WIDTH(ctx->dst_fmt.width) *
+			   H264_MB_HEIGHT(ctx->dst_fmt.height);
+	vdpu_write_relaxed(vpu, addr, VDPU_REG_DIR_MV_BASE);
+
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 0), VDPU_REG_REFER0_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 1), VDPU_REG_REFER1_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 2), VDPU_REG_REFER2_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 3), VDPU_REG_REFER3_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 4), VDPU_REG_REFER4_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 5), VDPU_REG_REFER5_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 6), VDPU_REG_REFER6_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 7), VDPU_REG_REFER7_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 8), VDPU_REG_REFER8_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 9), VDPU_REG_REFER9_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 10), VDPU_REG_REFER10_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 11), VDPU_REG_REFER11_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 12), VDPU_REG_REFER12_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 13), VDPU_REG_REFER13_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 14), VDPU_REG_REFER14_BASE);
+	vdpu_write_relaxed(vpu, hantro_h264_get_ref_dma_addr(ctx, 15), VDPU_REG_REFER15_BASE);
+
+	hantro_finish_run(ctx);
+
+	/* Start decoding! */
+	reg = vdpu_read(vpu, VDPU_SWREG(57)) | VDPU_REG_DEC_E(1);
+	vdpu_write(vpu, reg, VDPU_SWREG(57));
+}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC 12/12] media: hantro: Enable H264 decoding on RK3328
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
                     ` (9 preceding siblings ...)
  2019-09-01 12:45   ` [RFC 11/12] media: hantro: Enable " Jonas Karlman
@ 2019-09-01 12:45   ` Jonas Karlman
  10 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-01 12:45 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel, Jonas Karlman

RK3328 SoC has the same decoder IP block as RK3399,
lets enable H264 decoding on RK3328.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 drivers/staging/media/hantro/rk3399_vpu_hw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw.c b/drivers/staging/media/hantro/rk3399_vpu_hw.c
index 47ca51b75a0d..08b965129377 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw.c
@@ -232,7 +232,8 @@ const struct hantro_variant rk3328_vpu_variant = {
 	.dec_offset = 0x400,
 	.dec_fmts = rk3399_vpu_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
-	.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER,
+	.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
+		 HANTRO_H264_DECODER,
 	.codec_ops = rk3399_vpu_codec_ops,
 	.irqs = rk3328_irqs,
 	.num_irqs = ARRAY_SIZE(rk3328_irqs),
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399
  2019-09-01 12:45   ` [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399 Jonas Karlman
@ 2019-09-02 11:46     ` Hans Verkuil
  2019-09-02 15:25       ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Hans Verkuil @ 2019-09-02 11:46 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Boris Brezillon, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 9/1/19 2:45 PM, Jonas Karlman wrote:
> Rockchip RK3399 SoC has the same Hantro G1 IP block
> as RK3288, but the registers are entirely different.
> 
> In a similar fashion as MPEG-2 and VP8 decoding,
> it's simpler to just add a separate implementation.
> 
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  drivers/staging/media/hantro/Makefile         |   1 +
>  .../staging/media/hantro/hantro_g1_h264_dec.c |   1 -
>  drivers/staging/media/hantro/hantro_hw.h      |   1 +
>  .../media/hantro/rk3399_vpu_hw_h264_dec.c     | 486 ++++++++++++++++++
>  4 files changed, 488 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c
> 
> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> index 5d6b0383d280..8d33b0e8aa6c 100644
> --- a/drivers/staging/media/hantro/Makefile
> +++ b/drivers/staging/media/hantro/Makefile
> @@ -8,6 +8,7 @@ hantro-vpu-y += \
>  		hantro_g1_mpeg2_dec.o \
>  		hantro_g1_vp8_dec.o \
>  		rk3399_vpu_hw_jpeg_enc.o \
> +		rk3399_vpu_hw_h264_dec.o \
>  		rk3399_vpu_hw_mpeg2_dec.o \
>  		rk3399_vpu_hw_vp8_dec.o \
>  		hantro_jpeg.o \
> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 4b82b9fd5252..ec2736fb473d 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -202,7 +202,6 @@
>  #define G1_REG_REFBU_E(v)		((v) ? BIT(31) : 0)
>  
>  #define G1_REG_APF_THRESHOLD(v)		(((v) << 0) & GENMASK(13, 0))
> ->>>>>>> b22734fb5e2c... Ymedia: hantro: Refactor G1 H264 code

^^^^^^^^^^^^^^^

Seems to be a left-over from patch 9?

Regards,

	Hans

>  
>  void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>  {

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH RFC 00/12] media: hantro: H264 fixes and improvements
  2019-09-01 12:42 [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Jonas Karlman
  2019-09-01 12:45 ` [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288 Jonas Karlman
       [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
@ 2019-09-02 13:02 ` Ezequiel Garcia
  2019-09-02 16:28   ` Jonas Karlman
  2 siblings, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2019-09-02 13:02 UTC (permalink / raw)
  To: Jonas Karlman
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

Hi Jonas,

Thanks for the series, I'll be reviewing this shortly.

On Sun, 2019-09-01 at 12:42 +0000, Jonas Karlman wrote:
> This series contains fixes and improvements for the hantro H264 decoder.
> 
> Patch 1-6 fixes issues and limitations observed when preparing support
> for field encoded content.
> 
> Patch 7 introduce new DPB entry flags that is used to signal how a reference
> frame is referenced. This information is needed to correctly build a
> reference list for field encoded content.
> 
> Patch 8 adds bits to handle field encoded content, this is a rough patch
> and should be reworked with proper code style and formatting.
> Please get back with feedback on how to improve this.
> 
> The following samples from [1] are now playable with patch 1-8
> - H264_1080i-25-interlace_Kaesescheibchen.mkv
> - H264_10_1080i_50_AC3-Astra19.2_ProSieben_HD.ts
> - big_buck_bunny_1080p_H264_AAC_25fps_7200K.mp4
> - h264_tivo_sample.ts
> 
> The rest of the patches refactors G1 H264 code to more closely match
> the code generated by my rockchip-vpu-regtool at [2] and then adds
> support for H264 decoding on RK3399/RK3328 using the VPU2 block.
> This code is early work and needs proper code style and formatting,
> I just wanted to share the early work and get some initial feedback.
> 
> This series has been tested using ffmpeg v4l2 request hwaccel at [3] [4]
> 

What boards have you tested this on?

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-01 12:45   ` [PATCH 02/12] media: hantro: Do not reorder H264 scaling list Jonas Karlman
@ 2019-09-02 14:00     ` Philipp Zabel
  2019-09-02 16:18       ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Philipp Zabel @ 2019-09-02 14:00 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

Hi Jonas,

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> Scaling list supplied from userspace using ffmpeg and libva-v4l2-request
> is already in matrix order and can be used without applying the inverse
> scanning process.

"in matrix order" is equivalent to "in raster scan order"?

Could you add this requirement to the
V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX documentation?

> The HW also only support 8x8 scaling list for the Y component, indices 0
> and 3 in the scaling list supplied from userspace.
> 
> Remove reordering and write the scaling matrix in an order expected by
> the VPU, also only allocate memory for the two 8x8 lists used.
> 
> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  drivers/staging/media/hantro/hantro_h264.c | 64 +++++++---------------
>  1 file changed, 20 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
> index 0d758e0c0f99..e2d01145ac4f 100644
> --- a/drivers/staging/media/hantro/hantro_h264.c
> +++ b/drivers/staging/media/hantro/hantro_h264.c
> @@ -20,7 +20,7 @@
>  /* Size with u32 units. */
>  #define CABAC_INIT_BUFFER_SIZE		(460 * 2)
>  #define POC_BUFFER_SIZE			34
> -#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
> +#define SCALING_LIST_SIZE		(6 * 16 + 2 * 64)

This changes the size of struct hantro_h264_dec_priv_tbl. Did this
describe the auxiliary buffer format incorrectly before?

>  #define POC_CMP(p0, p1) ((p0) < (p1) ? -1 : 1)
>  
> @@ -194,57 +194,33 @@ static const u32 h264_cabac_table[] = {
>  	0x1f0c2517, 0x1f261440
>  };
>  
> -/*
> - * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
> - * to get the values in matrix order. In addition, the hardware requires bytes
> - * swapped within each subsequent 4 bytes. Both arrays below include both
> - * transformations.
> - */
> -static const u32 zig_zag_4x4[] = {
> -	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
> -};
> -
> -static const u32 zig_zag_8x8[] = {
> -	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
> -	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
> -	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
> -	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
> -};
> -
>  static void
>  reorder_scaling_list(struct hantro_ctx *ctx)
>  {
>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>  	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
> -	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
> -	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
> -	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
> -	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
>  	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
> -	u8 *dst = tbl->scaling_list;
> -	const u8 *src;
> -	int i, j;
> -
> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
> -	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
> -		     num_list_4x4 * list_len_4x4 +
> -		     num_list_8x8 * list_len_8x8);
> -
> -	src = &scaling->scaling_list_4x4[0][0];
> -	for (i = 0; i < num_list_4x4; ++i) {
> -		for (j = 0; j < list_len_4x4; ++j)
> -			dst[zig_zag_4x4[j]] = src[j];
> -		src += list_len_4x4;
> -		dst += list_len_4x4;
> +	u32 *dst = (u32 *)tbl->scaling_list;
> +	u32 i, j, tmp;
> +
> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_4x4); i++) {
> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_4x4[0]) / 4; j++) {
> +			tmp = (scaling->scaling_list_4x4[i][4 * j + 0] << 24) |
> +			      (scaling->scaling_list_4x4[i][4 * j + 1] << 16) |
> +			      (scaling->scaling_list_4x4[i][4 * j + 2] << 8) |
> +			      (scaling->scaling_list_4x4[i][4 * j + 3]);
> +			*dst++ = tmp;
> +		}

This looks like it could use swab32().

>  	}
>  
> -	src = &scaling->scaling_list_8x8[0][0];
> -	for (i = 0; i < num_list_8x8; ++i) {
> -		for (j = 0; j < list_len_8x8; ++j)
> -			dst[zig_zag_8x8[j]] = src[j];
> -		src += list_len_8x8;
> -		dst += list_len_8x8;
> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_8x8); i += 3) {
> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_8x8[0]) / 4; j++) {
> +			tmp = (scaling->scaling_list_8x8[i][4 * j + 0] << 24) |
> +			      (scaling->scaling_list_8x8[i][4 * j + 1] << 16) |
> +			      (scaling->scaling_list_8x8[i][4 * j + 2] << 8) |
> +			      (scaling->scaling_list_8x8[i][4 * j + 3]);
> +			*dst++ = tmp;
> +		}

After this change, the second 8x8 scaling list has moved to a different
offset. Is this where the hardware has always been looking for it, or is
there a change missing in another place?

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399
  2019-09-02 11:46     ` Hans Verkuil
@ 2019-09-02 15:25       ` Jonas Karlman
  0 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-02 15:25 UTC (permalink / raw)
  To: Hans Verkuil, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Boris Brezillon, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2019-09-02 13:46, Hans Verkuil wrote:
> On 9/1/19 2:45 PM, Jonas Karlman wrote:
>> Rockchip RK3399 SoC has the same Hantro G1 IP block
>> as RK3288, but the registers are entirely different.
>>
>> In a similar fashion as MPEG-2 and VP8 decoding,
>> it's simpler to just add a separate implementation.
>>
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>> ---
>>  drivers/staging/media/hantro/Makefile         |   1 +
>>  .../staging/media/hantro/hantro_g1_h264_dec.c |   1 -
>>  drivers/staging/media/hantro/hantro_hw.h      |   1 +
>>  .../media/hantro/rk3399_vpu_hw_h264_dec.c     | 486 ++++++++++++++++++
>>  4 files changed, 488 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/staging/media/hantro/rk3399_vpu_hw_h264_dec.c
>>
>> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
>> index 5d6b0383d280..8d33b0e8aa6c 100644
>> --- a/drivers/staging/media/hantro/Makefile
>> +++ b/drivers/staging/media/hantro/Makefile
>> @@ -8,6 +8,7 @@ hantro-vpu-y += \
>>  		hantro_g1_mpeg2_dec.o \
>>  		hantro_g1_vp8_dec.o \
>>  		rk3399_vpu_hw_jpeg_enc.o \
>> +		rk3399_vpu_hw_h264_dec.o \
>>  		rk3399_vpu_hw_mpeg2_dec.o \
>>  		rk3399_vpu_hw_vp8_dec.o \
>>  		hantro_jpeg.o \
>> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> index 4b82b9fd5252..ec2736fb473d 100644
>> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> @@ -202,7 +202,6 @@
>>  #define G1_REG_REFBU_E(v)		((v) ? BIT(31) : 0)
>>  
>>  #define G1_REG_APF_THRESHOLD(v)		(((v) << 0) & GENMASK(13, 0))
>> ->>>>>>> b22734fb5e2c... Ymedia: hantro: Refactor G1 H264 code
> ^^^^^^^^^^^^^^^
>
> Seems to be a left-over from patch 9?

Yes, thanks for noticing, looks like there was a small rebase issue, the line got added
in one patch and removed in the next. Will fix in next spin.

Regards,
Jonas

>
> Regards,
>
> 	Hans
>
>>  
>>  void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>>  {


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-02 14:00     ` Philipp Zabel
@ 2019-09-02 16:18       ` Jonas Karlman
  2019-09-03  7:54         ` Jonas Karlman
                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-02 16:18 UTC (permalink / raw)
  To: Philipp Zabel, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2019-09-02 16:00, Philipp Zabel wrote:
> Hi Jonas,
>
> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>> Scaling list supplied from userspace using ffmpeg and libva-v4l2-request
>> is already in matrix order and can be used without applying the inverse
>> scanning process.
> "in matrix order" is equivalent to "in raster scan order"?

The values supplied by ffmpeg and libva-v4l2-request is in the order after the
inverse scanning process has been applied (scaling list has been transformed
into a scaling matrix). Not sure what this is called, "matrix order" seemed
close enough.

Since there is two scan orders, zig-zag and field, and cedrus already expecting
the values in "matrix" order, it seems more logical to let userspace handle the
inverse scanning process.

>
> Could you add this requirement to the
> V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX documentation?

Sure, I will update documentation in v2.

>
>> The HW also only support 8x8 scaling list for the Y component, indices 0
>> and 3 in the scaling list supplied from userspace.
>>
>> Remove reordering and write the scaling matrix in an order expected by
>> the VPU, also only allocate memory for the two 8x8 lists used.
>>
>> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>> ---
>>  drivers/staging/media/hantro/hantro_h264.c | 64 +++++++---------------
>>  1 file changed, 20 insertions(+), 44 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>> index 0d758e0c0f99..e2d01145ac4f 100644
>> --- a/drivers/staging/media/hantro/hantro_h264.c
>> +++ b/drivers/staging/media/hantro/hantro_h264.c
>> @@ -20,7 +20,7 @@
>>  /* Size with u32 units. */
>>  #define CABAC_INIT_BUFFER_SIZE		(460 * 2)
>>  #define POC_BUFFER_SIZE			34
>> -#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
>> +#define SCALING_LIST_SIZE		(6 * 16 + 2 * 64)
> This changes the size of struct hantro_h264_dec_priv_tbl. Did this
> describe the auxiliary buffer format incorrectly before?

Based on RKMPP and Hantro SDK the HW expects the 8x8 inter/intra list for
Y-component to be located at indices 0 and 1, lists for Cr/Cb is only used for
4:4:4 and HW only supports 4:0:0/4:2:0 if I am not mistaken. So the unused
extra 4 lists at the end of the auxiliary buffer seemed like a waste,
also RKMPP and Hantro SDK only seemed to allocate space for 2 lists.

>
>>  #define POC_CMP(p0, p1) ((p0) < (p1) ? -1 : 1)
>>  
>> @@ -194,57 +194,33 @@ static const u32 h264_cabac_table[] = {
>>  	0x1f0c2517, 0x1f261440
>>  };
>>  
>> -/*
>> - * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
>> - * to get the values in matrix order. In addition, the hardware requires bytes
>> - * swapped within each subsequent 4 bytes. Both arrays below include both
>> - * transformations.
>> - */
>> -static const u32 zig_zag_4x4[] = {
>> -	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
>> -};
>> -
>> -static const u32 zig_zag_8x8[] = {
>> -	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
>> -	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
>> -	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
>> -	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
>> -};
>> -
>>  static void
>>  reorder_scaling_list(struct hantro_ctx *ctx)
>>  {
>>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>>  	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
>> -	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
>> -	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
>> -	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
>> -	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
>>  	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
>> -	u8 *dst = tbl->scaling_list;
>> -	const u8 *src;
>> -	int i, j;
>> -
>> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
>> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
>> -	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
>> -		     num_list_4x4 * list_len_4x4 +
>> -		     num_list_8x8 * list_len_8x8);
>> -
>> -	src = &scaling->scaling_list_4x4[0][0];
>> -	for (i = 0; i < num_list_4x4; ++i) {
>> -		for (j = 0; j < list_len_4x4; ++j)
>> -			dst[zig_zag_4x4[j]] = src[j];
>> -		src += list_len_4x4;
>> -		dst += list_len_4x4;
>> +	u32 *dst = (u32 *)tbl->scaling_list;
>> +	u32 i, j, tmp;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_4x4); i++) {
>> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_4x4[0]) / 4; j++) {
>> +			tmp = (scaling->scaling_list_4x4[i][4 * j + 0] << 24) |
>> +			      (scaling->scaling_list_4x4[i][4 * j + 1] << 16) |
>> +			      (scaling->scaling_list_4x4[i][4 * j + 2] << 8) |
>> +			      (scaling->scaling_list_4x4[i][4 * j + 3]);
>> +			*dst++ = tmp;
>> +		}
> This looks like it could use swab32().

Thanks for the tip, will look into and change in v2.

>
>>  	}
>>  
>> -	src = &scaling->scaling_list_8x8[0][0];
>> -	for (i = 0; i < num_list_8x8; ++i) {
>> -		for (j = 0; j < list_len_8x8; ++j)
>> -			dst[zig_zag_8x8[j]] = src[j];
>> -		src += list_len_8x8;
>> -		dst += list_len_8x8;
>> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_8x8); i += 3) {
>> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_8x8[0]) / 4; j++) {
>> +			tmp = (scaling->scaling_list_8x8[i][4 * j + 0] << 24) |
>> +			      (scaling->scaling_list_8x8[i][4 * j + 1] << 16) |
>> +			      (scaling->scaling_list_8x8[i][4 * j + 2] << 8) |
>> +			      (scaling->scaling_list_8x8[i][4 * j + 3]);
>> +			*dst++ = tmp;
>> +		}
> After this change, the second 8x8 scaling list has moved to a different
> offset. Is this where the hardware has always been looking for it, or is
> there a change missing in another place?

As mentioned above HW only looks at indices 0 and 1, and ffmpeg will store the
inter/intra Y list at indices 0 and 3 as seen at [1], in similar way cedrus only
use indices 0 and 3 at [2].
FFmpeg memcpy entire scaling_matrix8 to scaling_list_8x8 for v4l2-request-api
and memcpy scaling_matrix8[0] and scaling_matrix8[3] for vaapi.

You can see the effect of this patch using the h264_tivo_sample.ts sample from
cover letter, patch 3-8 must be applied. With this patch applied the green
football field will stay green, without the patch the field will shift in colors.

[1] https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/h264_ps.c#L299-L308
[2] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/sunxi/cedrus/cedrus_h264.c#n231

Regards,
Jonas

>
> regards
> Philipp


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH RFC 00/12] media: hantro: H264 fixes and improvements
  2019-09-02 13:02 ` [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Ezequiel Garcia
@ 2019-09-02 16:28   ` Jonas Karlman
  0 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-02 16:28 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

On 2019-09-02 15:02, Ezequiel Garcia wrote:
> Hi Jonas,
>
> Thanks for the series, I'll be reviewing this shortly.
>
> On Sun, 2019-09-01 at 12:42 +0000, Jonas Karlman wrote:
>> This series contains fixes and improvements for the hantro H264 decoder.
>>
>> Patch 1-6 fixes issues and limitations observed when preparing support
>> for field encoded content.
>>
>> Patch 7 introduce new DPB entry flags that is used to signal how a reference
>> frame is referenced. This information is needed to correctly build a
>> reference list for field encoded content.
>>
>> Patch 8 adds bits to handle field encoded content, this is a rough patch
>> and should be reworked with proper code style and formatting.
>> Please get back with feedback on how to improve this.
>>
>> The following samples from [1] are now playable with patch 1-8
>> - H264_1080i-25-interlace_Kaesescheibchen.mkv
>> - H264_10_1080i_50_AC3-Astra19.2_ProSieben_HD.ts
>> - big_buck_bunny_1080p_H264_AAC_25fps_7200K.mp4
>> - h264_tivo_sample.ts
>>
>> The rest of the patches refactors G1 H264 code to more closely match
>> the code generated by my rockchip-vpu-regtool at [2] and then adds
>> support for H264 decoding on RK3399/RK3328 using the VPU2 block.
>> This code is early work and needs proper code style and formatting,
>> I just wanted to share the early work and get some initial feedback.
>>
>> This series has been tested using ffmpeg v4l2 request hwaccel at [3] [4]
>>
> What boards have you tested this on?

Main testing has been done on a Tinker Board S (RK3288) and a Rock64 (RK3328) device.
Very limited testing on a Rock Pi 4 (RK3399) using earlier version of the patchset,
I will redo some RK3399 testing to make sure it is not only VPU2 on RK3328 that works.

Regards,
Jonas

>
> Thanks,
> Ezequiel
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-02 16:18       ` Jonas Karlman
@ 2019-09-03  7:54         ` Jonas Karlman
  2019-09-03 12:53           ` Philipp Zabel
  2019-09-03  9:56         ` Philipp Zabel
  2019-09-10 10:14         ` Ezequiel Garcia
  2 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-03  7:54 UTC (permalink / raw)
  To: Philipp Zabel, Ezequiel Garcia
  Cc: Paul Kocialkowski, linux-kernel, Hans Verkuil, linux-rockchip,
	Boris Brezillon, Mauro Carvalho Chehab, linux-media

On 2019-09-02 18:18, Jonas Karlman wrote:
> On 2019-09-02 16:00, Philipp Zabel wrote:
>> Hi Jonas,
>>
>> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>>> Scaling list supplied from userspace using ffmpeg and libva-v4l2-request
>>> is already in matrix order and can be used without applying the inverse
>>> scanning process.
>> "in matrix order" is equivalent to "in raster scan order"?
> The values supplied by ffmpeg and libva-v4l2-request is in the order after the
> inverse scanning process has been applied (scaling list has been transformed
> into a scaling matrix). Not sure what this is called, "matrix order" seemed
> close enough.
>
> Since there is two scan orders, zig-zag and field, and cedrus already expecting
> the values in "matrix" order, it seems more logical to let userspace handle the
> inverse scanning process.

After a closer look both ffmpeg and rkmpp only apply zig-zag scan and not field scan,
ffmpeg will memcpy the scaling_matrix4/8 as is for vaapi, vdpau and nvdec,
for dxva2 there is a workaround flag that controls if zig-zag should be applied or not.

I suggest a clarification of the expect order of values and use of the same value order as vaapi, vdpau and nvdec.
i.e. have the scaling list values in "matrix order"/"raster order", after zig-zag scan has been applied,
as is currently expected by cedrus and hantro after this patch.

I would also suggest a change to the expected order of the 8x8 scaling lists to follow the H264 standard,
instead of the ffmpeg order like this patch and cedrus driver currently expects.

Expected scaling list order would then be,
for 4x4: Intra Y, Intra Cb, Intra Cr, Inter Y, Inter Cb, Inter Cr,
for 8x8: Intra Y, Inter Y, Intra Cb, Inter Cb, Intra Cr, Inter Cr.

Regards,
Jonas

>
>> Could you add this requirement to the
>> V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX documentation?
> Sure, I will update documentation in v2.
>
>>> The HW also only support 8x8 scaling list for the Y component, indices 0
>>> and 3 in the scaling list supplied from userspace.
>>>
>>> Remove reordering and write the scaling matrix in an order expected by
>>> the VPU, also only allocate memory for the two 8x8 lists used.
>>>
>>> Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding")
>>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>>> ---
>>>  drivers/staging/media/hantro/hantro_h264.c | 64 +++++++---------------
>>>  1 file changed, 20 insertions(+), 44 deletions(-)
>>>
>>> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>>> index 0d758e0c0f99..e2d01145ac4f 100644
>>> --- a/drivers/staging/media/hantro/hantro_h264.c
>>> +++ b/drivers/staging/media/hantro/hantro_h264.c
>>> @@ -20,7 +20,7 @@
>>>  /* Size with u32 units. */
>>>  #define CABAC_INIT_BUFFER_SIZE		(460 * 2)
>>>  #define POC_BUFFER_SIZE			34
>>> -#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
>>> +#define SCALING_LIST_SIZE		(6 * 16 + 2 * 64)
>> This changes the size of struct hantro_h264_dec_priv_tbl. Did this
>> describe the auxiliary buffer format incorrectly before?
> Based on RKMPP and Hantro SDK the HW expects the 8x8 inter/intra list for
> Y-component to be located at indices 0 and 1, lists for Cr/Cb is only used for
> 4:4:4 and HW only supports 4:0:0/4:2:0 if I am not mistaken. So the unused
> extra 4 lists at the end of the auxiliary buffer seemed like a waste,
> also RKMPP and Hantro SDK only seemed to allocate space for 2 lists.
>
>>>  #define POC_CMP(p0, p1) ((p0) < (p1) ? -1 : 1)
>>>  
>>> @@ -194,57 +194,33 @@ static const u32 h264_cabac_table[] = {
>>>  	0x1f0c2517, 0x1f261440
>>>  };
>>>  
>>> -/*
>>> - * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
>>> - * to get the values in matrix order. In addition, the hardware requires bytes
>>> - * swapped within each subsequent 4 bytes. Both arrays below include both
>>> - * transformations.
>>> - */
>>> -static const u32 zig_zag_4x4[] = {
>>> -	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
>>> -};
>>> -
>>> -static const u32 zig_zag_8x8[] = {
>>> -	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
>>> -	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
>>> -	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
>>> -	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
>>> -};
>>> -
>>>  static void
>>>  reorder_scaling_list(struct hantro_ctx *ctx)
>>>  {
>>>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>>>  	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
>>> -	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
>>> -	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
>>> -	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
>>> -	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
>>>  	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
>>> -	u8 *dst = tbl->scaling_list;
>>> -	const u8 *src;
>>> -	int i, j;
>>> -
>>> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
>>> -	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
>>> -	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
>>> -		     num_list_4x4 * list_len_4x4 +
>>> -		     num_list_8x8 * list_len_8x8);
>>> -
>>> -	src = &scaling->scaling_list_4x4[0][0];
>>> -	for (i = 0; i < num_list_4x4; ++i) {
>>> -		for (j = 0; j < list_len_4x4; ++j)
>>> -			dst[zig_zag_4x4[j]] = src[j];
>>> -		src += list_len_4x4;
>>> -		dst += list_len_4x4;
>>> +	u32 *dst = (u32 *)tbl->scaling_list;
>>> +	u32 i, j, tmp;
>>> +
>>> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_4x4); i++) {
>>> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_4x4[0]) / 4; j++) {
>>> +			tmp = (scaling->scaling_list_4x4[i][4 * j + 0] << 24) |
>>> +			      (scaling->scaling_list_4x4[i][4 * j + 1] << 16) |
>>> +			      (scaling->scaling_list_4x4[i][4 * j + 2] << 8) |
>>> +			      (scaling->scaling_list_4x4[i][4 * j + 3]);
>>> +			*dst++ = tmp;
>>> +		}
>> This looks like it could use swab32().
> Thanks for the tip, will look into and change in v2.
>
>>>  	}
>>>  
>>> -	src = &scaling->scaling_list_8x8[0][0];
>>> -	for (i = 0; i < num_list_8x8; ++i) {
>>> -		for (j = 0; j < list_len_8x8; ++j)
>>> -			dst[zig_zag_8x8[j]] = src[j];
>>> -		src += list_len_8x8;
>>> -		dst += list_len_8x8;
>>> +	for (i = 0; i < ARRAY_SIZE(scaling->scaling_list_8x8); i += 3) {
>>> +		for (j = 0; j < ARRAY_SIZE(scaling->scaling_list_8x8[0]) / 4; j++) {
>>> +			tmp = (scaling->scaling_list_8x8[i][4 * j + 0] << 24) |
>>> +			      (scaling->scaling_list_8x8[i][4 * j + 1] << 16) |
>>> +			      (scaling->scaling_list_8x8[i][4 * j + 2] << 8) |
>>> +			      (scaling->scaling_list_8x8[i][4 * j + 3]);
>>> +			*dst++ = tmp;
>>> +		}
>> After this change, the second 8x8 scaling list has moved to a different
>> offset. Is this where the hardware has always been looking for it, or is
>> there a change missing in another place?
> As mentioned above HW only looks at indices 0 and 1, and ffmpeg will store the
> inter/intra Y list at indices 0 and 3 as seen at [1], in similar way cedrus only
> use indices 0 and 3 at [2].
> FFmpeg memcpy entire scaling_matrix8 to scaling_list_8x8 for v4l2-request-api
> and memcpy scaling_matrix8[0] and scaling_matrix8[3] for vaapi.
>
> You can see the effect of this patch using the h264_tivo_sample.ts sample from
> cover letter, patch 3-8 must be applied. With this patch applied the green
> football field will stay green, without the patch the field will shift in colors.
>
> [1] https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/h264_ps.c#L299-L308
> [2] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/sunxi/cedrus/cedrus_h264.c#n231
>
> Regards,
> Jonas
>
>> regards
>> Philipp


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-02 16:18       ` Jonas Karlman
  2019-09-03  7:54         ` Jonas Karlman
@ 2019-09-03  9:56         ` Philipp Zabel
  2019-09-10 10:14         ` Ezequiel Garcia
  2 siblings, 0 replies; 45+ messages in thread
From: Philipp Zabel @ 2019-09-03  9:56 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On Mon, 2019-09-02 at 16:18 +0000, Jonas Karlman wrote:
> On 2019-09-02 16:00, Philipp Zabel wrote:
> > Hi Jonas,
> > 
> > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> > > Scaling list supplied from userspace using ffmpeg and libva-v4l2-request
> > > is already in matrix order and can be used without applying the inverse
> > > scanning process.
> > 
> > "in matrix order" is equivalent to "in raster scan order"?
> 
> The values supplied by ffmpeg and libva-v4l2-request is in the order after the
> inverse scanning process has been applied (scaling list has been transformed
> into a scaling matrix). Not sure what this is called, "matrix order" seemed
> close enough.

Ok, after reading chapters

8.5.6 Inverse scanning process for 4x4 transform coefficients and scaling lists
8.5.7 Inverse scanning process for 8x8 transform coefficients and scaling lists

of ITU-T Rec. H.264, this seems clear enough. I just asked to make sure,
because libva documentation uses the term "raster scan" [1].

[1] http://intel.github.io/libva/structVAIQMatrixBufferH264.html

> Since there is two scan orders, zig-zag and field, and cedrus already expecting
> the values in "matrix" order, it seems more logical to let userspace handle the
> inverse scanning process.

I agree.

[...]
> > This changes the size of struct hantro_h264_dec_priv_tbl. Did this
> > describe the auxiliary buffer format incorrectly before?
> 
> Based on RKMPP and Hantro SDK the HW expects the 8x8 inter/intra list for
> Y-component to be located at indices 0 and 1, lists for Cr/Cb is only used for
> 4:4:4 and HW only supports 4:0:0/4:2:0 if I am not mistaken. So the unused
> extra 4 lists at the end of the auxiliary buffer seemed like a waste,
> also RKMPP and Hantro SDK only seemed to allocate space for 2 lists.

Ok.

> > After this change, the second 8x8 scaling list has moved to a different
> > offset. Is this where the hardware has always been looking for it, or is
> > there a change missing in another place?
> 
> As mentioned above HW only looks at indices 0 and 1, and ffmpeg will store the
> inter/intra Y list at indices 0 and 3 as seen at [1], in similar way cedrus only
> use indices 0 and 3 at [2].
> FFmpeg memcpy entire scaling_matrix8 to scaling_list_8x8 for v4l2-request-api
> and memcpy scaling_matrix8[0] and scaling_matrix8[3] for vaapi.
> 
> You can see the effect of this patch using the h264_tivo_sample.ts sample from
> cover letter, patch 3-8 must be applied. With this patch applied the green
> football field will stay green, without the patch the field will shift in colors.
> 
> [1] https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/h264_ps.c#L299-L308
> [2] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/sunxi/cedrus/cedrus_h264.c#n231

Thank you, I'll try this.

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset
  2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
@ 2019-09-03 10:58     ` Philipp Zabel
  2019-09-03 20:13       ` Jonas Karlman
  2019-09-10 10:18     ` Ezequiel Garcia
  2019-09-10 11:34     ` Ezequiel Garcia
  2 siblings, 1 reply; 45+ messages in thread
From: Philipp Zabel @ 2019-09-03 10:58 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

Hi Jonas,

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
> and is laid out in memory as follow:

Do you mean "A decoded 8-bit 4:2:0 frame needs up to 448 bytes per
macroblock"?

A 1280x720 frame already consists of 3600 macroblocks (each 16x16 Y +
2x8x8 Cb,Cr).

> +-------------------+
> > Y-plane   256 MBs |

So that looks like it should be 256 bytes * number of macroblocks
instead, same for the following two.

> +-------------------+
> > UV-plane  128 MBs |
> +-------------------+
> > MV buffer  64 MBs |
> 
> +-------------------+
>
> The motion vector buffer offset is currently correct for 4:2:0 because
> the extra space for motion vectors is overallocated with an extra 64 MBs.
> 
> Wrong offset for both destination and motion vector buffer are used
> for the bottom field of field encoded content, wrong offset is
> also used for 4:0:0 (monochrome) content.
> 
> Fix this by always setting the motion vector address to the expected
> 384 MBs offset for 4:2:0 and 256 MBs offset for 4:0:0 content.

Expected by whom? For example, could these be placed in separate buffers
instead of appended to the VB2 allocated buffers?

> Also use correct destination and motion vector buffer offset
> for the bottom field of field encoded content.
> 
> While at it also extend the check for 4:0:0 (monochrome) to include an
> additional check for High Profile (100).
> 
> Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  .../staging/media/hantro/hantro_g1_h264_dec.c | 33 +++++++++++--------
>  1 file changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 7ab534936843..159bd67e0a36 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -19,6 +19,9 @@
>  #include "hantro_hw.h"
>  #include "hantro_v4l2.h"
>  
> +#define MV_OFFSET_420	384
> +#define MV_OFFSET_400	256
> +
>  static void set_params(struct hantro_ctx *ctx)
>  {
>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
> @@ -49,8 +52,8 @@ static void set_params(struct hantro_ctx *ctx)
>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
>  
>  	/* Decoder control register 1. */
> -	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
> -	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
> +	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
> +	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
>  	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
>  
> @@ -79,7 +82,7 @@ static void set_params(struct hantro_ctx *ctx)
>  		reg |= G1_REG_DEC_CTRL4_CABAC_E;
>  	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
>  		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
> -	if (sps->chroma_format_idc == 0)
> +	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
>  		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
>  	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
>  		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
> @@ -233,6 +236,7 @@ static void set_buffers(struct hantro_ctx *ctx)
>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  	struct hantro_dev *vpu = ctx->dev;
>  	dma_addr_t src_dma, dst_dma;
> +	unsigned int offset = MV_OFFSET_420;
>  
>  	src_buf = hantro_get_src_buf(ctx);
>  	dst_buf = hantro_get_dst_buf(ctx);
> @@ -243,19 +247,20 @@ static void set_buffers(struct hantro_ctx *ctx)
>  
>  	/* Destination (decoded frame) buffer. */
>  	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> +	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> +		dst_dma += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);

How does this work? Does userspace decode two fields into the same
capture buffer and the hardware writes each field with a stride of 2
lines? I suppose this corresponds to V4L2_FIELD_INTERLACED. Could this
also be made to support V4L2_FIELD_SEQ_TB output?

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-03  7:54         ` Jonas Karlman
@ 2019-09-03 12:53           ` Philipp Zabel
  0 siblings, 0 replies; 45+ messages in thread
From: Philipp Zabel @ 2019-09-03 12:53 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Paul Kocialkowski, linux-kernel, Hans Verkuil, linux-rockchip,
	Boris Brezillon, Mauro Carvalho Chehab, linux-media

On Tue, 2019-09-03 at 07:54 +0000, Jonas Karlman wrote:
[...]
> After a closer look both ffmpeg and rkmpp only apply zig-zag scan and not field scan,
> ffmpeg will memcpy the scaling_matrix4/8 as is for vaapi, vdpau and nvdec,
> for dxva2 there is a workaround flag that controls if zig-zag should be applied or not.
> 
> I suggest a clarification of the expect order of values and use of the same value order as vaapi, vdpau and nvdec.
> i.e. have the scaling list values in "matrix order"/"raster order", after zig-zag scan has been applied,
> as is currently expected by cedrus and hantro after this patch.
> 
> I would also suggest a change to the expected order of the 8x8 scaling lists to follow the H264 standard,
> instead of the ffmpeg order like this patch and cedrus driver currently expects.
> 
> Expected scaling list order would then be,
> for 4x4: Intra Y, Intra Cb, Intra Cr, Inter Y, Inter Cb, Inter Cr,
> for 8x8: Intra Y, Inter Y, Intra Cb, Inter Cb, Intra Cr, Inter Cr.

I'm in favor of both, it seems unnecessary to reorder the lists in
userspace only to have the kernel reorder them back before passing them
to the hardware.

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content
  2019-09-01 12:45   ` [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content Jonas Karlman
@ 2019-09-03 13:21     ` Philipp Zabel
  2019-09-03 14:02       ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Philipp Zabel @ 2019-09-03 13:21 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> This need code cleanup and formatting
> 
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>

The previous patches all work, but this patch breaks decoding of
progressive content for me (i.MX8MQ with FFmpeg based on Ezequiel's
branch).

regards
Philipp

> ---
>  .../staging/media/hantro/hantro_g1_h264_dec.c |  26 ++--
>  drivers/staging/media/hantro/hantro_h264.c    | 126 ++++++++++++------
>  drivers/staging/media/hantro/hantro_hw.h      |   4 +
>  3 files changed, 100 insertions(+), 56 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 16f21d258f6a..bc628ef73b29 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -130,28 +130,20 @@ static void set_params(struct hantro_ctx *ctx)
>  
>  static void set_ref(struct hantro_ctx *ctx)
>  {
> +	const struct v4l2_ctrl_h264_decode_params *dec_param;
> +	const struct v4l2_ctrl_h264_slice_params *slice;
>  	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
>  	const u8 *b0_reflist, *b1_reflist, *p_reflist;
>  	struct hantro_dev *vpu = ctx->dev;
> -	u32 dpb_longterm = 0;
> -	u32 dpb_valid = 0;
>  	int reg_num;
>  	u32 reg;
>  	int i;
>  
> -	/*
> -	 * Set up bit maps of valid and long term DPBs.
> -	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
> -	 */
> -	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
> -		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> -			dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> +	dec_param = ctx->h264_dec.ctrls.decode;
> +	slice = ctx->h264_dec.ctrls.slices;
>  
> -		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> -			dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> -	}
> -	vdpu_write_relaxed(vpu, dpb_valid << 16, G1_REG_VALID_REF);
> -	vdpu_write_relaxed(vpu, dpb_longterm << 16, G1_REG_LT_REF);
> +	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_valid, G1_REG_VALID_REF);
> +	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_longterm, G1_REG_LT_REF);
>  
>  	/*
>  	 * Set up reference frame picture numbers.
> @@ -223,10 +215,8 @@ static void set_ref(struct hantro_ctx *ctx)
>  
>  	/* Set up addresses of DPB buffers. */
>  	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
> -		struct vb2_buffer *buf =  hantro_h264_get_ref_buf(ctx, i);
> -
> -		vdpu_write_relaxed(vpu, vb2_dma_contig_plane_dma_addr(buf, 0),
> -				   G1_REG_ADDR_REF(i));
> +		dma_addr_t addr = hantro_h264_get_ref_dma_addr(ctx, i);
> +		vdpu_write_relaxed(vpu, addr, G1_REG_ADDR_REF(i));
>  	}
>  }
>  
> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
> index a77cc28e180a..85c86d728b1a 100644
> --- a/drivers/staging/media/hantro/hantro_h264.c
> +++ b/drivers/staging/media/hantro/hantro_h264.c
> @@ -228,17 +228,65 @@ static void prepare_table(struct hantro_ctx *ctx)
>  {
>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>  	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
> +	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
>  	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
>  	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> +	u32 dpb_longterm = 0;
> +	u32 dpb_valid = 0;
>  	int i;
>  
> +	/*
> +	 * Set up bit maps of valid and long term DPBs.
> +	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
> +	 */
> +	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || (slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
> +		for (i = 0; i < HANTRO_H264_DPB_SIZE * 2; ++i) {
> +			// check for correct reference use
> +			u32 flag = (i & 0x1) ? V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM : V4L2_H264_DPB_ENTRY_FLAG_REF_TOP;
> +			if (dpb[i / 2].flags & flag)
> +				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
> +
> +			if (dpb[i / 2].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> +				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
> +		}
> +
> +		ctx->h264_dec.dpb_valid = dpb_valid;
> +		ctx->h264_dec.dpb_longterm = dpb_longterm;
> +	} else {
> +		for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
> +			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> +				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> +
> +			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> +				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
> +		}
> +
> +		ctx->h264_dec.dpb_valid = dpb_valid << 16;
> +		ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
> +	}
> +
>  	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
> -		tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
> -		tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
> +		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
> +			tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
> +			tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
> +		} else {
> +			tbl->poc[i * 2] = 0;
> +			tbl->poc[i * 2 + 1] = 0;
> +		}
>  	}
>  
> -	tbl->poc[32] = dec_param->top_field_order_cnt;
> -	tbl->poc[33] = dec_param->bottom_field_order_cnt;
> +	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || !(slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
> +		if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
> +			tbl->poc[32] = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
> +					dec_param->bottom_field_order_cnt :
> +					dec_param->top_field_order_cnt;
> +		else
> +			tbl->poc[32] = min(dec_param->top_field_order_cnt, dec_param->bottom_field_order_cnt);
> +		tbl->poc[33] = 0;
> +	} else {
> +		tbl->poc[32] = dec_param->top_field_order_cnt;
> +		tbl->poc[33] = dec_param->bottom_field_order_cnt;
> +	}
>  
>  	reorder_scaling_list(ctx);
>  }
> @@ -251,51 +299,36 @@ struct hantro_h264_reflist_builder {
>  	u8 num_valid;
>  };
>  
> -static s32 get_poc(enum v4l2_field field, s32 top_field_order_cnt,
> -		   s32 bottom_field_order_cnt)
> -{
> -	switch (field) {
> -	case V4L2_FIELD_TOP:
> -		return top_field_order_cnt;
> -	case V4L2_FIELD_BOTTOM:
> -		return bottom_field_order_cnt;
> -	default:
> -		break;
> -	}
> -
> -	return min(top_field_order_cnt, bottom_field_order_cnt);
> -}
> -
>  static void
>  init_reflist_builder(struct hantro_ctx *ctx,
>  		     struct hantro_h264_reflist_builder *b)
>  {
>  	const struct v4l2_ctrl_h264_decode_params *dec_param;
> -	struct vb2_v4l2_buffer *buf = hantro_get_dst_buf(ctx);
> +	const struct v4l2_ctrl_h264_slice_params *slices;
>  	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> -	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
>  	unsigned int i;
>  
>  	dec_param = ctx->h264_dec.ctrls.decode;
> +	slices = ctx->h264_dec.ctrls.slices;
>  
>  	memset(b, 0, sizeof(*b));
>  	b->dpb = dpb;
> -	b->curpoc = get_poc(buf->field, dec_param->top_field_order_cnt,
> -			    dec_param->bottom_field_order_cnt);
> +	b->curpoc = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
> +		    dec_param->bottom_field_order_cnt :
> +		    dec_param->top_field_order_cnt;
>  
>  	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
> -		int buf_idx;
> -
> -		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> +		u32 ref_flag = dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME;
> +		if (!ref_flag)
>  			continue;
>  
> -		buf_idx = vb2_find_timestamp(cap_q, dpb[i].reference_ts, 0);
> -		if (buf_idx < 0)
> -			continue;
> +		if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME)
> +			b->pocs[i] = min(dpb[i].bottom_field_order_cnt, dpb[i].top_field_order_cnt);
> +		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM)
> +			b->pocs[i] = dpb[i].bottom_field_order_cnt;
> +		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_TOP)
> +			b->pocs[i] = dpb[i].top_field_order_cnt;
>  
> -		buf = to_vb2_v4l2_buffer(vb2_get_buffer(cap_q, buf_idx));
> -		b->pocs[i] = get_poc(buf->field, dpb[i].top_field_order_cnt,
> -				     dpb[i].bottom_field_order_cnt);
>  		b->unordered_reflist[b->num_valid] = i;
>  		b->num_valid++;
>  	}
> @@ -448,8 +481,7 @@ build_b_ref_lists(const struct hantro_h264_reflist_builder *builder,
>  static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
>  			    const struct v4l2_h264_dpb_entry *b)
>  {
> -	return a->top_field_order_cnt == b->top_field_order_cnt &&
> -	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> +	return a->reference_ts == b->reference_ts;
>  }
>  
>  static void update_dpb(struct hantro_ctx *ctx)
> @@ -463,13 +495,13 @@ static void update_dpb(struct hantro_ctx *ctx)
>  
>  	/* Disable all entries by default. */
>  	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
> -		ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> +		ctx->h264_dec.dpb[i].flags = 0;
>  
>  	/* Try to match new DPB entries with existing ones by their POCs. */
>  	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
>  		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
>  
> -		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> +		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
>  			continue;
>  
>  		/*
> @@ -480,8 +512,7 @@ static void update_dpb(struct hantro_ctx *ctx)
>  			struct v4l2_h264_dpb_entry *cdpb;
>  
>  			cdpb = &ctx->h264_dec.dpb[j];
> -			if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
> -			    !dpb_entry_match(cdpb, ndpb))
> +			if (!dpb_entry_match(cdpb, ndpb))
>  				continue;
>  
>  			*cdpb = *ndpb;
> @@ -541,6 +572,25 @@ struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
>  	return buf;
>  }
>  
> +dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
> +					unsigned int dpb_idx)
> +{
> +	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
> +	const struct v4l2_ctrl_h264_decode_params *dec_param = ctx->h264_dec.ctrls.decode;
> +	const struct v4l2_ctrl_h264_slice_params *slices = ctx->h264_dec.ctrls.slices;
> +
> +	struct vb2_buffer *buf = hantro_h264_get_ref_buf(ctx, dpb_idx);
> +	s32 cur_poc = slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD ?
> +		      dec_param->bottom_field_order_cnt :
> +		      dec_param->top_field_order_cnt;
> +	u32 flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE ? 0x2 : 0;
> +	flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
> +		 abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
> +		 0x1 : 0;
> +
> +	return vb2_dma_contig_plane_dma_addr(buf, 0) | flags;
> +}
> +
>  int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
>  {
>  	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> index 8adad8ac9b1d..d58f2a36ca40 100644
> --- a/drivers/staging/media/hantro/hantro_hw.h
> +++ b/drivers/staging/media/hantro/hantro_hw.h
> @@ -86,6 +86,8 @@ struct hantro_h264_dec_hw_ctx {
>  	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
>  	struct hantro_h264_dec_reflists reflists;
>  	struct hantro_h264_dec_ctrls ctrls;
> +	u32 dpb_longterm;
> +	u32 dpb_valid;
>  };
>  
>  /**
> @@ -157,6 +159,8 @@ void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
>  
>  struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
>  					   unsigned int dpb_idx);
> +dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
> +					unsigned int dpb_idx);
>  int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
>  void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
>  int hantro_h264_dec_init(struct hantro_ctx *ctx);

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content
  2019-09-03 13:21     ` Philipp Zabel
@ 2019-09-03 14:02       ` Jonas Karlman
  2019-09-03 15:01         ` Philipp Zabel
  0 siblings, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2019-09-03 14:02 UTC (permalink / raw)
  To: Philipp Zabel, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2019-09-03 15:21, Philipp Zabel wrote:
> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>> This need code cleanup and formatting
>>
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> The previous patches all work, but this patch breaks decoding of
> progressive content for me (i.MX8MQ with FFmpeg based on Ezequiel's
> branch).

Please try with ffmpeg based on my v4l2-request-hwaccel-4.0.4-hantro branch at [1],
up to and including the commit "HACK: add dpb flags for reference usage and field picture".
That commit adds code to set reference flags needed by the changes in this patch.

There is probably also some other minor difference between our two ffmpeg branches.
I have not observed any issues with progressive content with this patch and my ffmpeg branch (on a RK3288 device).
Some H264 reference samples do have visual issues after this patch, however all my real world samples does seem to work.

My branch use libudev to probe media/video devices and needs to be configured with:
--enable-v4l2-request --enable-libudev --enable-libdrm

[1] https://github.com/Kwiboo/FFmpeg/commits/v4l2-request-hwaccel-4.0.4-hantro

Regards,
Jonas

>
> regards
> Philipp
>
>> ---
>>  .../staging/media/hantro/hantro_g1_h264_dec.c |  26 ++--
>>  drivers/staging/media/hantro/hantro_h264.c    | 126 ++++++++++++------
>>  drivers/staging/media/hantro/hantro_hw.h      |   4 +
>>  3 files changed, 100 insertions(+), 56 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> index 16f21d258f6a..bc628ef73b29 100644
>> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> @@ -130,28 +130,20 @@ static void set_params(struct hantro_ctx *ctx)
>>  
>>  static void set_ref(struct hantro_ctx *ctx)
>>  {
>> +	const struct v4l2_ctrl_h264_decode_params *dec_param;
>> +	const struct v4l2_ctrl_h264_slice_params *slice;
>>  	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
>>  	const u8 *b0_reflist, *b1_reflist, *p_reflist;
>>  	struct hantro_dev *vpu = ctx->dev;
>> -	u32 dpb_longterm = 0;
>> -	u32 dpb_valid = 0;
>>  	int reg_num;
>>  	u32 reg;
>>  	int i;
>>  
>> -	/*
>> -	 * Set up bit maps of valid and long term DPBs.
>> -	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
>> -	 */
>> -	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
>> -		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
>> -			dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>> +	dec_param = ctx->h264_dec.ctrls.decode;
>> +	slice = ctx->h264_dec.ctrls.slices;
>>  
>> -		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>> -			dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>> -	}
>> -	vdpu_write_relaxed(vpu, dpb_valid << 16, G1_REG_VALID_REF);
>> -	vdpu_write_relaxed(vpu, dpb_longterm << 16, G1_REG_LT_REF);
>> +	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_valid, G1_REG_VALID_REF);
>> +	vdpu_write_relaxed(vpu, ctx->h264_dec.dpb_longterm, G1_REG_LT_REF);
>>  
>>  	/*
>>  	 * Set up reference frame picture numbers.
>> @@ -223,10 +215,8 @@ static void set_ref(struct hantro_ctx *ctx)
>>  
>>  	/* Set up addresses of DPB buffers. */
>>  	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
>> -		struct vb2_buffer *buf =  hantro_h264_get_ref_buf(ctx, i);
>> -
>> -		vdpu_write_relaxed(vpu, vb2_dma_contig_plane_dma_addr(buf, 0),
>> -				   G1_REG_ADDR_REF(i));
>> +		dma_addr_t addr = hantro_h264_get_ref_dma_addr(ctx, i);
>> +		vdpu_write_relaxed(vpu, addr, G1_REG_ADDR_REF(i));
>>  	}
>>  }
>>  
>> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>> index a77cc28e180a..85c86d728b1a 100644
>> --- a/drivers/staging/media/hantro/hantro_h264.c
>> +++ b/drivers/staging/media/hantro/hantro_h264.c
>> @@ -228,17 +228,65 @@ static void prepare_table(struct hantro_ctx *ctx)
>>  {
>>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>>  	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
>> +	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
>>  	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
>>  	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
>> +	u32 dpb_longterm = 0;
>> +	u32 dpb_valid = 0;
>>  	int i;
>>  
>> +	/*
>> +	 * Set up bit maps of valid and long term DPBs.
>> +	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
>> +	 */
>> +	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || (slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
>> +		for (i = 0; i < HANTRO_H264_DPB_SIZE * 2; ++i) {
>> +			// check for correct reference use
>> +			u32 flag = (i & 0x1) ? V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM : V4L2_H264_DPB_ENTRY_FLAG_REF_TOP;
>> +			if (dpb[i / 2].flags & flag)
>> +				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
>> +
>> +			if (dpb[i / 2].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>> +				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE * 2 - 1 - i);
>> +		}
>> +
>> +		ctx->h264_dec.dpb_valid = dpb_valid;
>> +		ctx->h264_dec.dpb_longterm = dpb_longterm;
>> +	} else {
>> +		for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
>> +			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
>> +				dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>> +
>> +			if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
>> +				dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
>> +		}
>> +
>> +		ctx->h264_dec.dpb_valid = dpb_valid << 16;
>> +		ctx->h264_dec.dpb_longterm = dpb_longterm << 16;
>> +	}
>> +
>>  	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
>> -		tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
>> -		tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
>> +		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) {
>> +			tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
>> +			tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
>> +		} else {
>> +			tbl->poc[i * 2] = 0;
>> +			tbl->poc[i * 2 + 1] = 0;
>> +		}
>>  	}
>>  
>> -	tbl->poc[32] = dec_param->top_field_order_cnt;
>> -	tbl->poc[33] = dec_param->bottom_field_order_cnt;
>> +	if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC) || !(slices[0].flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)) {
>> +		if ((slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
>> +			tbl->poc[32] = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
>> +					dec_param->bottom_field_order_cnt :
>> +					dec_param->top_field_order_cnt;
>> +		else
>> +			tbl->poc[32] = min(dec_param->top_field_order_cnt, dec_param->bottom_field_order_cnt);
>> +		tbl->poc[33] = 0;
>> +	} else {
>> +		tbl->poc[32] = dec_param->top_field_order_cnt;
>> +		tbl->poc[33] = dec_param->bottom_field_order_cnt;
>> +	}
>>  
>>  	reorder_scaling_list(ctx);
>>  }
>> @@ -251,51 +299,36 @@ struct hantro_h264_reflist_builder {
>>  	u8 num_valid;
>>  };
>>  
>> -static s32 get_poc(enum v4l2_field field, s32 top_field_order_cnt,
>> -		   s32 bottom_field_order_cnt)
>> -{
>> -	switch (field) {
>> -	case V4L2_FIELD_TOP:
>> -		return top_field_order_cnt;
>> -	case V4L2_FIELD_BOTTOM:
>> -		return bottom_field_order_cnt;
>> -	default:
>> -		break;
>> -	}
>> -
>> -	return min(top_field_order_cnt, bottom_field_order_cnt);
>> -}
>> -
>>  static void
>>  init_reflist_builder(struct hantro_ctx *ctx,
>>  		     struct hantro_h264_reflist_builder *b)
>>  {
>>  	const struct v4l2_ctrl_h264_decode_params *dec_param;
>> -	struct vb2_v4l2_buffer *buf = hantro_get_dst_buf(ctx);
>> +	const struct v4l2_ctrl_h264_slice_params *slices;
>>  	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
>> -	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
>>  	unsigned int i;
>>  
>>  	dec_param = ctx->h264_dec.ctrls.decode;
>> +	slices = ctx->h264_dec.ctrls.slices;
>>  
>>  	memset(b, 0, sizeof(*b));
>>  	b->dpb = dpb;
>> -	b->curpoc = get_poc(buf->field, dec_param->top_field_order_cnt,
>> -			    dec_param->bottom_field_order_cnt);
>> +	b->curpoc = (slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD) ?
>> +		    dec_param->bottom_field_order_cnt :
>> +		    dec_param->top_field_order_cnt;
>>  
>>  	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
>> -		int buf_idx;
>> -
>> -		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
>> +		u32 ref_flag = dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME;
>> +		if (!ref_flag)
>>  			continue;
>>  
>> -		buf_idx = vb2_find_timestamp(cap_q, dpb[i].reference_ts, 0);
>> -		if (buf_idx < 0)
>> -			continue;
>> +		if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME)
>> +			b->pocs[i] = min(dpb[i].bottom_field_order_cnt, dpb[i].top_field_order_cnt);
>> +		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM)
>> +			b->pocs[i] = dpb[i].bottom_field_order_cnt;
>> +		else if (ref_flag == V4L2_H264_DPB_ENTRY_FLAG_REF_TOP)
>> +			b->pocs[i] = dpb[i].top_field_order_cnt;
>>  
>> -		buf = to_vb2_v4l2_buffer(vb2_get_buffer(cap_q, buf_idx));
>> -		b->pocs[i] = get_poc(buf->field, dpb[i].top_field_order_cnt,
>> -				     dpb[i].bottom_field_order_cnt);
>>  		b->unordered_reflist[b->num_valid] = i;
>>  		b->num_valid++;
>>  	}
>> @@ -448,8 +481,7 @@ build_b_ref_lists(const struct hantro_h264_reflist_builder *builder,
>>  static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
>>  			    const struct v4l2_h264_dpb_entry *b)
>>  {
>> -	return a->top_field_order_cnt == b->top_field_order_cnt &&
>> -	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
>> +	return a->reference_ts == b->reference_ts;
>>  }
>>  
>>  static void update_dpb(struct hantro_ctx *ctx)
>> @@ -463,13 +495,13 @@ static void update_dpb(struct hantro_ctx *ctx)
>>  
>>  	/* Disable all entries by default. */
>>  	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
>> -		ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
>> +		ctx->h264_dec.dpb[i].flags = 0;
>>  
>>  	/* Try to match new DPB entries with existing ones by their POCs. */
>>  	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
>>  		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
>>  
>> -		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
>> +		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_VALID))
>>  			continue;
>>  
>>  		/*
>> @@ -480,8 +512,7 @@ static void update_dpb(struct hantro_ctx *ctx)
>>  			struct v4l2_h264_dpb_entry *cdpb;
>>  
>>  			cdpb = &ctx->h264_dec.dpb[j];
>> -			if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
>> -			    !dpb_entry_match(cdpb, ndpb))
>> +			if (!dpb_entry_match(cdpb, ndpb))
>>  				continue;
>>  
>>  			*cdpb = *ndpb;
>> @@ -541,6 +572,25 @@ struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
>>  	return buf;
>>  }
>>  
>> +dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
>> +					unsigned int dpb_idx)
>> +{
>> +	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
>> +	const struct v4l2_ctrl_h264_decode_params *dec_param = ctx->h264_dec.ctrls.decode;
>> +	const struct v4l2_ctrl_h264_slice_params *slices = ctx->h264_dec.ctrls.slices;
>> +
>> +	struct vb2_buffer *buf = hantro_h264_get_ref_buf(ctx, dpb_idx);
>> +	s32 cur_poc = slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD ?
>> +		      dec_param->bottom_field_order_cnt :
>> +		      dec_param->top_field_order_cnt;
>> +	u32 flags = dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE ? 0x2 : 0;
>> +	flags |= abs(dpb[dpb_idx].top_field_order_cnt - cur_poc) <
>> +		 abs(dpb[dpb_idx].bottom_field_order_cnt - cur_poc) ?
>> +		 0x1 : 0;
>> +
>> +	return vb2_dma_contig_plane_dma_addr(buf, 0) | flags;
>> +}
>> +
>>  int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
>>  {
>>  	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
>> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>> index 8adad8ac9b1d..d58f2a36ca40 100644
>> --- a/drivers/staging/media/hantro/hantro_hw.h
>> +++ b/drivers/staging/media/hantro/hantro_hw.h
>> @@ -86,6 +86,8 @@ struct hantro_h264_dec_hw_ctx {
>>  	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
>>  	struct hantro_h264_dec_reflists reflists;
>>  	struct hantro_h264_dec_ctrls ctrls;
>> +	u32 dpb_longterm;
>> +	u32 dpb_valid;
>>  };
>>  
>>  /**
>> @@ -157,6 +159,8 @@ void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
>>  
>>  struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
>>  					   unsigned int dpb_idx);
>> +dma_addr_t hantro_h264_get_ref_dma_addr(struct hantro_ctx *ctx,
>> +					unsigned int dpb_idx);
>>  int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
>>  void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
>>  int hantro_h264_dec_init(struct hantro_ctx *ctx);


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content
  2019-09-03 14:02       ` Jonas Karlman
@ 2019-09-03 15:01         ` Philipp Zabel
  2019-09-03 19:47           ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Philipp Zabel @ 2019-09-03 15:01 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On Tue, 2019-09-03 at 14:02 +0000, Jonas Karlman wrote:
> On 2019-09-03 15:21, Philipp Zabel wrote:
> > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> > > This need code cleanup and formatting
> > > 
> > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > 
> > The previous patches all work, but this patch breaks decoding of
> > progressive content for me (i.MX8MQ with FFmpeg based on Ezequiel's
> > branch).
> 
> Please try with ffmpeg based on my v4l2-request-hwaccel-4.0.4-hantro branch at [1],
> up to and including the commit "HACK: add dpb flags for reference usage and field picture".
> That commit adds code to set reference flags needed by the changes in this patch.
> 
> There is probably also some other minor difference between our two ffmpeg branches.
> I have not observed any issues with progressive content with this patch and my ffmpeg branch (on a RK3288 device).
> Some H264 reference samples do have visual issues after this patch, however all my real world samples does seem to work.
> 
> My branch use libudev to probe media/video devices and needs to be configured with:
> --enable-v4l2-request --enable-libudev --enable-libdrm
> 
> [1] https://github.com/Kwiboo/FFmpeg/commits/v4l2-request-hwaccel-4.0.4-hantro

I hadn't realized that this is a backwards incompatible change. With
your branch rebased onto n4.2, and this patch applied, decoding seems to
work.

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content
  2019-09-03 15:01         ` Philipp Zabel
@ 2019-09-03 19:47           ` Jonas Karlman
  0 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-03 19:47 UTC (permalink / raw)
  To: Philipp Zabel, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2019-09-03 17:01, Philipp Zabel wrote:
> On Tue, 2019-09-03 at 14:02 +0000, Jonas Karlman wrote:
>> On 2019-09-03 15:21, Philipp Zabel wrote:
>>> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>>>> This need code cleanup and formatting
>>>>
>>>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>>> The previous patches all work, but this patch breaks decoding of
>>> progressive content for me (i.MX8MQ with FFmpeg based on Ezequiel's
>>> branch).
>> Please try with ffmpeg based on my v4l2-request-hwaccel-4.0.4-hantro branch at [1],
>> up to and including the commit "HACK: add dpb flags for reference usage and field picture".
>> That commit adds code to set reference flags needed by the changes in this patch.
>>
>> There is probably also some other minor difference between our two ffmpeg branches.
>> I have not observed any issues with progressive content with this patch and my ffmpeg branch (on a RK3288 device).
>> Some H264 reference samples do have visual issues after this patch, however all my real world samples does seem to work.
>>
>> My branch use libudev to probe media/video devices and needs to be configured with:
>> --enable-v4l2-request --enable-libudev --enable-libdrm
>>
>> [1] https://github.com/Kwiboo/FFmpeg/commits/v4l2-request-hwaccel-4.0.4-hantro
> I hadn't realized that this is a backwards incompatible change. With
> your branch rebased onto n4.2, and this patch applied, decoding seems to
> work.

Nor did I, thanks for testing and verifying, I will try to reduce breaking changes in a v2.

Regards,
Jonas

>
> regards
> Philipp


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset
  2019-09-03 10:58     ` Philipp Zabel
@ 2019-09-03 20:13       ` Jonas Karlman
  0 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-03 20:13 UTC (permalink / raw)
  To: Philipp Zabel, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2019-09-03 12:58, Philipp Zabel wrote:
> Hi Jonas,
>
> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>> A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
>> and is laid out in memory as follow:
> Do you mean "A decoded 8-bit 4:2:0 frame needs up to 448 bytes per
> macroblock"?
>
> A 1280x720 frame already consists of 3600 macroblocks (each 16x16 Y +
> 2x8x8 Cb,Cr).

You are correct, thanks for pointing out, I will change in a v2.

>
>> +-------------------+
>>> Y-plane   256 MBs |
> So that looks like it should be 256 bytes * number of macroblocks
> instead, same for the following two.

Ack.

>
>> +-------------------+
>>> UV-plane  128 MBs |
>> +-------------------+
>>> MV buffer  64 MBs |
>> +-------------------+
>>
>> The motion vector buffer offset is currently correct for 4:2:0 because
>> the extra space for motion vectors is overallocated with an extra 64 MBs.
>>
>> Wrong offset for both destination and motion vector buffer are used
>> for the bottom field of field encoded content, wrong offset is
>> also used for 4:0:0 (monochrome) content.
>>
>> Fix this by always setting the motion vector address to the expected
>> 384 MBs offset for 4:2:0 and 256 MBs offset for 4:0:0 content.
> Expected by whom? For example, could these be placed in separate buffers
> instead of appended to the VB2 allocated buffers?

From what I understand main and high profile decoding have hw constraints in that
the direct mode motion vectors buffer must be located continuously after the YUV buffer.

I also observed instances where the current requirement for profile_idc > 66 caused issues
for some streams, e.g. big_buck_bunny_1080p_H264_AAC_25fps_7200K.mp4

Because of this it was just easier to always configure the motion vector buffer address.

>
>> Also use correct destination and motion vector buffer offset
>> for the bottom field of field encoded content.
>>
>> While at it also extend the check for 4:0:0 (monochrome) to include an
>> additional check for High Profile (100).
>>
>> Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>> ---
>>  .../staging/media/hantro/hantro_g1_h264_dec.c | 33 +++++++++++--------
>>  1 file changed, 19 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> index 7ab534936843..159bd67e0a36 100644
>> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> @@ -19,6 +19,9 @@
>>  #include "hantro_hw.h"
>>  #include "hantro_v4l2.h"
>>  
>> +#define MV_OFFSET_420	384
>> +#define MV_OFFSET_400	256
>> +
>>  static void set_params(struct hantro_ctx *ctx)
>>  {
>>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
>> @@ -49,8 +52,8 @@ static void set_params(struct hantro_ctx *ctx)
>>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
>>  
>>  	/* Decoder control register 1. */
>> -	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
>> -	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
>> +	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
>> +	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |
>>  	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
>>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
>>  
>> @@ -79,7 +82,7 @@ static void set_params(struct hantro_ctx *ctx)
>>  		reg |= G1_REG_DEC_CTRL4_CABAC_E;
>>  	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
>>  		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
>> -	if (sps->chroma_format_idc == 0)
>> +	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
>>  		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
>>  	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
>>  		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
>> @@ -233,6 +236,7 @@ static void set_buffers(struct hantro_ctx *ctx)
>>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>>  	struct hantro_dev *vpu = ctx->dev;
>>  	dma_addr_t src_dma, dst_dma;
>> +	unsigned int offset = MV_OFFSET_420;
>>  
>>  	src_buf = hantro_get_src_buf(ctx);
>>  	dst_buf = hantro_get_dst_buf(ctx);
>> @@ -243,19 +247,20 @@ static void set_buffers(struct hantro_ctx *ctx)
>>  
>>  	/* Destination (decoded frame) buffer. */
>>  	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
>> +	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
>> +		dst_dma += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
> How does this work? Does userspace decode two fields into the same
> capture buffer and the hardware writes each field with a stride of 2
> lines? I suppose this corresponds to V4L2_FIELD_INTERLACED. Could this
> also be made to support V4L2_FIELD_SEQ_TB output?

Yes, both fields are decoded into the same capture buffer, top field to odd numbered lines
and bottom field to even numbered lines, so I guess this corresponds to V4L2_FIELD_INTERLACED.
This is also how the cedrus driver handles field decoding with Jernej's h264 patches.

I do not know if it is possible to configure the hw to decode into a V4L2_FIELD_SEQ_TB type output.

Regards,
Jonas

>
> regards
> Philipp


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288
  2019-09-01 12:45 ` [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288 Jonas Karlman
@ 2019-09-04 13:07   ` Ezequiel Garcia
  2019-09-09 19:25     ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2019-09-04 13:07 UTC (permalink / raw)
  To: Jonas Karlman
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

Hello Jonas,

Thank you for the patch.

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> TRM specify supported image size 48x48 to 4096x2304 at step size 16 pixels,
> change frmsize max_width/max_height to match TRM.
> 

The RK3288 TRM v1.1 (2015-8-20) I have here mentions a maximum
of 3840x2160.

I must admit I haven't tested with actual content this size
to verify it, have you checked it?

Thanks,
Ezequiel
 
> Fixes: 760327930e10 ("media: hantro: Enable H264 decoding on rk3288")
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  drivers/staging/media/hantro/rk3288_vpu_hw.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
> index 6bfcc47d1e58..ebb017b8a334 100644
> --- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
> +++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
> @@ -67,10 +67,10 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
>  		.max_depth = 2,
>  		.frmsize = {
>  			.min_width = 48,
> -			.max_width = 3840,
> +			.max_width = 4096,
>  			.step_width = H264_MB_DIM,
>  			.min_height = 48,
> -			.max_height = 2160,
> +			.max_height = 2304,
>  			.step_height = H264_MB_DIM,
>  		},
>  	},



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288
  2019-09-04 13:07   ` Ezequiel Garcia
@ 2019-09-09 19:25     ` Jonas Karlman
  0 siblings, 0 replies; 45+ messages in thread
From: Jonas Karlman @ 2019-09-09 19:25 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

On 2019-09-04 15:07, Ezequiel Garcia wrote:
> Hello Jonas,
>
> Thank you for the patch.
>
> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>> TRM specify supported image size 48x48 to 4096x2304 at step size 16 pixels,
>> change frmsize max_width/max_height to match TRM.
>>
> The RK3288 TRM v1.1 (2015-8-20) I have here mentions a maximum
> of 3840x2160.
>
> I must admit I haven't tested with actual content this size
> to verify it, have you checked it?

I can confirm that one of my test samples (PUPPIES BATH IN 4K) is 4096x2304 and can be decoded after this patch.
However the decoding speed is not optimal at 400Mhz, if I recall correctly you need to set the VPU1 clock to 600Mhz for 4K decoding on RK3288.

The RK3288 TRM vcodec chapter from [1], unknown revision and date, lists 48x48 to 4096x2304 step size 16 pixels under 25.5.1 H.264 decoder.

[1] http://www.t-firefly.com/download/firefly-rk3288/docs/TRM/rk3288-chapter-25-video-encoder-decoder-unit-(vcodec).pdf

Regards,
Jonas

>
> Thanks,
> Ezequiel
>  
>> Fixes: 760327930e10 ("media: hantro: Enable H264 decoding on rk3288")
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>> ---
>>  drivers/staging/media/hantro/rk3288_vpu_hw.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
>> index 6bfcc47d1e58..ebb017b8a334 100644
>> --- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
>> +++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
>> @@ -67,10 +67,10 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
>>  		.max_depth = 2,
>>  		.frmsize = {
>>  			.min_width = 48,
>> -			.max_width = 3840,
>> +			.max_width = 4096,
>>  			.step_width = H264_MB_DIM,
>>  			.min_height = 48,
>> -			.max_height = 2160,
>> +			.max_height = 2304,
>>  			.step_height = H264_MB_DIM,
>>  		},
>>  	},
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/12] media: hantro: Do not reorder H264 scaling list
  2019-09-02 16:18       ` Jonas Karlman
  2019-09-03  7:54         ` Jonas Karlman
  2019-09-03  9:56         ` Philipp Zabel
@ 2019-09-10 10:14         ` Ezequiel Garcia
  2 siblings, 0 replies; 45+ messages in thread
From: Ezequiel Garcia @ 2019-09-10 10:14 UTC (permalink / raw)
  To: Jonas Karlman, Philipp Zabel
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

Hi Jonas,

Thanks for your patch.

On Mon, 2019-09-02 at 16:18 +0000, Jonas Karlman wrote:
[..]
> 
> > > diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
> > > index 0d758e0c0f99..e2d01145ac4f 100644
> > > --- a/drivers/staging/media/hantro/hantro_h264.c
> > > +++ b/drivers/staging/media/hantro/hantro_h264.c
> > > @@ -20,7 +20,7 @@
> > >  /* Size with u32 units. */
> > >  #define CABAC_INIT_BUFFER_SIZE		(460 * 2)
> > >  #define POC_BUFFER_SIZE			34
> > > -#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
> > > +#define SCALING_LIST_SIZE		(6 * 16 + 2 * 64)
> > This changes the size of struct hantro_h264_dec_priv_tbl. Did this
> > describe the auxiliary buffer format incorrectly before?
> 
> Based on RKMPP and Hantro SDK the HW expects the 8x8 inter/intra list for
> Y-component to be located at indices 0 and 1, lists for Cr/Cb is only used for
> 4:4:4 and HW only supports 4:0:0/4:2:0 if I am not mistaken. So the unused
> extra 4 lists at the end of the auxiliary buffer seemed like a waste,
> also RKMPP and Hantro SDK only seemed to allocate space for 2 lists.
> 

I think it would make a lot of sense to document what the hardware
expects somewhere, perhaps as part of the struct hantro_h264_dec_priv_tbl
documentation?

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset
  2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
  2019-09-03 10:58     ` Philipp Zabel
@ 2019-09-10 10:18     ` Ezequiel Garcia
  2019-09-10 11:34     ` Ezequiel Garcia
  2 siblings, 0 replies; 45+ messages in thread
From: Ezequiel Garcia @ 2019-09-10 10:18 UTC (permalink / raw)
  To: Jonas Karlman
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

Hi Jonas,

Thanks for fixing this, I'm happy we are reducing the amount
of black magic here.

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
> and is laid out in memory as follow:
> 
> +-------------------+
> > Y-plane   256 MBs |
> +-------------------+
> > UV-plane  128 MBs |
> +-------------------+
> > MV buffer  64 MBs |
> +-------------------+
> 
> The motion vector buffer offset is currently correct for 4:2:0 because
> the extra space for motion vectors is overallocated with an extra 64 MBs.
> 
> Wrong offset for both destination and motion vector buffer are used
> for the bottom field of field encoded content, wrong offset is
> also used for 4:0:0 (monochrome) content.
> 
> Fix this by always setting the motion vector address to the expected
> 384 MBs offset for 4:2:0 and 256 MBs offset for 4:0:0 content.
> 
> Also use correct destination and motion vector buffer offset
> for the bottom field of field encoded content.
> 
> While at it also extend the check for 4:0:0 (monochrome) to include an
> additional check for High Profile (100).
> 

As with the scaling list, I believe it would make a lot of sense
to document this in the driver itself.

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset
  2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
  2019-09-03 10:58     ` Philipp Zabel
  2019-09-10 10:18     ` Ezequiel Garcia
@ 2019-09-10 11:34     ` Ezequiel Garcia
  2 siblings, 0 replies; 45+ messages in thread
From: Ezequiel Garcia @ 2019-09-10 11:34 UTC (permalink / raw)
  To: Jonas Karlman
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

A few more comments...

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> A decoded 8-bit 4:2:0 frame need memory for up to 448 macroblocks
> and is laid out in memory as follow:
> 
> +-------------------+
> > Y-plane   256 MBs |
> +-------------------+
> > UV-plane  128 MBs |
> +-------------------+
> > MV buffer  64 MBs |
> +-------------------+
> 
> The motion vector buffer offset is currently correct for 4:2:0 because
> the extra space for motion vectors is overallocated with an extra 64 MBs.
> 
> Wrong offset for both destination and motion vector buffer are used
> for the bottom field of field encoded content, wrong offset is
> also used for 4:0:0 (monochrome) content.
> 
> Fix this by always setting the motion vector address to the expected
> 384 MBs offset for 4:2:0 and 256 MBs offset for 4:0:0 content.
> 
> Also use correct destination and motion vector buffer offset
> for the bottom field of field encoded content.
> 
> While at it also extend the check for 4:0:0 (monochrome) to include an
> additional check for High Profile (100).
> 
> Fixes: dea0a82f3d22 ("media: hantro: Add support for H264 decoding on G1")
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  .../staging/media/hantro/hantro_g1_h264_dec.c | 33 +++++++++++--------
>  1 file changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 7ab534936843..159bd67e0a36 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -19,6 +19,9 @@
>  #include "hantro_hw.h"
>  #include "hantro_v4l2.h"
>  
> +#define MV_OFFSET_420	384
> +#define MV_OFFSET_400	256
> +

Instead of introducing these macros, I'd just use the macroblock width
and height ones explicitly. This way it's more clear where is
the code coming from.

>  static void set_params(struct hantro_ctx *ctx)
>  {
>  	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
> @@ -49,8 +52,8 @@ static void set_params(struct hantro_ctx *ctx)
>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
>  
>  	/* Decoder control register 1. */
> -	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
> -	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
> +	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(H264_MB_WIDTH(ctx->dst_fmt.width)) |
> +	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(H264_MB_HEIGHT(ctx->dst_fmt.height)) |

This is a nice fix, but unless I'm missing something it's unrelated to this patch.
 
>  	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
>  	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
>  
> @@ -79,7 +82,7 @@ static void set_params(struct hantro_ctx *ctx)
>  		reg |= G1_REG_DEC_CTRL4_CABAC_E;
>  	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
>  		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
> -	if (sps->chroma_format_idc == 0)
> +	if (sps->profile_idc >= 100 && sps->chroma_format_idc == 0)
>  		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
>  	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
>  		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
> @@ -233,6 +236,7 @@ static void set_buffers(struct hantro_ctx *ctx)
>  	struct vb2_v4l2_buffer *src_buf, *dst_buf;
>  	struct hantro_dev *vpu = ctx->dev;
>  	dma_addr_t src_dma, dst_dma;
> +	unsigned int offset = MV_OFFSET_420;
>  
>  	src_buf = hantro_get_src_buf(ctx);
>  	dst_buf = hantro_get_dst_buf(ctx);
> @@ -243,19 +247,20 @@ static void set_buffers(struct hantro_ctx *ctx)
>  
>  	/* Destination (decoded frame) buffer. */
>  	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
> +	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> +		dst_dma += ALIGN(ctx->dst_fmt.width, H264_MB_DIM);
>  	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
>  
> -	/* Higher profiles require DMV buffer appended to reference frames. */
> -	if (ctrls->sps->profile_idc > 66) {
> -		size_t pic_size = ctx->h264_dec.pic_size;
> -		size_t mv_offset = round_up(pic_size, 8);
> -
> -		if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> -			mv_offset += 32 * H264_MB_WIDTH(ctx->dst_fmt.width);
> -
> -		vdpu_write_relaxed(vpu, dst_dma + mv_offset,
> -				   G1_REG_ADDR_DIR_MV);
> -	}
> +	/* Motion vector buffer is located after the decoded frame. */
> +	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);

I would try to rework the code to avoid calling
vb2_dma_contig_plane_dma_addr() again.

> +	if (ctrls->sps->profile_idc >= 100 && ctrls->sps->chroma_format_idc == 0)
> +		offset = MV_OFFSET_400;
> +	dst_dma += offset * H264_MB_WIDTH(ctx->dst_fmt.width) *
> +		   H264_MB_HEIGHT(ctx->dst_fmt.height);

Perhaps rename 'offset' to something different? Maybe bytes_per_mb
or similar.

> +	if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
> +		dst_dma += 32 * H264_MB_WIDTH(ctx->dst_fmt.width) *
> +			   H264_MB_HEIGHT(ctx->dst_fmt.height);

While here, could you replace this 32 magic number with some
meaningful macro?

> +	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DIR_MV);
>  
>  	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
>  	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);

Thanks a lot,
Ezequiel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2019-09-01 12:45   ` [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags Jonas Karlman
@ 2020-07-10  4:21     ` Ezequiel Garcia
  2020-07-10  8:13       ` Boris Brezillon
  0 siblings, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2020-07-10  4:21 UTC (permalink / raw)
  To: Jonas Karlman
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Boris Brezillon,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

Hello Jonas,

In the context of the uAPI cleanup,
I'm revisiting this patch.

On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> Add DPB entry flags to help indicate when a reference frame is a field picture
> and how the DPB entry is referenced, top or bottom field or full frame.
> 
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>  include/media/h264-ctrls.h                       |  4 ++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index bc5dd8e76567..eb6c32668ad7 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>        - 0x00000004
>        - The DPB entry is a long term reference frame
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> +      - 0x00000008
> +      - The DPB entry is a field picture
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> +      - 0x00000010
> +      - The DPB entry is a top field reference
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> +      - 0x00000020
> +      - The DPB entry is a bottom field reference
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> +      - 0x00000030
> +      - The DPB entry is a reference frame
>  
>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>      Specifies the decoding mode to use. Currently exposes slice-based and
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index e877bf1d537c..76020ebd1e6c 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
>  

I've been going thru the H264 spec and I'm unsure,
are all these flags semantically needed?

For instance, if one of REF_BOTTOM or REF_TOP (or both)
are set, doesn't that indicate it's a field picture?

Or conversely, if neither REF_BOTTOM or REF_TOP are set,
then it's a frame picture?

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10  4:21     ` Ezequiel Garcia
@ 2020-07-10  8:13       ` Boris Brezillon
  2020-07-10  8:48         ` Jonas Karlman
  2020-07-10 11:50         ` Ezequiel Garcia
  0 siblings, 2 replies; 45+ messages in thread
From: Boris Brezillon @ 2020-07-10  8:13 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Jonas Karlman, Mauro Carvalho Chehab, Hans Verkuil,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

On Fri, 10 Jul 2020 01:21:07 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:

> Hello Jonas,
> 
> In the context of the uAPI cleanup,
> I'm revisiting this patch.
> 
> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> > Add DPB entry flags to help indicate when a reference frame is a field picture
> > and how the DPB entry is referenced, top or bottom field or full frame.
> > 
> > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > ---
> >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> >  include/media/h264-ctrls.h                       |  4 ++++
> >  2 files changed, 16 insertions(+)
> > 
> > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > index bc5dd8e76567..eb6c32668ad7 100644
> > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> >        - 0x00000004
> >        - The DPB entry is a long term reference frame
> > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > +      - 0x00000008
> > +      - The DPB entry is a field picture
> > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > +      - 0x00000010
> > +      - The DPB entry is a top field reference
> > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > +      - 0x00000020
> > +      - The DPB entry is a bottom field reference
> > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > +      - 0x00000030
> > +      - The DPB entry is a reference frame
> >  
> >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> >      Specifies the decoding mode to use. Currently exposes slice-based and
> > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > index e877bf1d537c..76020ebd1e6c 100644
> > --- a/include/media/h264-ctrls.h
> > +++ b/include/media/h264-ctrls.h
> > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> >    
> 
> I've been going thru the H264 spec and I'm unsure,
> are all these flags semantically needed?
> 
> For instance, if one of REF_BOTTOM or REF_TOP (or both)
> are set, doesn't that indicate it's a field picture?
> 
> Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> then it's a frame picture?

I think that's what I was trying to do here [1]

[1]https://patchwork.kernel.org/patch/11392095/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10  8:13       ` Boris Brezillon
@ 2020-07-10  8:48         ` Jonas Karlman
  2020-07-10 12:18           ` Ezequiel Garcia
  2020-07-10 11:50         ` Ezequiel Garcia
  1 sibling, 1 reply; 45+ messages in thread
From: Jonas Karlman @ 2020-07-10  8:48 UTC (permalink / raw)
  To: Boris Brezillon, Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2020-07-10 10:13, Boris Brezillon wrote:
> On Fri, 10 Jul 2020 01:21:07 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
>> Hello Jonas,
>>
>> In the context of the uAPI cleanup,
>> I'm revisiting this patch.
>>
>> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
>>> Add DPB entry flags to help indicate when a reference frame is a field picture
>>> and how the DPB entry is referenced, top or bottom field or full frame.
>>>
>>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>>> ---
>>>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>>>  include/media/h264-ctrls.h                       |  4 ++++
>>>  2 files changed, 16 insertions(+)
>>>
>>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> index bc5dd8e76567..eb6c32668ad7 100644
>>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>>>        - 0x00000004
>>>        - The DPB entry is a long term reference frame
>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
>>> +      - 0x00000008
>>> +      - The DPB entry is a field picture
>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
>>> +      - 0x00000010
>>> +      - The DPB entry is a top field reference
>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
>>> +      - 0x00000020
>>> +      - The DPB entry is a bottom field reference
>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
>>> +      - 0x00000030
>>> +      - The DPB entry is a reference frame
>>>  
>>>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>>>      Specifies the decoding mode to use. Currently exposes slice-based and
>>> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
>>> index e877bf1d537c..76020ebd1e6c 100644
>>> --- a/include/media/h264-ctrls.h
>>> +++ b/include/media/h264-ctrls.h
>>> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>>>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>>>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>>>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
>>> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
>>>    
>>
>> I've been going thru the H264 spec and I'm unsure,
>> are all these flags semantically needed?
>>
>> For instance, if one of REF_BOTTOM or REF_TOP (or both)
>> are set, doesn't that indicate it's a field picture?

These flags would only indicate how the frame / field pair / field is
referenced and not if the DPB entry was decoded as a frame or field pair.

Both hantro and rkvdec needs to know how the referenced frame / field pair
was decoded (not how it is referenced), my best guess is that MV is stored
differently for a frame (linear) and field pair (buffer split in two).

I think we should be able to track how the buffer was decoded similar to
how VP9 keep track of buffer width/height.

When I played with interlaced decoding of rkvdec a few weeks ago I
reverted flags to something similar as my initial rfc patch, see [1].
I guess it should be possible to keep current flags and track field_pic
in driver, some macro to simplify check for top/bottom ref could be
useful if flags is kept as-is.

I am hoping to find some time next week to revisit hantro interlaced
and refine rkvdec interlaced support.

[1] https://github.com/Kwiboo/linux-rockchip/compare/da52ca6f8d2284aebea2d0b99d254b64922faa2d...c9f04cd9bc65eda0da713f4ce1c77eeb1960bd70

Regards,
Jonas

>>
>> Or conversely, if neither REF_BOTTOM or REF_TOP are set,
>> then it's a frame picture?
> 
> I think that's what I was trying to do here [1]
> 
> [1]https://patchwork.kernel.org/patch/11392095/
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10  8:13       ` Boris Brezillon
  2020-07-10  8:48         ` Jonas Karlman
@ 2020-07-10 11:50         ` Ezequiel Garcia
  2020-07-10 12:05           ` Boris Brezillon
  1 sibling, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2020-07-10 11:50 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Jonas Karlman, Mauro Carvalho Chehab, Hans Verkuil,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> On Fri, 10 Jul 2020 01:21:07 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
> > Hello Jonas,
> > 
> > In the context of the uAPI cleanup,
> > I'm revisiting this patch.
> > 
> > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> > > Add DPB entry flags to help indicate when a reference frame is a field picture
> > > and how the DPB entry is referenced, top or bottom field or full frame.
> > > 
> > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > ---
> > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > >  include/media/h264-ctrls.h                       |  4 ++++
> > >  2 files changed, 16 insertions(+)
> > > 
> > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > index bc5dd8e76567..eb6c32668ad7 100644
> > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > >        - 0x00000004
> > >        - The DPB entry is a long term reference frame
> > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > +      - 0x00000008
> > > +      - The DPB entry is a field picture
> > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > +      - 0x00000010
> > > +      - The DPB entry is a top field reference
> > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > +      - 0x00000020
> > > +      - The DPB entry is a bottom field reference
> > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > +      - 0x00000030
> > > +      - The DPB entry is a reference frame
> > >  
> > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > >      Specifies the decoding mode to use. Currently exposes slice-based and
> > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > index e877bf1d537c..76020ebd1e6c 100644
> > > --- a/include/media/h264-ctrls.h
> > > +++ b/include/media/h264-ctrls.h
> > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > >    
> > 
> > I've been going thru the H264 spec and I'm unsure,
> > are all these flags semantically needed?
> > 
> > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > are set, doesn't that indicate it's a field picture?
> > 
> > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > then it's a frame picture?
> 
> I think that's what I was trying to do here [1]
> 
> [1]https://patchwork.kernel.org/patch/11392095/

Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?

If I understand correctly, the DPB can contain:

* frames (FLAG_FIELD not set)
* a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
* a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).

Ezequiel



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10 11:50         ` Ezequiel Garcia
@ 2020-07-10 12:05           ` Boris Brezillon
  2020-07-10 12:25             ` Ezequiel Garcia
  0 siblings, 1 reply; 45+ messages in thread
From: Boris Brezillon @ 2020-07-10 12:05 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Jonas Karlman, Mauro Carvalho Chehab, Hans Verkuil,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

On Fri, 10 Jul 2020 08:50:28 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:

> On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > On Fri, 10 Jul 2020 01:21:07 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> >   
> > > Hello Jonas,
> > > 
> > > In the context of the uAPI cleanup,
> > > I'm revisiting this patch.
> > > 
> > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > Add DPB entry flags to help indicate when a reference frame is a field picture
> > > > and how the DPB entry is referenced, top or bottom field or full frame.
> > > > 
> > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > ---
> > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > >  2 files changed, 16 insertions(+)
> > > > 
> > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > >        - 0x00000004
> > > >        - The DPB entry is a long term reference frame
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > +      - 0x00000008
> > > > +      - The DPB entry is a field picture
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > +      - 0x00000010
> > > > +      - The DPB entry is a top field reference
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > +      - 0x00000020
> > > > +      - The DPB entry is a bottom field reference
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > +      - 0x00000030
> > > > +      - The DPB entry is a reference frame
> > > >  
> > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > >      Specifies the decoding mode to use. Currently exposes slice-based and
> > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > --- a/include/media/h264-ctrls.h
> > > > +++ b/include/media/h264-ctrls.h
> > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > >      
> > > 
> > > I've been going thru the H264 spec and I'm unsure,
> > > are all these flags semantically needed?
> > > 
> > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > are set, doesn't that indicate it's a field picture?
> > > 
> > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > then it's a frame picture?  
> > 
> > I think that's what I was trying to do here [1]
> > 
> > [1]https://patchwork.kernel.org/patch/11392095/  
> 
> Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> 
> If I understand correctly, the DPB can contain:
> 
> * frames (FLAG_FIELD not set)
> * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).

Well, my understand is that, if the buffer contains both a TOP and
BOTTOM field, it actually becomes a full frame, so you actually have
those cases:

* FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
  decoded buffer can become of frame if it's complemented with the
  missing field later during the decoding)
* FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
* FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
* FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination

but I might be wrong.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10  8:48         ` Jonas Karlman
@ 2020-07-10 12:18           ` Ezequiel Garcia
  0 siblings, 0 replies; 45+ messages in thread
From: Ezequiel Garcia @ 2020-07-10 12:18 UTC (permalink / raw)
  To: Jonas Karlman, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On Fri, 2020-07-10 at 08:48 +0000, Jonas Karlman wrote:
> On 2020-07-10 10:13, Boris Brezillon wrote:
> > On Fri, 10 Jul 2020 01:21:07 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > 
> > > Hello Jonas,
> > > 
> > > In the context of the uAPI cleanup,
> > > I'm revisiting this patch.
> > > 
> > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:
> > > > Add DPB entry flags to help indicate when a reference frame is a field picture
> > > > and how the DPB entry is referenced, top or bottom field or full frame.
> > > > 
> > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > ---
> > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > >  2 files changed, 16 insertions(+)
> > > > 
> > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > >        - 0x00000004
> > > >        - The DPB entry is a long term reference frame
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > +      - 0x00000008
> > > > +      - The DPB entry is a field picture
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > +      - 0x00000010
> > > > +      - The DPB entry is a top field reference
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > +      - 0x00000020
> > > > +      - The DPB entry is a bottom field reference
> > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > +      - 0x00000030
> > > > +      - The DPB entry is a reference frame
> > > >  
> > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > >      Specifies the decoding mode to use. Currently exposes slice-based and
> > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > --- a/include/media/h264-ctrls.h
> > > > +++ b/include/media/h264-ctrls.h
> > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > >    
> > > 
> > > I've been going thru the H264 spec and I'm unsure,
> > > are all these flags semantically needed?
> > > 
> > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > are set, doesn't that indicate it's a field picture?
> 
> These flags would only indicate how the frame / field pair / field is
> referenced and not if the DPB entry was decoded as a frame or field pair.
> 

I believe _how_ the picture is referenced shouldn't (or can't?) be signaled
in the DPB representation. It seems Jernej's [1] which properly adds a
flag for each entry in ref_pic_list0 is the right way.

https://patchwork.linuxtv.org/patch/64289/

> Both hantro and rkvdec needs to know how the referenced frame / field pair
> was decoded (not how it is referenced), my best guess is that MV is stored
> differently for a frame (linear) and field pair (buffer split in two).
> 
> I think we should be able to track how the buffer was decoded similar to
> how VP9 keep track of buffer width/height.
> 
> When I played with interlaced decoding of rkvdec a few weeks ago I
> reverted flags to something similar as my initial rfc patch, see [1].
> I guess it should be possible to keep current flags and track field_pic
> in driver, some macro to simplify check for top/bottom ref could be
> useful if flags is kept as-is.
> 
> I am hoping to find some time next week to revisit hantro interlaced
> and refine rkvdec interlaced support.
> 
> [1] https://github.com/Kwiboo/linux-rockchip/compare/da52ca6f8d2284aebea2d0b99d254b64922faa2d...c9f04cd9bc65eda0da713f4ce1c77eeb1960bd70
> 

Yup, I noticed this and it's why I started looking at the uAPI side
of the DPB.

It seems to me all we are missing is further clarification
of the meaning of each DPB_ENTRY_FLAG (possibly adding/removing
flags).

From this snippet:

		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD_PIC)
			refer_addr |= RKVDEC_FIELD_REF;
		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_TOP_REF)
			refer_addr |= RKVDEC_TOPFIELD_USED_REF;
		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_BOTTOM_REF)
			refer_addr |= RKVDEC_BOTFIELD_USED_REF;
		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
			refer_addr |= RKVDEC_COLMV_USED_FLAG_REF;

Which of these flags are needed, i.e. which are required to fully
describe a picture stored in the DPB?

Also, since we are here, I wonder what is exactly RKVDEC COLMV
and what's the condition for RKVDEC_COLMV_USED_FLAG_REF.

Thanks a lot!
Ezequiel

> Regards,
> Jonas
> 
> > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > then it's a frame picture?
> > 
> > I think that's what I was trying to do here [1]
> > 
> > [1]https://patchwork.kernel.org/patch/11392095/
> > 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10 12:05           ` Boris Brezillon
@ 2020-07-10 12:25             ` Ezequiel Garcia
  2020-07-10 21:49               ` Nicolas Dufresne
  0 siblings, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2020-07-10 12:25 UTC (permalink / raw)
  To: Boris Brezillon, nicolas.dufresne
  Cc: Jonas Karlman, Mauro Carvalho Chehab, Hans Verkuil,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

+Nicolas

On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
> On Fri, 10 Jul 2020 08:50:28 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
> > On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > > On Fri, 10 Jul 2020 01:21:07 -0300
> > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > >   
> > > > Hello Jonas,
> > > > 
> > > > In the context of the uAPI cleanup,
> > > > I'm revisiting this patch.
> > > > 
> > > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > > Add DPB entry flags to help indicate when a reference frame is a field picture
> > > > > and how the DPB entry is referenced, top or bottom field or full frame.
> > > > > 
> > > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > > ---
> > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > > >  2 files changed, 16 insertions(+)
> > > > > 
> > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > @@ -2022,6 +2022,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > > >        - 0x00000004
> > > > >        - The DPB entry is a long term reference frame
> > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > > +      - 0x00000008
> > > > > +      - The DPB entry is a field picture
> > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > > +      - 0x00000010
> > > > > +      - The DPB entry is a top field reference
> > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > > +      - 0x00000020
> > > > > +      - The DPB entry is a bottom field reference
> > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > > +      - 0x00000030
> > > > > +      - The DPB entry is a reference frame
> > > > >  
> > > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > > >      Specifies the decoding mode to use. Currently exposes slice-based and
> > > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > > --- a/include/media/h264-ctrls.h
> > > > > +++ b/include/media/h264-ctrls.h
> > > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > > >      
> > > > 
> > > > I've been going thru the H264 spec and I'm unsure,
> > > > are all these flags semantically needed?
> > > > 
> > > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > > are set, doesn't that indicate it's a field picture?
> > > > 
> > > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > > then it's a frame picture?  
> > > 
> > > I think that's what I was trying to do here [1]
> > > 
> > > [1]https://patchwork.kernel.org/patch/11392095/  
> > 
> > Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> > 
> > If I understand correctly, the DPB can contain:
> > 
> > * frames (FLAG_FIELD not set)
> > * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> > * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
> 
> Well, my understand is that, if the buffer contains both a TOP and
> BOTTOM field, it actually becomes a full frame, so you actually have
> those cases:
> 
> * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
>   decoded buffer can become of frame if it's complemented with the
>   missing field later during the decoding)
> * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
> * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
> * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination
> 
> but I might be wrong.

Yes, perhaps that's correct. I was trying to think strictly
in terms of the H264 semantics, to define a clean interface.

From the mpp code, looks like the above is enough for rkvdec
(although I haven't done any tests).

Ezequiel




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10 12:25             ` Ezequiel Garcia
@ 2020-07-10 21:49               ` Nicolas Dufresne
  2020-07-11 10:21                 ` Jonas Karlman
  0 siblings, 1 reply; 45+ messages in thread
From: Nicolas Dufresne @ 2020-07-10 21:49 UTC (permalink / raw)
  To: Ezequiel Garcia, Boris Brezillon
  Cc: Jonas Karlman, Mauro Carvalho Chehab, Hans Verkuil,
	Philipp Zabel, Paul Kocialkowski, linux-media, linux-rockchip,
	linux-kernel

Le vendredi 10 juillet 2020 à 09:25 -0300, Ezequiel Garcia a écrit :
> +Nicolas
> 
> On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
> > On Fri, 10 Jul 2020 08:50:28 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > 
> > > On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > > > On Fri, 10 Jul 2020 01:21:07 -0300
> > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > >   
> > > > > Hello Jonas,
> > > > > 
> > > > > In the context of the uAPI cleanup,
> > > > > I'm revisiting this patch.
> > > > > 
> > > > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > > > Add DPB entry flags to help indicate when a reference frame is a
> > > > > > field picture
> > > > > > and how the DPB entry is referenced, top or bottom field or full
> > > > > > frame.
> > > > > > 
> > > > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > > > ---
> > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > > > >  2 files changed, 16 insertions(+)
> > > > > > 
> > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > @@ -2022,6 +2022,18 @@ enum
> > > > > > v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > > > >        - 0x00000004
> > > > > >        - The DPB entry is a long term reference frame
> > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > > > +      - 0x00000008
> > > > > > +      - The DPB entry is a field picture
> > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > > > +      - 0x00000010
> > > > > > +      - The DPB entry is a top field reference
> > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > > > +      - 0x00000020
> > > > > > +      - The DPB entry is a bottom field reference
> > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > > > +      - 0x00000030
> > > > > > +      - The DPB entry is a reference frame
> > > > > >  
> > > > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > > > >      Specifies the decoding mode to use. Currently exposes slice-
> > > > > > based and
> > > > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > > > --- a/include/media/h264-ctrls.h
> > > > > > +++ b/include/media/h264-ctrls.h
> > > > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > > > >      
> > > > > 
> > > > > I've been going thru the H264 spec and I'm unsure,
> > > > > are all these flags semantically needed?
> > > > > 
> > > > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > > > are set, doesn't that indicate it's a field picture?
> > > > > 
> > > > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > > > then it's a frame picture?  
> > > > 
> > > > I think that's what I was trying to do here [1]
> > > > 
> > > > [1]https://patchwork.kernel.org/patch/11392095/  
> > > 
> > > Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> > > 
> > > If I understand correctly, the DPB can contain:
> > > 
> > > * frames (FLAG_FIELD not set)
> > > * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> > > * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
> > 
> > Well, my understand is that, if the buffer contains both a TOP and
> > BOTTOM field, it actually becomes a full frame, so you actually have
> > those cases:
> > 
> > * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
> >   decoded buffer can become of frame if it's complemented with the
> >   missing field later during the decoding)
> > * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
> > * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
> > * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination

Let's admit, while this work, it's odd. Can we just move to that instewad ?

  FLAG_TOP_FIELD
  FLAG_BOTTOM_FIELD
  FLAG_FRAME = (FLAG_TOP_FIELD | FLAG_BOTTOM_FIELD)

So it can be used as a flag, but also is a proper enum and there is no longer an
invalid combination.
  
> > 
> > but I might be wrong.
> 
> Yes, perhaps that's correct. I was trying to think strictly
> in terms of the H264 semantics, to define a clean interface.
> 
> From the mpp code, looks like the above is enough for rkvdec
> (although I haven't done any tests).
> 
> Ezequiel
> 
> 
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-10 21:49               ` Nicolas Dufresne
@ 2020-07-11 10:21                 ` Jonas Karlman
  2020-07-11 18:36                   ` Nicolas Dufresne
  2020-07-12 22:59                   ` Ezequiel Garcia
  0 siblings, 2 replies; 45+ messages in thread
From: Jonas Karlman @ 2020-07-11 10:21 UTC (permalink / raw)
  To: Nicolas Dufresne, Ezequiel Garcia, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On 2020-07-10 23:49, Nicolas Dufresne wrote:
> Le vendredi 10 juillet 2020 à 09:25 -0300, Ezequiel Garcia a écrit :
>> +Nicolas
>>
>> On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
>>> On Fri, 10 Jul 2020 08:50:28 -0300
>>> Ezequiel Garcia <ezequiel@collabora.com> wrote:
>>>
>>>> On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
>>>>> On Fri, 10 Jul 2020 01:21:07 -0300
>>>>> Ezequiel Garcia <ezequiel@collabora.com> wrote:
>>>>>   
>>>>>> Hello Jonas,
>>>>>>
>>>>>> In the context of the uAPI cleanup,
>>>>>> I'm revisiting this patch.
>>>>>>
>>>>>> On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
>>>>>>> Add DPB entry flags to help indicate when a reference frame is a
>>>>>>> field picture
>>>>>>> and how the DPB entry is referenced, top or bottom field or full
>>>>>>> frame.
>>>>>>>
>>>>>>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>>>>>>> ---
>>>>>>>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>>>>>>>  include/media/h264-ctrls.h                       |  4 ++++
>>>>>>>  2 files changed, 16 insertions(+)
>>>>>>>
>>>>>>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>> b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>> index bc5dd8e76567..eb6c32668ad7 100644
>>>>>>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>> @@ -2022,6 +2022,18 @@ enum
>>>>>>> v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>>>>>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>>>>>>>        - 0x00000004
>>>>>>>        - The DPB entry is a long term reference frame
>>>>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
>>>>>>> +      - 0x00000008
>>>>>>> +      - The DPB entry is a field picture
>>>>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
>>>>>>> +      - 0x00000010
>>>>>>> +      - The DPB entry is a top field reference
>>>>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
>>>>>>> +      - 0x00000020
>>>>>>> +      - The DPB entry is a bottom field reference
>>>>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
>>>>>>> +      - 0x00000030
>>>>>>> +      - The DPB entry is a reference frame
>>>>>>>  
>>>>>>>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>>>>>>>      Specifies the decoding mode to use. Currently exposes slice-
>>>>>>> based and
>>>>>>> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
>>>>>>> index e877bf1d537c..76020ebd1e6c 100644
>>>>>>> --- a/include/media/h264-ctrls.h
>>>>>>> +++ b/include/media/h264-ctrls.h
>>>>>>> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>>>>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>>>>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>>>>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
>>>>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
>>>>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
>>>>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
>>>>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
>>>>>>>      
>>>>>>
>>>>>> I've been going thru the H264 spec and I'm unsure,
>>>>>> are all these flags semantically needed?
>>>>>>
>>>>>> For instance, if one of REF_BOTTOM or REF_TOP (or both)
>>>>>> are set, doesn't that indicate it's a field picture?
>>>>>>
>>>>>> Or conversely, if neither REF_BOTTOM or REF_TOP are set,
>>>>>> then it's a frame picture?  
>>>>>
>>>>> I think that's what I was trying to do here [1]
>>>>>
>>>>> [1]https://patchwork.kernel.org/patch/11392095/  
>>>>
>>>> Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
>>>>
>>>> If I understand correctly, the DPB can contain:
>>>>
>>>> * frames (FLAG_FIELD not set)
>>>> * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
>>>> * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
>>>
>>> Well, my understand is that, if the buffer contains both a TOP and
>>> BOTTOM field, it actually becomes a full frame, so you actually have
>>> those cases:
>>>
>>> * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
>>>   decoded buffer can become of frame if it's complemented with the
>>>   missing field later during the decoding)
>>> * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
>>> * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
>>> * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination
> 
> Let's admit, while this work, it's odd. Can we just move to that instewad ?
> 
>   FLAG_TOP_FIELD
>   FLAG_BOTTOM_FIELD
>   FLAG_FRAME = (FLAG_TOP_FIELD | FLAG_BOTTOM_FIELD)
> 
> So it can be used as a flag, but also is a proper enum and there is no longer an
> invalid combination.
>   
>>>
>>> but I might be wrong.

There seems to be some misunderstanding here, the top/bottom flagging should
not be used to describe if the picture is a field, field pair or frame, it
should be used to flag if a frame or the top and/or bottom field (in case of
a field pair) is "used for short-term reference".

FLAG_TOP_REF
FLAG_BOTTOM_REF
FLAG_FRAME_REF = (FLAG_TOP_REF | FLAG_BOTTOM_REF)

Would be a more appropriate naming.

The FIELD_PIC flag would then be used to describe if the picture is a
reference frame or a complementary reference field pair.

As described in hantro h264 driver [1] the MV buffer is split in two
for field encoded frames, and I guess the rkvdec block does something
similar and therefore the HW blocks probably needs to know if the reference
picture is a reference frame or a complementary reference field pair.
It should be possible to keep such state in driver but since such information
was easily available in ffmpeg and the driver being "stateless" using a flag
seamed like a good choice at the time.

Please note that I have not done any test without the "field pic" flagging
but both mpp and the imx/hantro reference code are configuring this bit.

[1] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/hantro/hantro_g1_h264_dec.c#n265

Regards,
Jonas

>>
>> Yes, perhaps that's correct. I was trying to think strictly
>> in terms of the H264 semantics, to define a clean interface.
>>
>> From the mpp code, looks like the above is enough for rkvdec
>> (although I haven't done any tests).
>>
>> Ezequiel
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-11 10:21                 ` Jonas Karlman
@ 2020-07-11 18:36                   ` Nicolas Dufresne
  2020-07-12 22:59                   ` Ezequiel Garcia
  1 sibling, 0 replies; 45+ messages in thread
From: Nicolas Dufresne @ 2020-07-11 18:36 UTC (permalink / raw)
  To: Jonas Karlman, Ezequiel Garcia, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 8063 bytes --]

Le samedi 11 juillet 2020 à 10:21 +0000, Jonas Karlman a écrit :
> On 2020-07-10 23:49, Nicolas Dufresne wrote:
> > Le vendredi 10 juillet 2020 à 09:25 -0300, Ezequiel Garcia a écrit :
> > > +Nicolas
> > > 
> > > On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
> > > > On Fri, 10 Jul 2020 08:50:28 -0300
> > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > 
> > > > > On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > > > > > On Fri, 10 Jul 2020 01:21:07 -0300
> > > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > > >   
> > > > > > > Hello Jonas,
> > > > > > > 
> > > > > > > In the context of the uAPI cleanup,
> > > > > > > I'm revisiting this patch.
> > > > > > > 
> > > > > > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > > > > > Add DPB entry flags to help indicate when a reference frame is a
> > > > > > > > field picture
> > > > > > > > and how the DPB entry is referenced, top or bottom field or full
> > > > > > > > frame.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > > > > > ---
> > > > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > > > > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > > > > > >  2 files changed, 16 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > @@ -2022,6 +2022,18 @@ enum
> > > > > > > > v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > > > > > >        - 0x00000004
> > > > > > > >        - The DPB entry is a long term reference frame
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > > > > > +      - 0x00000008
> > > > > > > > +      - The DPB entry is a field picture
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > > > > > +      - 0x00000010
> > > > > > > > +      - The DPB entry is a top field reference
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > > > > > +      - 0x00000020
> > > > > > > > +      - The DPB entry is a bottom field reference
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > > > > > +      - 0x00000030
> > > > > > > > +      - The DPB entry is a reference frame
> > > > > > > >  
> > > > > > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > > > > > >      Specifies the decoding mode to use. Currently exposes slice-
> > > > > > > > based and
> > > > > > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > > > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > > > > > --- a/include/media/h264-ctrls.h
> > > > > > > > +++ b/include/media/h264-ctrls.h
> > > > > > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > > > > > >      
> > > > > > > 
> > > > > > > I've been going thru the H264 spec and I'm unsure,
> > > > > > > are all these flags semantically needed?
> > > > > > > 
> > > > > > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > > > > > are set, doesn't that indicate it's a field picture?
> > > > > > > 
> > > > > > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > > > > > then it's a frame picture?  
> > > > > > 
> > > > > > I think that's what I was trying to do here [1]
> > > > > > 
> > > > > > [1]https://patchwork.kernel.org/patch/11392095/  
> > > > > 
> > > > > Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> > > > > 
> > > > > If I understand correctly, the DPB can contain:
> > > > > 
> > > > > * frames (FLAG_FIELD not set)
> > > > > * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> > > > > * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
> > > > 
> > > > Well, my understand is that, if the buffer contains both a TOP and
> > > > BOTTOM field, it actually becomes a full frame, so you actually have
> > > > those cases:
> > > > 
> > > > * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
> > > >   decoded buffer can become of frame if it's complemented with the
> > > >   missing field later during the decoding)
> > > > * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
> > > > * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
> > > > * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination
> > 
> > Let's admit, while this work, it's odd. Can we just move to that instewad ?
> > 
> >   FLAG_TOP_FIELD
> >   FLAG_BOTTOM_FIELD
> >   FLAG_FRAME = (FLAG_TOP_FIELD | FLAG_BOTTOM_FIELD)
> > 
> > So it can be used as a flag, but also is a proper enum and there is no longer an
> > invalid combination.
> >   
> > > > but I might be wrong.
> 
> There seems to be some misunderstanding here, the top/bottom flagging should
> not be used to describe if the picture is a field, field pair or frame, it
> should be used to flag if a frame or the top and/or bottom field (in case of
> a field pair) is "used for short-term reference".
> 
> FLAG_TOP_REF
> FLAG_BOTTOM_REF
> FLAG_FRAME_REF = (FLAG_TOP_REF | FLAG_BOTTOM_REF)
> 
> Would be a more appropriate naming.

It's a subtle nuance, but could work.

The reason I referred to it like this is because in gstreamer-vaapi,
this information is deduced from picture->structure flags (I believe
it's inspired from JM reference decoder). This structure is updated
when a specific field has been decoded. So it effectively represent
which field of that picture are valid/decoded, and the combination of
this picture being reference and that flag is the only state used to
communicate that information. The real use for this is for the case we
have lost a field. A missing reference picture can then be detected.

So in gstreamer-vaapi, the case where you have both top/bottom field of
a reference being decoded, but only one of the field marked for
reference in the DPB does not exist. I don't know if that really exist
in H.264.

> 
> The FIELD_PIC flag would then be used to describe if the picture is a
> reference frame or a complementary reference field pair.
> 
> As described in hantro h264 driver [1] the MV buffer is split in two
> for field encoded frames, and I guess the rkvdec block does something
> similar and therefore the HW blocks probably needs to know if the reference
> picture is a reference frame or a complementary reference field pair.
> It should be possible to keep such state in driver but since such information
> was easily available in ffmpeg and the driver being "stateless" using a flag
> seamed like a good choice at the time.
> 
> Please note that I have not done any test without the "field pic" flagging
> but both mpp and the imx/hantro reference code are configuring this bit.
> 
> [1] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/hantro/hantro_g1_h264_dec.c#n265
> 
> Regards,
> Jonas
> 
> > > Yes, perhaps that's correct. I was trying to think strictly
> > > in terms of the H264 semantics, to define a clean interface.
> > > 
> > > From the mpp code, looks like the above is enough for rkvdec
> > > (although I haven't done any tests).
> > > 
> > > Ezequiel
> > > 
> > > 
> > > 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-11 10:21                 ` Jonas Karlman
  2020-07-11 18:36                   ` Nicolas Dufresne
@ 2020-07-12 22:59                   ` Ezequiel Garcia
  2020-07-14 16:04                     ` Nicolas Dufresne
  1 sibling, 1 reply; 45+ messages in thread
From: Ezequiel Garcia @ 2020-07-12 22:59 UTC (permalink / raw)
  To: Jonas Karlman, Nicolas Dufresne, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

On Sat, 2020-07-11 at 10:21 +0000, Jonas Karlman wrote:
> On 2020-07-10 23:49, Nicolas Dufresne wrote:
> > Le vendredi 10 juillet 2020 à 09:25 -0300, Ezequiel Garcia a écrit :
> > > +Nicolas
> > > 
> > > On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
> > > > On Fri, 10 Jul 2020 08:50:28 -0300
> > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > 
> > > > > On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > > > > > On Fri, 10 Jul 2020 01:21:07 -0300
> > > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > > >   
> > > > > > > Hello Jonas,
> > > > > > > 
> > > > > > > In the context of the uAPI cleanup,
> > > > > > > I'm revisiting this patch.
> > > > > > > 
> > > > > > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > > > > > Add DPB entry flags to help indicate when a reference frame is a
> > > > > > > > field picture
> > > > > > > > and how the DPB entry is referenced, top or bottom field or full
> > > > > > > > frame.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > > > > > ---
> > > > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > > > > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > > > > > >  2 files changed, 16 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > @@ -2022,6 +2022,18 @@ enum
> > > > > > > > v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > > > > > >        - 0x00000004
> > > > > > > >        - The DPB entry is a long term reference frame
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > > > > > +      - 0x00000008
> > > > > > > > +      - The DPB entry is a field picture
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > > > > > +      - 0x00000010
> > > > > > > > +      - The DPB entry is a top field reference
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > > > > > +      - 0x00000020
> > > > > > > > +      - The DPB entry is a bottom field reference
> > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > > > > > +      - 0x00000030
> > > > > > > > +      - The DPB entry is a reference frame
> > > > > > > >  
> > > > > > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > > > > > >      Specifies the decoding mode to use. Currently exposes slice-
> > > > > > > > based and
> > > > > > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > > > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > > > > > --- a/include/media/h264-ctrls.h
> > > > > > > > +++ b/include/media/h264-ctrls.h
> > > > > > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > > > > > >      
> > > > > > > 
> > > > > > > I've been going thru the H264 spec and I'm unsure,
> > > > > > > are all these flags semantically needed?
> > > > > > > 
> > > > > > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > > > > > are set, doesn't that indicate it's a field picture?
> > > > > > > 
> > > > > > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > > > > > then it's a frame picture?  
> > > > > > 
> > > > > > I think that's what I was trying to do here [1]
> > > > > > 
> > > > > > [1]https://patchwork.kernel.org/patch/11392095/  
> > > > > 
> > > > > Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> > > > > 
> > > > > If I understand correctly, the DPB can contain:
> > > > > 
> > > > > * frames (FLAG_FIELD not set)
> > > > > * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> > > > > * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
> > > > 
> > > > Well, my understand is that, if the buffer contains both a TOP and
> > > > BOTTOM field, it actually becomes a full frame, so you actually have
> > > > those cases:
> > > > 
> > > > * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
> > > >   decoded buffer can become of frame if it's complemented with the
> > > >   missing field later during the decoding)
> > > > * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
> > > > * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
> > > > * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination
> > 
> > Let's admit, while this work, it's odd. Can we just move to that instewad ?
> > 
> >   FLAG_TOP_FIELD
> >   FLAG_BOTTOM_FIELD
> >   FLAG_FRAME = (FLAG_TOP_FIELD | FLAG_BOTTOM_FIELD)
> > 
> > So it can be used as a flag, but also is a proper enum and there is no longer an
> > invalid combination.
> >   
> > > > but I might be wrong.
> 
> There seems to be some misunderstanding here, the top/bottom flagging should
> not be used to describe if the picture is a field, field pair or frame, it
> should be used to flag if a frame or the top and/or bottom field (in case of
> a field pair) is "used for short-term reference".
> 

I'm not sure why "used for short-term reference" instead
of "used for reference".

> FLAG_TOP_REF
> FLAG_BOTTOM_REF
> FLAG_FRAME_REF = (FLAG_TOP_REF | FLAG_BOTTOM_REF)
> 
> Would be a more appropriate naming.
> 
> The FIELD_PIC flag would then be used to describe if the picture is a
> reference frame or a complementary reference field pair.
> 
> As described in hantro h264 driver [1] the MV buffer is split in two
> for field encoded frames, and I guess the rkvdec block does something
> similar and therefore the HW blocks probably needs to know if the reference
> picture is a reference frame or a complementary reference field pair.
> It should be possible to keep such state in driver but since such information
> was easily available in ffmpeg and the driver being "stateless" using a flag
> seamed like a good choice at the time.
> 
> Please note that I have not done any test without the "field pic" flagging
> but both mpp and the imx/hantro reference code are configuring this bit.
> 
> [1] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/hantro/hantro_g1_h264_dec.c#n265
> 

How about this:

#define V4L2_H264_DPB_ENTRY_FLAG_VALID          0x01
#define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE         0x02
#define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM      0x04
#define V4L2_H264_DPB_ENTRY_FLAG_FIELD          0x08

enum v4l2_h264_dpb_reference {
        V4L2_H264_DPB_TOP_REF = 0x1,
        V4L2_H264_DPB_BOTTOM_REF = 0x2,
        V4L2_H264_DPB_FRAME_REF = 0x3,
};

With the following semantics (which should be
specified in the documentation):

* VALID: non-empty DPB entry.
* ACTIVE: picture is marked as "used for reference" (short-term or long-term).
* LONG_TERM: picture is marked as "used for long-term".
* FIELD: picture is a single field, or a complementary field pair. 

The v4l2_h264_dpb_reference enum would flag which
of the fields as used for reference.

This enum seems less ambiguous and easier to use for both
drivers and applications.

I am not exactly sure why a driver would ever need to
configure an "unused for reference" decoded picture
(i.e. VALID=1, ACTIVE=0), but I guess it's just clearer
to include this in the interface.

Thanks,
Ezequiel

> Regards,
> Jonas
> 
> > > Yes, perhaps that's correct. I was trying to think strictly
> > > in terms of the H264 semantics, to define a clean interface.
> > > 
> > > From the mpp code, looks like the above is enough for rkvdec
> > > (although I haven't done any tests).
> > > 
> > > Ezequiel
> > > 
> > > 
> > > 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags
  2020-07-12 22:59                   ` Ezequiel Garcia
@ 2020-07-14 16:04                     ` Nicolas Dufresne
  0 siblings, 0 replies; 45+ messages in thread
From: Nicolas Dufresne @ 2020-07-14 16:04 UTC (permalink / raw)
  To: Ezequiel Garcia, Jonas Karlman, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Philipp Zabel,
	Paul Kocialkowski, linux-media, linux-rockchip, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9162 bytes --]

Le dimanche 12 juillet 2020 à 19:59 -0300, Ezequiel Garcia a écrit :
> On Sat, 2020-07-11 at 10:21 +0000, Jonas Karlman wrote:
> > On 2020-07-10 23:49, Nicolas Dufresne wrote:
> > > Le vendredi 10 juillet 2020 à 09:25 -0300, Ezequiel Garcia a écrit :
> > > > +Nicolas
> > > > 
> > > > On Fri, 2020-07-10 at 14:05 +0200, Boris Brezillon wrote:
> > > > > On Fri, 10 Jul 2020 08:50:28 -0300
> > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > > 
> > > > > > On Fri, 2020-07-10 at 10:13 +0200, Boris Brezillon wrote:
> > > > > > > On Fri, 10 Jul 2020 01:21:07 -0300
> > > > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > > > >   
> > > > > > > > Hello Jonas,
> > > > > > > > 
> > > > > > > > In the context of the uAPI cleanup,
> > > > > > > > I'm revisiting this patch.
> > > > > > > > 
> > > > > > > > On Sun, 2019-09-01 at 12:45 +0000, Jonas Karlman wrote:  
> > > > > > > > > Add DPB entry flags to help indicate when a reference frame is a
> > > > > > > > > field picture
> > > > > > > > > and how the DPB entry is referenced, top or bottom field or full
> > > > > > > > > frame.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> > > > > > > > > ---
> > > > > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> > > > > > > > >  include/media/h264-ctrls.h                       |  4 ++++
> > > > > > > > >  2 files changed, 16 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > > b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > > index bc5dd8e76567..eb6c32668ad7 100644
> > > > > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > > @@ -2022,6 +2022,18 @@ enum
> > > > > > > > > v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > > > > >      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> > > > > > > > >        - 0x00000004
> > > > > > > > >        - The DPB entry is a long term reference frame
> > > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> > > > > > > > > +      - 0x00000008
> > > > > > > > > +      - The DPB entry is a field picture
> > > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> > > > > > > > > +      - 0x00000010
> > > > > > > > > +      - The DPB entry is a top field reference
> > > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> > > > > > > > > +      - 0x00000020
> > > > > > > > > +      - The DPB entry is a bottom field reference
> > > > > > > > > +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> > > > > > > > > +      - 0x00000030
> > > > > > > > > +      - The DPB entry is a reference frame
> > > > > > > > >  
> > > > > > > > >  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> > > > > > > > >      Specifies the decoding mode to use. Currently exposes slice-
> > > > > > > > > based and
> > > > > > > > > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > > > > > > > > index e877bf1d537c..76020ebd1e6c 100644
> > > > > > > > > --- a/include/media/h264-ctrls.h
> > > > > > > > > +++ b/include/media/h264-ctrls.h
> > > > > > > > > @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> > > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> > > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> > > > > > > > >  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> > > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> > > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> > > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> > > > > > > > > +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> > > > > > > > >      
> > > > > > > > 
> > > > > > > > I've been going thru the H264 spec and I'm unsure,
> > > > > > > > are all these flags semantically needed?
> > > > > > > > 
> > > > > > > > For instance, if one of REF_BOTTOM or REF_TOP (or both)
> > > > > > > > are set, doesn't that indicate it's a field picture?
> > > > > > > > 
> > > > > > > > Or conversely, if neither REF_BOTTOM or REF_TOP are set,
> > > > > > > > then it's a frame picture?  
> > > > > > > 
> > > > > > > I think that's what I was trying to do here [1]
> > > > > > > 
> > > > > > > [1]https://patchwork.kernel.org/patch/11392095/  
> > > > > > 
> > > > > > Right. Aren't we missing a DPB_ENTRY_FLAG_TOP_FIELD?
> > > > > > 
> > > > > > If I understand correctly, the DPB can contain:
> > > > > > 
> > > > > > * frames (FLAG_FIELD not set)
> > > > > > * a field pair, with a single field (FLAG_FIELD and either TOP or BOTTOM).
> > > > > > * a field pair, with boths fields (FLAG_FIELD and both TOP or BOTTOM).
> > > > > 
> > > > > Well, my understand is that, if the buffer contains both a TOP and
> > > > > BOTTOM field, it actually becomes a full frame, so you actually have
> > > > > those cases:
> > > > > 
> > > > > * FLAG_FIELD not set: this a frame (note that a TOP/BOTTOM field
> > > > >   decoded buffer can become of frame if it's complemented with the
> > > > >   missing field later during the decoding)
> > > > > * FLAG_FIELD set + BOTTOM_FIELD not set: this is a TOP field
> > > > > * FLAG_FIELD set + BOTTOM_FIELD set: this is a BOTTOM field
> > > > > * FLAG_FIELD not set + BOTTOM_FIELD set: invalid combination
> > > 
> > > Let's admit, while this work, it's odd. Can we just move to that instewad ?
> > > 
> > >   FLAG_TOP_FIELD
> > >   FLAG_BOTTOM_FIELD
> > >   FLAG_FRAME = (FLAG_TOP_FIELD | FLAG_BOTTOM_FIELD)
> > > 
> > > So it can be used as a flag, but also is a proper enum and there is no longer an
> > > invalid combination.
> > >   
> > > > > but I might be wrong.
> > 
> > There seems to be some misunderstanding here, the top/bottom flagging should
> > not be used to describe if the picture is a field, field pair or frame, it
> > should be used to flag if a frame or the top and/or bottom field (in case of
> > a field pair) is "used for short-term reference".
> > 
> 
> I'm not sure why "used for short-term reference" instead
> of "used for reference".
> 
> > FLAG_TOP_REF
> > FLAG_BOTTOM_REF
> > FLAG_FRAME_REF = (FLAG_TOP_REF | FLAG_BOTTOM_REF)
> > 
> > Would be a more appropriate naming.
> > 
> > The FIELD_PIC flag would then be used to describe if the picture is a
> > reference frame or a complementary reference field pair.
> > 
> > As described in hantro h264 driver [1] the MV buffer is split in two
> > for field encoded frames, and I guess the rkvdec block does something
> > similar and therefore the HW blocks probably needs to know if the reference
> > picture is a reference frame or a complementary reference field pair.
> > It should be possible to keep such state in driver but since such information
> > was easily available in ffmpeg and the driver being "stateless" using a flag
> > seamed like a good choice at the time.
> > 
> > Please note that I have not done any test without the "field pic" flagging
> > but both mpp and the imx/hantro reference code are configuring this bit.
> > 
> > [1] https://git.linuxtv.org/media_tree.git/tree/drivers/staging/media/hantro/hantro_g1_h264_dec.c#n265
> > 
> 
> How about this:
> 
> #define V4L2_H264_DPB_ENTRY_FLAG_VALID          0x01
> #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE         0x02
> #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM      0x04
> #define V4L2_H264_DPB_ENTRY_FLAG_FIELD          0x08
> 
> enum v4l2_h264_dpb_reference {
>         V4L2_H264_DPB_TOP_REF = 0x1,
>         V4L2_H264_DPB_BOTTOM_REF = 0x2,
>         V4L2_H264_DPB_FRAME_REF = 0x3,
> };
> 
> With the following semantics (which should be
> specified in the documentation):
> 
> * VALID: non-empty DPB entry.
> * ACTIVE: picture is marked as "used for reference" (short-term or long-term).
> * LONG_TERM: picture is marked as "used for long-term".
> * FIELD: picture is a single field, or a complementary field pair. 
> 
> The v4l2_h264_dpb_reference enum would flag which
> of the fields as used for reference.
> 
> This enum seems less ambiguous and easier to use for both
> drivers and applications.
> 
> I am not exactly sure why a driver would ever need to
> configure an "unused for reference" decoded picture
> (i.e. VALID=1, ACTIVE=0), but I guess it's just clearer
> to include this in the interface.

Indeed, that might have leaked from what we do in userspace, were we
need to track this. I haven't seen anything that would do concealment
or anything anyway.

I don't have definitive opinion on the above, but I think it's getting
in the right direction.

> 
> Thanks,
> Ezequiel
> 
> > Regards,
> > Jonas
> > 
> > > > Yes, perhaps that's correct. I was trying to think strictly
> > > > in terms of the H264 semantics, to define a clean interface.
> > > > 
> > > > From the mpp code, looks like the above is enough for rkvdec
> > > > (although I haven't done any tests).
> > > > 
> > > > Ezequiel
> > > > 
> > > > 
> > > > 
> 
> 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-07-14 16:04 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-01 12:42 [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Jonas Karlman
2019-09-01 12:45 ` [PATCH 01/12] media: hantro: Fix H264 max frmsize supported on RK3288 Jonas Karlman
2019-09-04 13:07   ` Ezequiel Garcia
2019-09-09 19:25     ` Jonas Karlman
     [not found] ` <20190901124531.23645-1-jonas@kwiboo.se>
2019-09-01 12:45   ` [PATCH 02/12] media: hantro: Do not reorder H264 scaling list Jonas Karlman
2019-09-02 14:00     ` Philipp Zabel
2019-09-02 16:18       ` Jonas Karlman
2019-09-03  7:54         ` Jonas Karlman
2019-09-03 12:53           ` Philipp Zabel
2019-09-03  9:56         ` Philipp Zabel
2019-09-10 10:14         ` Ezequiel Garcia
2019-09-01 12:45   ` [PATCH 03/12] media: hantro: Fix H264 motion vector buffer offset Jonas Karlman
2019-09-03 10:58     ` Philipp Zabel
2019-09-03 20:13       ` Jonas Karlman
2019-09-10 10:18     ` Ezequiel Garcia
2019-09-10 11:34     ` Ezequiel Garcia
2019-09-01 12:45   ` [PATCH 05/12] media: hantro: Remove now unused H264 pic_size Jonas Karlman
2019-09-01 12:45   ` [PATCH 04/12] media: hantro: Reduce H264 extra space for motion vectors Jonas Karlman
2019-09-01 12:45   ` [PATCH 06/12] media: hantro: Set H264 FIELDPIC_FLAG_E flag correctly Jonas Karlman
2019-09-01 12:45   ` [RFC 08/12] media: hantro: Fix H264 decoding of field encoded content Jonas Karlman
2019-09-03 13:21     ` Philipp Zabel
2019-09-03 14:02       ` Jonas Karlman
2019-09-03 15:01         ` Philipp Zabel
2019-09-03 19:47           ` Jonas Karlman
2019-09-01 12:45   ` [RFC 07/12] media: uapi: h264: Add DPB entry field reference flags Jonas Karlman
2020-07-10  4:21     ` Ezequiel Garcia
2020-07-10  8:13       ` Boris Brezillon
2020-07-10  8:48         ` Jonas Karlman
2020-07-10 12:18           ` Ezequiel Garcia
2020-07-10 11:50         ` Ezequiel Garcia
2020-07-10 12:05           ` Boris Brezillon
2020-07-10 12:25             ` Ezequiel Garcia
2020-07-10 21:49               ` Nicolas Dufresne
2020-07-11 10:21                 ` Jonas Karlman
2020-07-11 18:36                   ` Nicolas Dufresne
2020-07-12 22:59                   ` Ezequiel Garcia
2020-07-14 16:04                     ` Nicolas Dufresne
2019-09-01 12:45   ` [RFC 09/12] media: hantro: Refactor G1 H264 code Jonas Karlman
2019-09-01 12:45   ` [RFC 10/12] media: hantro: Add support for H264 decoding on RK3399 Jonas Karlman
2019-09-02 11:46     ` Hans Verkuil
2019-09-02 15:25       ` Jonas Karlman
2019-09-01 12:45   ` [RFC 11/12] media: hantro: Enable " Jonas Karlman
2019-09-01 12:45   ` [RFC 12/12] media: hantro: Enable H264 decoding on RK3328 Jonas Karlman
2019-09-02 13:02 ` [PATCH RFC 00/12] media: hantro: H264 fixes and improvements Ezequiel Garcia
2019-09-02 16:28   ` Jonas Karlman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).