* [PATCH v3 00/15] media: mtk-vcodec: support for MT8183 decoder
@ 2021-02-26 10:01 ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Please ignore the v2 of this series since I neglected a final check on
it and realized later it did not compile. >_<

This series adds support for the stateless API into mtk-vcodec, by first
separating the stateful ops into their own source file, and introducing
a new set of ops suitable for stateless decoding. As such, support for
stateful decoders should remain completely unaffected.
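
To illustrate the split described above, here is a minimal user-space
sketch, not the driver code itself. All names in it (demo_ctx,
demo_dec_ops, demo_open, demo_run) are hypothetical and only stand in for
the idea of common code calling into a per-variant ops table:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the decoder context. */
struct demo_ctx {
	int params_ready;	/* set by the variant's init hook */
	int jobs_done;		/* bumped by the variant's worker */
	int uses_requests;	/* 1 when the variant is stateless */
};

/* Hypothetical per-variant ops table (assumption, not the real struct). */
struct demo_dec_ops {
	void (*init_params)(struct demo_ctx *ctx);
	void (*worker)(struct demo_ctx *ctx);
};

static void stateful_init(struct demo_ctx *ctx)
{
	ctx->params_ready = 1;
	ctx->uses_requests = 0;
}

static void stateless_init(struct demo_ctx *ctx)
{
	ctx->params_ready = 1;
	ctx->uses_requests = 1;
}

/* Both variants share a trivial worker in this sketch. */
static void demo_worker(struct demo_ctx *ctx)
{
	ctx->jobs_done++;
}

static const struct demo_dec_ops demo_stateful_ops = {
	.init_params = stateful_init,
	.worker = demo_worker,
};

static const struct demo_dec_ops demo_stateless_ops = {
	.init_params = stateless_init,
	.worker = demo_worker,
};

/* Common code: never names a variant, only calls through the table. */
static void demo_open(struct demo_ctx *ctx, const struct demo_dec_ops *ops)
{
	assert(ops && ops->init_params && ops->worker);
	ops->init_params(ctx);
}

static void demo_run(struct demo_ctx *ctx, const struct demo_dec_ops *ops)
{
	ops->worker(ctx);
}
```

In this sketch the common path is identical for both variants; only the
table passed to demo_open() and demo_run() differs, which is why adding a
second variant leaves the first one untouched.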

This series has been tested with both MT8183 and MT8173. Decoding was
working for both chips, and in the case of MT8173 no regression has been
spotted.

Patches 1-9 add MT8183 support to the decoder using the stateless API.
MT8183 only supports H.264 acceleration.

Patches 10-15 are follow-ups that further improve compliance for the
decoder and encoder, by fixing support for commands on both. Patch 11
also makes sure that supported H.264 profiles are exported on MT8173.

Changes since v2:
* Actually compiles (duh),
* Add follow-up patches fixing support for START/STOP commands for the
  encoder, and stateful decoder.

Alexandre Courbot (8):
  media: mtk-vcodec: vdec: handle firmware version field
  media: mtk-vcodec: support version 2 of decoder firmware ABI
  media: add Mediatek's MM21 format
  dt-bindings: media: document mediatek,mt8183-vcodec-dec
  media: mtk-vcodec: vdec: use helpers in VIDIOC_(TRY_)DECODER_CMD
  media: mtk-vcodec: vdec: clamp OUTPUT resolution to hardware limits
  media: mtk-vcodec: make flush buffer reusable by encoder
  media: mtk-vcodec: venc: support START and STOP commands

Hirokazu Honda (1):
  media: mtk-vcodec: vdec: Support H264 profile control

Hsin-Yi Wang (1):
  media: mtk-vcodec: venc: make sure buffer exists in list before
    removing

Yunfei Dong (5):
  media: mtk-vcodec: vdec: move stateful ops into their own file
  media: mtk-vcodec: vdec: support stateless API
  media: mtk-vcodec: vdec: support stateless H.264 decoding
  media: mtk-vcodec: vdec: add media device if using stateless api
  media: mtk-vcodec: enable MT8183 decoder

 .../bindings/media/mediatek-vcodec.txt        |   1 +
 .../media/v4l/pixfmt-reserved.rst             |   7 +
 drivers/media/platform/Kconfig                |   2 +
 drivers/media/platform/mtk-vcodec/Makefile    |   3 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 800 +++--------------
 .../platform/mtk-vcodec/mtk_vcodec_dec.h      |  30 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  66 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      | 647 ++++++++++++++
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 +++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  58 +-
 .../platform/mtk-vcodec/mtk_vcodec_enc.c      | 135 ++-
 .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |   4 +
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |  23 +-
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   |  43 +-
 .../media/platform/mtk-vcodec/vdec_vpu_if.h   |   5 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |   1 +
 include/uapi/linux/videodev2.h                |   1 +
 20 files changed, 2360 insertions(+), 704 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c

-- 
2.30.1.766.gb4fecdf3b7-goog


* [PATCH v3 01/15] media: mtk-vcodec: vdec: move stateful ops into their own file
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

From: Yunfei Dong <yunfei.dong@mediatek.com>

We are planning to add support for stateless decoders to this driver.
Part of the driver will be shared between stateful and stateless
codecs, but a few ops need to be specialized for both. Extract the
stateful part of the driver and move it into its own file, accessible
through ops that the common driver parts can call.

This patch only moves code around and introduces a set of abstractions;
the behavior of the driver should not be changed in any way.
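
As a rough illustration of one such abstraction, the sketch below shows a
format lookup that walks a per-platform table instead of a file-local
array. It is a stand-alone approximation under stated assumptions: the
names demo_fmt, demo_pdata, DEMO_FOURCC and both helpers are hypothetical,
and the table contents are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical four-character-code helper, little-endian packing. */
#define DEMO_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical, trimmed-down format descriptor. */
struct demo_fmt {
	uint32_t fourcc;
	unsigned int num_planes;
};

/* Hypothetical platform data: the format table now belongs to the variant. */
struct demo_pdata {
	const struct demo_fmt *formats;
	size_t num_formats;
	const struct demo_fmt *default_out_fmt;
};

static const struct demo_fmt demo_formats[] = {
	{ DEMO_FOURCC('H', '2', '6', '4'), 1 },
	{ DEMO_FOURCC('V', 'P', '8', '0'), 1 },
};

static const struct demo_pdata demo_stateful = {
	.formats = demo_formats,
	.num_formats = sizeof(demo_formats) / sizeof(demo_formats[0]),
	.default_out_fmt = &demo_formats[0],
};

/* Return the matching table entry, or NULL when the fourcc is unknown. */
static const struct demo_fmt *demo_find_format(uint32_t fourcc,
					       const struct demo_pdata *pdata)
{
	size_t k;

	assert(pdata);
	for (k = 0; k < pdata->num_formats; k++) {
		if (pdata->formats[k].fourcc == fourcc)
			return &pdata->formats[k];
	}
	return NULL;
}

/* Fall back to the variant's default when the request is not supported. */
static const struct demo_fmt *demo_find_format_or_default(uint32_t fourcc,
					const struct demo_pdata *pdata)
{
	const struct demo_fmt *fmt = demo_find_format(fourcc, pdata);

	return fmt ? fmt : pdata->default_out_fmt;
}
```

Because the table and its default travel with the platform data, the
common lookup code does not need to know which variant it is serving.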

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 699 ++----------------
 .../platform/mtk-vcodec/mtk_vcodec_dec.h      |  19 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  10 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      | 634 ++++++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  41 +
 6 files changed, 759 insertions(+), 645 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 4618d43dbbc8..9c3cbb5b800e 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -11,6 +11,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
 		mtk_vcodec_dec.o \
+		mtk_vcodec_dec_stateful.o \
 		mtk_vcodec_dec_pm.o \
 
 mtk-vcodec-enc-y := venc/venc_vp8_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 56d86e59421e..4a91d294002b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -16,68 +16,17 @@
 #include "vdec_drv_if.h"
 #include "mtk_vcodec_dec_pm.h"
 
-#define OUT_FMT_IDX	0
-#define CAP_FMT_IDX	3
-
-#define MTK_VDEC_MIN_W	64U
-#define MTK_VDEC_MIN_H	64U
 #define DFT_CFG_WIDTH	MTK_VDEC_MIN_W
 #define DFT_CFG_HEIGHT	MTK_VDEC_MIN_H
 
-static const struct mtk_video_fmt mtk_video_formats[] = {
-	{
-		.fourcc = V4L2_PIX_FMT_H264,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP8,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP9,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_MT21C,
-		.type = MTK_FMT_FRAME,
-		.num_planes = 2,
-	},
-};
-
-static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
-	{
-		.fourcc	= V4L2_PIX_FMT_H264,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-	{
-		.fourcc	= V4L2_PIX_FMT_VP8,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP9,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-};
-
-#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
-#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
-
-static const struct mtk_video_fmt *mtk_vdec_find_format(struct v4l2_format *f)
+static const struct mtk_video_fmt *mtk_vdec_find_format(struct v4l2_format *f,
+				   const struct mtk_vcodec_dec_pdata *dec_pdata)
 {
 	const struct mtk_video_fmt *fmt;
 	unsigned int k;
 
-	for (k = 0; k < NUM_FORMATS; k++) {
-		fmt = &mtk_video_formats[k];
+	for (k = 0; k < dec_pdata->num_formats; k++) {
+		fmt = &dec_pdata->vdec_formats[k];
 		if (fmt->fourcc == f->fmt.pix_mp.pixelformat)
 			return fmt;
 	}
@@ -94,393 +43,6 @@ static struct mtk_q_data *mtk_vdec_get_q_data(struct mtk_vcodec_ctx *ctx,
 	return &ctx->q_data[MTK_Q_DATA_DST];
 }
 
-/*
- * This function tries to clean all display buffers, the buffers will return
- * in display order.
- * Note the buffers returned from codec driver may still be in driver's
- * reference list.
- */
-static struct vb2_buffer *get_display_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vdec_fb *disp_frame_buffer = NULL;
-	struct mtk_video_dec_buf *dstbuf;
-	struct vb2_v4l2_buffer *vb;
-
-	mtk_v4l2_debug(3, "[%d]", ctx->id);
-	if (vdec_if_get_param(ctx,
-			GET_PARAM_DISP_FRAME_BUFFER,
-			&disp_frame_buffer)) {
-		mtk_v4l2_err("[%d]Cannot get param : GET_PARAM_DISP_FRAME_BUFFER",
-			ctx->id);
-		return NULL;
-	}
-
-	if (disp_frame_buffer == NULL) {
-		mtk_v4l2_debug(3, "No display frame buffer");
-		return NULL;
-	}
-
-	dstbuf = container_of(disp_frame_buffer, struct mtk_video_dec_buf,
-				frame_buffer);
-	vb = &dstbuf->m2m_buf.vb;
-	mutex_lock(&ctx->lock);
-	if (dstbuf->used) {
-		vb2_set_plane_payload(&vb->vb2_buf, 0,
-				      ctx->picinfo.fb_sz[0]);
-		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
-			vb2_set_plane_payload(&vb->vb2_buf, 1,
-					      ctx->picinfo.fb_sz[1]);
-
-		mtk_v4l2_debug(2,
-				"[%d]status=%x queue id=%d to done_list %d",
-				ctx->id, disp_frame_buffer->status,
-				vb->vb2_buf.index,
-				dstbuf->queued_in_vb2);
-
-		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
-		ctx->decoded_frame_cnt++;
-	}
-	mutex_unlock(&ctx->lock);
-	return &vb->vb2_buf;
-}
-
-/*
- * This function tries to clean all capture buffers that are not used as
- * reference buffers by codec driver any more
- * In this case, we need re-queue buffer to vb2 buffer if user space
- * already returns this buffer to v4l2 or this buffer is just the output of
- * previous sps/pps/resolution change decode, or do nothing if user
- * space still owns this buffer
- */
-static struct vb2_buffer *get_free_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct mtk_video_dec_buf *dstbuf;
-	struct vdec_fb *free_frame_buffer = NULL;
-	struct vb2_v4l2_buffer *vb;
-
-	if (vdec_if_get_param(ctx,
-				GET_PARAM_FREE_FRAME_BUFFER,
-				&free_frame_buffer)) {
-		mtk_v4l2_err("[%d] Error!! Cannot get param", ctx->id);
-		return NULL;
-	}
-	if (free_frame_buffer == NULL) {
-		mtk_v4l2_debug(3, " No free frame buffer");
-		return NULL;
-	}
-
-	mtk_v4l2_debug(3, "[%d] tmp_frame_addr = 0x%p",
-			ctx->id, free_frame_buffer);
-
-	dstbuf = container_of(free_frame_buffer, struct mtk_video_dec_buf,
-				frame_buffer);
-	vb = &dstbuf->m2m_buf.vb;
-
-	mutex_lock(&ctx->lock);
-	if (dstbuf->used) {
-		if ((dstbuf->queued_in_vb2) &&
-		    (dstbuf->queued_in_v4l2) &&
-		    (free_frame_buffer->status == FB_ST_FREE)) {
-			/*
-			 * After decode sps/pps or non-display buffer, we don't
-			 * need to return capture buffer to user space, but
-			 * just re-queue this capture buffer to vb2 queue.
-			 * This reduce overheads that dq/q unused capture
-			 * buffer. In this case, queued_in_vb2 = true.
-			 */
-			mtk_v4l2_debug(2,
-				"[%d]status=%x queue id=%d to rdy_queue %d",
-				ctx->id, free_frame_buffer->status,
-				vb->vb2_buf.index,
-				dstbuf->queued_in_vb2);
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
-		} else if (!dstbuf->queued_in_vb2 && dstbuf->queued_in_v4l2) {
-			/*
-			 * If buffer in v4l2 driver but not in vb2 queue yet,
-			 * and we get this buffer from free_list, it means
-			 * that codec driver do not use this buffer as
-			 * reference buffer anymore. We should q buffer to vb2
-			 * queue, so later work thread could get this buffer
-			 * for decode. In this case, queued_in_vb2 = false
-			 * means this buffer is not from previous decode
-			 * output.
-			 */
-			mtk_v4l2_debug(2,
-					"[%d]status=%x queue id=%d to rdy_queue",
-					ctx->id, free_frame_buffer->status,
-					vb->vb2_buf.index);
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
-			dstbuf->queued_in_vb2 = true;
-		} else {
-			/*
-			 * Codec driver do not need to reference this capture
-			 * buffer and this buffer is not in v4l2 driver.
-			 * Then we don't need to do any thing, just add log when
-			 * we need to debug buffer flow.
-			 * When this buffer q from user space, it could
-			 * directly q to vb2 buffer
-			 */
-			mtk_v4l2_debug(3, "[%d]status=%x err queue id=%d %d %d",
-					ctx->id, free_frame_buffer->status,
-					vb->vb2_buf.index,
-					dstbuf->queued_in_vb2,
-					dstbuf->queued_in_v4l2);
-		}
-		dstbuf->used = false;
-	}
-	mutex_unlock(&ctx->lock);
-	return &vb->vb2_buf;
-}
-
-static void clean_display_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vb2_buffer *framptr;
-
-	do {
-		framptr = get_display_buffer(ctx);
-	} while (framptr);
-}
-
-static void clean_free_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vb2_buffer *framptr;
-
-	do {
-		framptr = get_free_buffer(ctx);
-	} while (framptr);
-}
-
-static void mtk_vdec_queue_res_chg_event(struct mtk_vcodec_ctx *ctx)
-{
-	static const struct v4l2_event ev_src_ch = {
-		.type = V4L2_EVENT_SOURCE_CHANGE,
-		.u.src_change.changes =
-		V4L2_EVENT_SRC_CH_RESOLUTION,
-	};
-
-	mtk_v4l2_debug(1, "[%d]", ctx->id);
-	v4l2_event_queue_fh(&ctx->fh, &ev_src_ch);
-}
-
-static void mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
-{
-	bool res_chg;
-	int ret = 0;
-
-	ret = vdec_if_decode(ctx, NULL, NULL, &res_chg);
-	if (ret)
-		mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
-
-	clean_display_buffer(ctx);
-	clean_free_buffer(ctx);
-}
-
-static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
-				unsigned int pixelformat)
-{
-	const struct mtk_video_fmt *fmt;
-	struct mtk_q_data *dst_q_data;
-	unsigned int k;
-
-	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
-	for (k = 0; k < NUM_FORMATS; k++) {
-		fmt = &mtk_video_formats[k];
-		if (fmt->fourcc == pixelformat) {
-			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
-				dst_q_data->fmt->fourcc, pixelformat);
-			dst_q_data->fmt = fmt;
-			return;
-		}
-	}
-
-	mtk_v4l2_err("Cannot get fourcc(%d), using init value", pixelformat);
-}
-
-static int mtk_vdec_pic_info_update(struct mtk_vcodec_ctx *ctx)
-{
-	unsigned int dpbsize = 0;
-	int ret;
-
-	if (vdec_if_get_param(ctx,
-				GET_PARAM_PIC_INFO,
-				&ctx->last_decoded_picinfo)) {
-		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
-				ctx->id);
-		return -EINVAL;
-	}
-
-	if (ctx->last_decoded_picinfo.pic_w == 0 ||
-		ctx->last_decoded_picinfo.pic_h == 0 ||
-		ctx->last_decoded_picinfo.buf_w == 0 ||
-		ctx->last_decoded_picinfo.buf_h == 0) {
-		mtk_v4l2_err("Cannot get correct pic info");
-		return -EINVAL;
-	}
-
-	if (ctx->last_decoded_picinfo.cap_fourcc != ctx->picinfo.cap_fourcc &&
-		ctx->picinfo.cap_fourcc != 0)
-		mtk_vdec_update_fmt(ctx, ctx->picinfo.cap_fourcc);
-
-	if ((ctx->last_decoded_picinfo.pic_w == ctx->picinfo.pic_w) ||
-	    (ctx->last_decoded_picinfo.pic_h == ctx->picinfo.pic_h))
-		return 0;
-
-	mtk_v4l2_debug(1,
-			"[%d]-> new(%d,%d), old(%d,%d), real(%d,%d)",
-			ctx->id, ctx->last_decoded_picinfo.pic_w,
-			ctx->last_decoded_picinfo.pic_h,
-			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
-			ctx->last_decoded_picinfo.buf_w,
-			ctx->last_decoded_picinfo.buf_h);
-
-	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
-	if (dpbsize == 0)
-		mtk_v4l2_err("Incorrect dpb size, ret=%d", ret);
-
-	ctx->dpb_size = dpbsize;
-
-	return ret;
-}
-
-static void mtk_vdec_worker(struct work_struct *work)
-{
-	struct mtk_vcodec_ctx *ctx = container_of(work, struct mtk_vcodec_ctx,
-				decode_work);
-	struct mtk_vcodec_dev *dev = ctx->dev;
-	struct vb2_v4l2_buffer *src_buf, *dst_buf;
-	struct mtk_vcodec_mem buf;
-	struct vdec_fb *pfb;
-	bool res_chg = false;
-	int ret;
-	struct mtk_video_dec_buf *dst_buf_info, *src_buf_info;
-
-	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
-	if (src_buf == NULL) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_debug(1, "[%d] src_buf empty!!", ctx->id);
-		return;
-	}
-
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
-	if (dst_buf == NULL) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
-		return;
-	}
-
-	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
-	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
-
-	pfb = &dst_buf_info->frame_buffer;
-	pfb->base_y.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
-	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
-	pfb->base_y.size = ctx->picinfo.fb_sz[0];
-
-	pfb->base_c.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 1);
-	pfb->base_c.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 1);
-	pfb->base_c.size = ctx->picinfo.fb_sz[1];
-	pfb->status = 0;
-	mtk_v4l2_debug(3, "===>[%d] vdec_if_decode() ===>", ctx->id);
-
-	mtk_v4l2_debug(3,
-			"id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx",
-			dst_buf->vb2_buf.index, pfb,
-			pfb->base_y.va, &pfb->base_y.dma_addr,
-			&pfb->base_c.dma_addr, pfb->base_y.size);
-
-	if (src_buf_info->lastframe) {
-		mtk_v4l2_debug(1, "Got empty flush input buffer.");
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-
-		/* update dst buf status */
-		dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
-		mutex_lock(&ctx->lock);
-		dst_buf_info->used = false;
-		mutex_unlock(&ctx->lock);
-
-		vdec_if_decode(ctx, NULL, NULL, &res_chg);
-		clean_display_buffer(ctx);
-		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, 0);
-		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
-			vb2_set_plane_payload(&dst_buf->vb2_buf, 1, 0);
-		dst_buf->flags |= V4L2_BUF_FLAG_LAST;
-		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
-		clean_free_buffer(ctx);
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		return;
-	}
-	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
-	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
-	if (!buf.va) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_err("[%d] id=%d src_addr is NULL!!",
-				ctx->id, src_buf->vb2_buf.index);
-		return;
-	}
-	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
-			ctx->id, buf.va, &buf.dma_addr, buf.size, src_buf);
-	dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
-	dst_buf->timecode = src_buf->timecode;
-	mutex_lock(&ctx->lock);
-	dst_buf_info->used = true;
-	mutex_unlock(&ctx->lock);
-	src_buf_info->used = true;
-
-	ret = vdec_if_decode(ctx, &buf, pfb, &res_chg);
-
-	if (ret) {
-		mtk_v4l2_err(
-			" <===[%d], src_buf[%d] sz=0x%zx pts=%llu dst_buf[%d] vdec_if_decode() ret=%d res_chg=%d===>",
-			ctx->id,
-			src_buf->vb2_buf.index,
-			buf.size,
-			src_buf->vb2_buf.timestamp,
-			dst_buf->vb2_buf.index,
-			ret, res_chg);
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		if (ret == -EIO) {
-			mutex_lock(&ctx->lock);
-			src_buf_info->error = true;
-			mutex_unlock(&ctx->lock);
-		}
-		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
-	} else if (!res_chg) {
-		/*
-		 * we only return src buffer with VB2_BUF_STATE_DONE
-		 * when decode success without resolution change
-		 */
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
-	}
-
-	dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
-	clean_display_buffer(ctx);
-	clean_free_buffer(ctx);
-
-	if (!ret && res_chg) {
-		mtk_vdec_pic_info_update(ctx);
-		/*
-		 * On encountering a resolution change in the stream.
-		 * The driver must first process and decode all
-		 * remaining buffers from before the resolution change
-		 * point, so call flush decode here
-		 */
-		mtk_vdec_flush_decoder(ctx);
-		/*
-		 * After all buffers containing decoded frames from
-		 * before the resolution change point ready to be
-		 * dequeued on the CAPTURE queue, the driver sends a
-		 * V4L2_EVENT_SOURCE_CHANGE event for source change
-		 * type V4L2_EVENT_SRC_CH_RESOLUTION
-		 */
-		mtk_vdec_queue_res_chg_event(ctx);
-	}
-	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-}
-
 static int vidioc_try_decoder_cmd(struct file *file, void *priv,
 				struct v4l2_decoder_cmd *cmd)
 {
@@ -561,10 +123,12 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 {
 	struct mtk_q_data *q_data;
 
+	ctx->dev->vdec_pdata->init_vdec_params(ctx);
+
 	ctx->m2m_ctx->q_lock = &ctx->dev->dev_mutex;
 	ctx->fh.m2m_ctx = ctx->m2m_ctx;
 	ctx->fh.ctrl_handler = &ctx->ctrl_hdl;
-	INIT_WORK(&ctx->decode_work, mtk_vdec_worker);
+	INIT_WORK(&ctx->decode_work, ctx->dev->vdec_pdata->worker);
 	ctx->colorspace = V4L2_COLORSPACE_REC709;
 	ctx->ycbcr_enc = V4L2_YCBCR_ENC_DEFAULT;
 	ctx->quantization = V4L2_QUANTIZATION_DEFAULT;
@@ -574,7 +138,7 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 	memset(q_data, 0, sizeof(struct mtk_q_data));
 	q_data->visible_width = DFT_CFG_WIDTH;
 	q_data->visible_height = DFT_CFG_HEIGHT;
-	q_data->fmt = &mtk_video_formats[OUT_FMT_IDX];
+	q_data->fmt = ctx->dev->vdec_pdata->default_out_fmt;
 	q_data->field = V4L2_FIELD_NONE;
 
 	q_data->sizeimage[0] = DFT_CFG_WIDTH * DFT_CFG_HEIGHT;
@@ -586,7 +150,7 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 	q_data->visible_height = DFT_CFG_HEIGHT;
 	q_data->coded_width = DFT_CFG_WIDTH;
 	q_data->coded_height = DFT_CFG_HEIGHT;
-	q_data->fmt = &mtk_video_formats[CAP_FMT_IDX];
+	q_data->fmt = ctx->dev->vdec_pdata->default_cap_fmt;
 	q_data->field = V4L2_FIELD_NONE;
 
 	v4l_bound_align_image(&q_data->coded_width,
@@ -722,11 +286,14 @@ static int vidioc_try_fmt_vid_cap_mplane(struct file *file, void *priv,
 				struct v4l2_format *f)
 {
 	const struct mtk_video_fmt *fmt;
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (!fmt) {
-		f->fmt.pix.pixelformat = mtk_video_formats[CAP_FMT_IDX].fourcc;
-		fmt = mtk_vdec_find_format(f);
+		f->fmt.pix.pixelformat =
+			ctx->q_data[MTK_Q_DATA_DST].fmt->fourcc;
+		fmt = mtk_vdec_find_format(f, dec_pdata);
 	}
 
 	return vidioc_try_fmt(f, fmt);
@@ -737,11 +304,14 @@ static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
 {
 	struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
 	const struct mtk_video_fmt *fmt;
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (!fmt) {
-		f->fmt.pix.pixelformat = mtk_video_formats[OUT_FMT_IDX].fourcc;
-		fmt = mtk_vdec_find_format(f);
+		f->fmt.pix.pixelformat =
+			ctx->q_data[MTK_Q_DATA_SRC].fmt->fourcc;
+		fmt = mtk_vdec_find_format(f, dec_pdata);
 	}
 
 	if (pix_fmt_mp->plane_fmt[0].sizeimage == 0) {
@@ -831,6 +401,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 	struct mtk_q_data *q_data;
 	int ret = 0;
 	const struct mtk_video_fmt *fmt;
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
 	mtk_v4l2_debug(3, "[%d]", ctx->id);
 
@@ -859,16 +430,16 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		ret = -EBUSY;
 	}
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (fmt == NULL) {
 		if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 			f->fmt.pix.pixelformat =
-				mtk_video_formats[OUT_FMT_IDX].fourcc;
-			fmt = mtk_vdec_find_format(f);
+				dec_pdata->default_out_fmt->fourcc;
+			fmt = mtk_vdec_find_format(f, dec_pdata);
 		} else if (f->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 			f->fmt.pix.pixelformat =
-				mtk_video_formats[CAP_FMT_IDX].fourcc;
-			fmt = mtk_vdec_find_format(f);
+				dec_pdata->default_cap_fmt->fourcc;
+			fmt = mtk_vdec_find_format(f, dec_pdata);
 		}
 	}
 	if (fmt == NULL)
@@ -905,16 +476,17 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 {
 	int i = 0;
 	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
 	if (fsize->index != 0)
 		return -EINVAL;
 
-	for (i = 0; i < NUM_SUPPORTED_FRAMESIZE; ++i) {
-		if (fsize->pixel_format != mtk_vdec_framesizes[i].fourcc)
+	for (i = 0; i < dec_pdata->num_framesizes; ++i) {
+		if (fsize->pixel_format != dec_pdata->vdec_framesizes[i].fourcc)
 			continue;
 
 		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
-		fsize->stepwise = mtk_vdec_framesizes[i].stepwise;
+		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
 		if (!(ctx->dev->dec_capability &
 				VCODEC_CAPABILITY_4K_DISABLED)) {
 			mtk_v4l2_debug(3, "4K is enabled");
@@ -937,16 +509,20 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 	return -EINVAL;
 }
 
-static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
+static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
+				bool output_queue)
 {
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 	const struct mtk_video_fmt *fmt;
 	int i, j = 0;
 
-	for (i = 0; i < NUM_FORMATS; i++) {
-		if (output_queue && (mtk_video_formats[i].type != MTK_FMT_DEC))
+	for (i = 0; i < dec_pdata->num_formats; i++) {
+		if (output_queue &&
+			(dec_pdata->vdec_formats[i].type != MTK_FMT_DEC))
 			continue;
 		if (!output_queue &&
-			(mtk_video_formats[i].type != MTK_FMT_FRAME))
+			(dec_pdata->vdec_formats[i].type != MTK_FMT_FRAME))
 			continue;
 
 		if (j == f->index)
@@ -954,10 +530,10 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
 		++j;
 	}
 
-	if (i == NUM_FORMATS)
+	if (i == dec_pdata->num_formats)
 		return -EINVAL;
 
-	fmt = &mtk_video_formats[i];
+	fmt = &dec_pdata->vdec_formats[i];
 	f->pixelformat = fmt->fourcc;
 	f->flags = fmt->flags;
 
@@ -967,13 +543,13 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
 static int vidioc_vdec_enum_fmt_vid_cap(struct file *file, void *priv,
 					struct v4l2_fmtdesc *f)
 {
-	return vidioc_enum_fmt(f, false);
+	return vidioc_enum_fmt(f, priv, false);
 }
 
 static int vidioc_vdec_enum_fmt_vid_out(struct file *file, void *priv,
 					struct v4l2_fmtdesc *f)
 {
-	return vidioc_enum_fmt(f, true);
+	return vidioc_enum_fmt(f, priv, true);
 }
 
 static int vidioc_vdec_g_fmt(struct file *file, void *priv,
@@ -1064,7 +640,7 @@ static int vidioc_vdec_g_fmt(struct file *file, void *priv,
 	return 0;
 }
 
-static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
+int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 				unsigned int *nbuffers,
 				unsigned int *nplanes,
 				unsigned int sizes[],
@@ -1088,7 +664,7 @@ static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 		}
 	} else {
 		if (vq->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
-			*nplanes = 2;
+			*nplanes = q_data->fmt->num_planes;
 		else
 			*nplanes = 1;
 
@@ -1104,7 +680,7 @@ static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 	return 0;
 }
 
-static int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
+int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct mtk_q_data *q_data;
@@ -1126,128 +702,7 @@ static int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
 	return 0;
 }
 
-static void vb2ops_vdec_buf_queue(struct vb2_buffer *vb)
-{
-	struct vb2_v4l2_buffer *src_buf;
-	struct mtk_vcodec_mem src_mem;
-	bool res_chg = false;
-	int ret = 0;
-	unsigned int dpbsize = 1, i = 0;
-	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
-	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
-	struct mtk_video_dec_buf *buf = NULL;
-	struct mtk_q_data *dst_q_data;
-
-	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
-			ctx->id, vb->vb2_queue->type,
-			vb->index, vb);
-	/*
-	 * check if this buffer is ready to be used after decode
-	 */
-	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
-		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
-		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
-				   m2m_buf.vb);
-		mutex_lock(&ctx->lock);
-		if (!buf->used) {
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
-			buf->queued_in_vb2 = true;
-			buf->queued_in_v4l2 = true;
-		} else {
-			buf->queued_in_vb2 = false;
-			buf->queued_in_v4l2 = true;
-		}
-		mutex_unlock(&ctx->lock);
-		return;
-	}
-
-	v4l2_m2m_buf_queue(ctx->m2m_ctx, to_vb2_v4l2_buffer(vb));
-
-	if (ctx->state != MTK_STATE_INIT) {
-		mtk_v4l2_debug(3, "[%d] already init driver %d",
-				ctx->id, ctx->state);
-		return;
-	}
-
-	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
-	if (!src_buf) {
-		mtk_v4l2_err("No src buffer");
-		return;
-	}
-	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-	if (buf->lastframe) {
-		/* This shouldn't happen. Just in case. */
-		mtk_v4l2_err("Invalid flush buffer.");
-		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		return;
-	}
-
-	src_mem.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
-	src_mem.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-	src_mem.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
-	mtk_v4l2_debug(2,
-			"[%d] buf id=%d va=%p dma=%pad size=%zx",
-			ctx->id, src_buf->vb2_buf.index,
-			src_mem.va, &src_mem.dma_addr,
-			src_mem.size);
-
-	ret = vdec_if_decode(ctx, &src_mem, NULL, &res_chg);
-	if (ret || !res_chg) {
-		/*
-		 * fb == NULL means to parse SPS/PPS header or
-		 * resolution info in src_mem. Decode can fail
-		 * if there is no SPS header or picture info
-		 * in bs
-		 */
-
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		if (ret == -EIO) {
-			mtk_v4l2_err("[%d] Unrecoverable error in vdec_if_decode.",
-					ctx->id);
-			ctx->state = MTK_STATE_ABORT;
-			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
-		} else {
-			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
-		}
-		mtk_v4l2_debug(ret ? 0 : 1,
-			       "[%d] vdec_if_decode() src_buf=%d, size=%zu, fail=%d, res_chg=%d",
-			       ctx->id, src_buf->vb2_buf.index,
-			       src_mem.size, ret, res_chg);
-		return;
-	}
-
-	if (vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo)) {
-		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
-				ctx->id);
-		return;
-	}
-
-	ctx->last_decoded_picinfo = ctx->picinfo;
-	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
-	for (i = 0; i < dst_q_data->fmt->num_planes; i++) {
-		dst_q_data->sizeimage[i] = ctx->picinfo.fb_sz[i];
-		dst_q_data->bytesperline[i] = ctx->picinfo.buf_w;
-	}
-
-	mtk_v4l2_debug(2, "[%d] vdec_if_init() OK wxh=%dx%d pic wxh=%dx%d sz[0]=0x%x sz[1]=0x%x",
-			ctx->id,
-			ctx->picinfo.buf_w, ctx->picinfo.buf_h,
-			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
-			dst_q_data->sizeimage[0],
-			dst_q_data->sizeimage[1]);
-
-	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
-	if (dpbsize == 0)
-		mtk_v4l2_err("[%d] GET_PARAM_DPB_SIZE fail=%d", ctx->id, ret);
-
-	ctx->dpb_size = dpbsize;
-	ctx->state = MTK_STATE_HEADER;
-	mtk_v4l2_debug(1, "[%d] dpbsize=%d", ctx->id, ctx->dpb_size);
-
-	mtk_vdec_queue_res_chg_event(ctx);
-}
-
-static void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
+void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vb2_v4l2;
@@ -1270,7 +725,7 @@ static void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
 	}
 }
 
-static int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
+int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 {
 	struct vb2_v4l2_buffer *vb2_v4l2 = container_of(vb,
 					struct vb2_v4l2_buffer, vb2_buf);
@@ -1287,7 +742,7 @@ static int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 	return 0;
 }
 
-static int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
+int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(q);
 
@@ -1297,10 +752,11 @@ static int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
 	return 0;
 }
 
-static void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
+void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 {
 	struct vb2_v4l2_buffer *src_buf = NULL, *dst_buf = NULL;
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(q);
+	int ret;
 
 	mtk_v4l2_debug(3, "[%d] (%d) state=(%x) ctx->decoded_frame_cnt=%d",
 			ctx->id, q->type, ctx->state, ctx->decoded_frame_cnt);
@@ -1334,7 +790,9 @@ static void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 				ctx->last_decoded_picinfo.buf_w,
 				ctx->last_decoded_picinfo.buf_h);
 
-		mtk_vdec_flush_decoder(ctx);
+		ret = ctx->dev->vdec_pdata->flush_decoder(ctx);
+		if (ret)
+			mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
 	}
 	ctx->state = MTK_STATE_FLUSH;
 
@@ -1381,7 +839,7 @@ static void m2mops_vdec_job_abort(void *priv)
 	ctx->state = MTK_STATE_ABORT;
 }
 
-static int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
+int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
 {
 	struct mtk_vcodec_ctx *ctx = ctrl_to_ctx(ctrl);
 	int ret = 0;
@@ -1401,55 +859,12 @@ static int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
 	return ret;
 }
 
-static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
-	.g_volatile_ctrl = mtk_vdec_g_v_ctrl,
-};
-
-int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
-{
-	struct v4l2_ctrl *ctrl;
-
-	v4l2_ctrl_handler_init(&ctx->ctrl_hdl, 1);
-
-	ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
-				&mtk_vcodec_dec_ctrl_ops,
-				V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
-				0, 32, 1, 1);
-	ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
-	v4l2_ctrl_new_std_menu(&ctx->ctrl_hdl,
-				&mtk_vcodec_dec_ctrl_ops,
-				V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
-				V4L2_MPEG_VIDEO_VP9_PROFILE_0,
-				0, V4L2_MPEG_VIDEO_VP9_PROFILE_0);
-
-	if (ctx->ctrl_hdl.error) {
-		mtk_v4l2_err("Adding control failed %d",
-				ctx->ctrl_hdl.error);
-		return ctx->ctrl_hdl.error;
-	}
-
-	v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
-	return 0;
-}
-
 const struct v4l2_m2m_ops mtk_vdec_m2m_ops = {
 	.device_run	= m2mops_vdec_device_run,
 	.job_ready	= m2mops_vdec_job_ready,
 	.job_abort	= m2mops_vdec_job_abort,
 };
 
-static const struct vb2_ops mtk_vdec_vb2_ops = {
-	.queue_setup	= vb2ops_vdec_queue_setup,
-	.buf_prepare	= vb2ops_vdec_buf_prepare,
-	.buf_queue	= vb2ops_vdec_buf_queue,
-	.wait_prepare	= vb2_ops_wait_prepare,
-	.wait_finish	= vb2_ops_wait_finish,
-	.buf_init	= vb2ops_vdec_buf_init,
-	.buf_finish	= vb2ops_vdec_buf_finish,
-	.start_streaming	= vb2ops_vdec_start_streaming,
-	.stop_streaming	= vb2ops_vdec_stop_streaming,
-};
-
 const struct v4l2_ioctl_ops mtk_vdec_ioctl_ops = {
 	.vidioc_streamon	= v4l2_m2m_ioctl_streamon,
 	.vidioc_streamoff	= v4l2_m2m_ioctl_streamoff,
@@ -1496,7 +911,7 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 	src_vq->io_modes	= VB2_DMABUF | VB2_MMAP;
 	src_vq->drv_priv	= ctx;
 	src_vq->buf_struct_size = sizeof(struct mtk_video_dec_buf);
-	src_vq->ops		= &mtk_vdec_vb2_ops;
+	src_vq->ops		= ctx->dev->vdec_pdata->vdec_vb2_ops;
 	src_vq->mem_ops		= &vb2_dma_contig_memops;
 	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	src_vq->lock		= &ctx->dev->dev_mutex;
@@ -1511,7 +926,7 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 	dst_vq->io_modes	= VB2_DMABUF | VB2_MMAP;
 	dst_vq->drv_priv	= ctx;
 	dst_vq->buf_struct_size = sizeof(struct mtk_video_dec_buf);
-	dst_vq->ops		= &mtk_vdec_vb2_ops;
+	dst_vq->ops		= ctx->dev->vdec_pdata->vdec_vb2_ops;
 	dst_vq->mem_ops		= &vb2_dma_contig_memops;
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	dst_vq->lock		= &ctx->dev->dev_mutex;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
index cf26b6c1486a..97a8304f6600 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
@@ -16,6 +16,8 @@
 #define VCODEC_DEC_4K_CODED_HEIGHT	2304U
 #define MTK_VDEC_MAX_W	2048U
 #define MTK_VDEC_MAX_H	1088U
+#define MTK_VDEC_MIN_W	64U
+#define MTK_VDEC_MIN_H	64U
 
 #define MTK_VDEC_IRQ_STATUS_DEC_SUCCESS        0x10000
 
@@ -73,7 +75,22 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 			   struct vb2_queue *dst_vq);
 void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx);
 void mtk_vcodec_dec_release(struct mtk_vcodec_ctx *ctx);
-int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx);
+
+int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl);
+
+/*
+ * VB2 ops
+ */
+int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
+				unsigned int *nbuffers,
+				unsigned int *nplanes,
+				unsigned int sizes[],
+				struct device *alloc_devs[]);
+int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb);
+void vb2ops_vdec_buf_finish(struct vb2_buffer *vb);
+int vb2ops_vdec_buf_init(struct vb2_buffer *vb);
+int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count);
+void vb2ops_vdec_stop_streaming(struct vb2_queue *q);
 
 
 #endif /* _MTK_VCODEC_DEC_H_ */
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 147dfef1638d..533781d4680a 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -106,7 +106,7 @@ static int fops_vcodec_open(struct file *file)
 	mutex_init(&ctx->lock);
 
 	ctx->type = MTK_INST_DECODER;
-	ret = mtk_vcodec_dec_ctrls_setup(ctx);
+	ret = dev->vdec_pdata->ctrls_setup(ctx);
 	if (ret) {
 		mtk_v4l2_err("Failed to setup mt vcodec controls");
 		goto err_ctrls_setup;
@@ -222,6 +222,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	INIT_LIST_HEAD(&dev->ctx_list);
 	dev->plat_dev = pdev;
 
+	dev->vdec_pdata = of_device_get_match_data(&pdev->dev);
 	if (!of_property_read_u32(pdev->dev.of_node, "mediatek,vpu",
 				  &rproc_phandle)) {
 		fw_type = VPU;
@@ -349,8 +350,13 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	return ret;
 }
 
+extern const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata;
+
 static const struct of_device_id mtk_vcodec_match[] = {
-	{.compatible = "mediatek,mt8173-vcodec-dec",},
+	{
+		.compatible = "mediatek,mt8173-vcodec-dec",
+		.data = &mtk_vdec_8173_pdata,
+	},
 	{},
 };
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
new file mode 100644
index 000000000000..48b7524bc8fb
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -0,0 +1,634 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <media/v4l2-event.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "mtk_vcodec_drv.h"
+#include "mtk_vcodec_dec.h"
+#include "mtk_vcodec_intr.h"
+#include "mtk_vcodec_util.h"
+#include "vdec_drv_if.h"
+#include "mtk_vcodec_dec_pm.h"
+
+static const struct mtk_video_fmt mtk_video_formats[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_H264,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP8,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_MT21C,
+		.type = MTK_FMT_FRAME,
+		.num_planes = 2,
+	},
+};
+
+#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
+#define DEFAULT_OUT_FMT_IDX	0
+#define DEFAULT_CAP_FMT_IDX	3
+
+static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
+	{
+		.fourcc	= V4L2_PIX_FMT_H264,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+	{
+		.fourcc	= V4L2_PIX_FMT_VP8,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+};
+
+#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
+
+/*
+ * This function tries to clean all display buffers, the buffers will return
+ * in display order.
+ * Note the buffers returned from codec driver may still be in driver's
+ * reference list.
+ */
+static struct vb2_buffer *get_display_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_fb *disp_frame_buffer = NULL;
+	struct mtk_video_dec_buf *dstbuf;
+	struct vb2_v4l2_buffer *vb;
+
+	mtk_v4l2_debug(3, "[%d]", ctx->id);
+	if (vdec_if_get_param(ctx,
+			GET_PARAM_DISP_FRAME_BUFFER,
+			&disp_frame_buffer)) {
+		mtk_v4l2_err("[%d]Cannot get param : GET_PARAM_DISP_FRAME_BUFFER",
+			ctx->id);
+		return NULL;
+	}
+
+	if (disp_frame_buffer == NULL) {
+		mtk_v4l2_debug(3, "No display frame buffer");
+		return NULL;
+	}
+
+	dstbuf = container_of(disp_frame_buffer, struct mtk_video_dec_buf,
+				frame_buffer);
+	vb = &dstbuf->m2m_buf.vb;
+	mutex_lock(&ctx->lock);
+	if (dstbuf->used) {
+		vb2_set_plane_payload(&vb->vb2_buf, 0,
+				      ctx->picinfo.fb_sz[0]);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			vb2_set_plane_payload(&vb->vb2_buf, 1,
+					      ctx->picinfo.fb_sz[1]);
+
+		mtk_v4l2_debug(2,
+				"[%d]status=%x queue id=%d to done_list %d",
+				ctx->id, disp_frame_buffer->status,
+				vb->vb2_buf.index,
+				dstbuf->queued_in_vb2);
+
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
+		ctx->decoded_frame_cnt++;
+	}
+	mutex_unlock(&ctx->lock);
+	return &vb->vb2_buf;
+}
+
+/*
+ * This function tries to clean all capture buffers that are not used as
+ * reference buffers by codec driver any more
+ * In this case, we need re-queue buffer to vb2 buffer if user space
+ * already returns this buffer to v4l2 or this buffer is just the output of
+ * previous sps/pps/resolution change decode, or do nothing if user
+ * space still owns this buffer
+ */
+static struct vb2_buffer *get_free_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct mtk_video_dec_buf *dstbuf;
+	struct vdec_fb *free_frame_buffer = NULL;
+	struct vb2_v4l2_buffer *vb;
+
+	if (vdec_if_get_param(ctx,
+				GET_PARAM_FREE_FRAME_BUFFER,
+				&free_frame_buffer)) {
+		mtk_v4l2_err("[%d] Error!! Cannot get param", ctx->id);
+		return NULL;
+	}
+	if (free_frame_buffer == NULL) {
+		mtk_v4l2_debug(3, " No free frame buffer");
+		return NULL;
+	}
+
+	mtk_v4l2_debug(3, "[%d] tmp_frame_addr = 0x%p",
+			ctx->id, free_frame_buffer);
+
+	dstbuf = container_of(free_frame_buffer, struct mtk_video_dec_buf,
+				frame_buffer);
+	vb = &dstbuf->m2m_buf.vb;
+
+	mutex_lock(&ctx->lock);
+	if (dstbuf->used) {
+		if ((dstbuf->queued_in_vb2) &&
+		    (dstbuf->queued_in_v4l2) &&
+		    (free_frame_buffer->status == FB_ST_FREE)) {
+			/*
+			 * After decode sps/pps or non-display buffer, we don't
+			 * need to return capture buffer to user space, but
+			 * just re-queue this capture buffer to vb2 queue.
+			 * This reduce overheads that dq/q unused capture
+			 * buffer. In this case, queued_in_vb2 = true.
+			 */
+			mtk_v4l2_debug(2,
+				"[%d]status=%x queue id=%d to rdy_queue %d",
+				ctx->id, free_frame_buffer->status,
+				vb->vb2_buf.index,
+				dstbuf->queued_in_vb2);
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
+		} else if (!dstbuf->queued_in_vb2 && dstbuf->queued_in_v4l2) {
+			/*
+			 * If buffer in v4l2 driver but not in vb2 queue yet,
+			 * and we get this buffer from free_list, it means
+			 * that codec driver do not use this buffer as
+			 * reference buffer anymore. We should q buffer to vb2
+			 * queue, so later work thread could get this buffer
+			 * for decode. In this case, queued_in_vb2 = false
+			 * means this buffer is not from previous decode
+			 * output.
+			 */
+			mtk_v4l2_debug(2,
+					"[%d]status=%x queue id=%d to rdy_queue",
+					ctx->id, free_frame_buffer->status,
+					vb->vb2_buf.index);
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
+			dstbuf->queued_in_vb2 = true;
+		} else {
+			/*
+			 * Codec driver do not need to reference this capture
+			 * buffer and this buffer is not in v4l2 driver.
+			 * Then we don't need to do any thing, just add log when
+			 * we need to debug buffer flow.
+			 * When this buffer q from user space, it could
+			 * directly q to vb2 buffer
+			 */
+			mtk_v4l2_debug(3, "[%d]status=%x err queue id=%d %d %d",
+					ctx->id, free_frame_buffer->status,
+					vb->vb2_buf.index,
+					dstbuf->queued_in_vb2,
+					dstbuf->queued_in_v4l2);
+		}
+		dstbuf->used = false;
+	}
+	mutex_unlock(&ctx->lock);
+	return &vb->vb2_buf;
+}
+
+static void clean_display_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vb2_buffer *framptr;
+
+	do {
+		framptr = get_display_buffer(ctx);
+	} while (framptr);
+}
+
+static void clean_free_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vb2_buffer *framptr;
+
+	do {
+		framptr = get_free_buffer(ctx);
+	} while (framptr);
+}
+
+static void mtk_vdec_queue_res_chg_event(struct mtk_vcodec_ctx *ctx)
+{
+	static const struct v4l2_event ev_src_ch = {
+		.type = V4L2_EVENT_SOURCE_CHANGE,
+		.u.src_change.changes =
+		V4L2_EVENT_SRC_CH_RESOLUTION,
+	};
+
+	mtk_v4l2_debug(1, "[%d]", ctx->id);
+	v4l2_event_queue_fh(&ctx->fh, &ev_src_ch);
+}
+
+static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
+{
+	bool res_chg;
+	int ret = 0;
+
+	ret = vdec_if_decode(ctx, NULL, NULL, &res_chg);
+	if (ret)
+		mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
+
+	clean_display_buffer(ctx);
+	clean_free_buffer(ctx);
+
+	return 0;
+}
+
+static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
+				unsigned int pixelformat)
+{
+	const struct mtk_video_fmt *fmt;
+	struct mtk_q_data *dst_q_data;
+	unsigned int k;
+
+	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
+	for (k = 0; k < NUM_FORMATS; k++) {
+		fmt = &mtk_video_formats[k];
+		if (fmt->fourcc == pixelformat) {
+			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
+				dst_q_data->fmt->fourcc, pixelformat);
+			dst_q_data->fmt = fmt;
+			return;
+		}
+	}
+
+	mtk_v4l2_err("Cannot get fourcc(%d), using init value", pixelformat);
+}
+
+static int mtk_vdec_pic_info_update(struct mtk_vcodec_ctx *ctx)
+{
+	unsigned int dpbsize = 0;
+	int ret;
+
+	if (vdec_if_get_param(ctx,
+				GET_PARAM_PIC_INFO,
+				&ctx->last_decoded_picinfo)) {
+		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
+				ctx->id);
+		return -EINVAL;
+	}
+
+	if (ctx->last_decoded_picinfo.pic_w == 0 ||
+		ctx->last_decoded_picinfo.pic_h == 0 ||
+		ctx->last_decoded_picinfo.buf_w == 0 ||
+		ctx->last_decoded_picinfo.buf_h == 0) {
+		mtk_v4l2_err("Cannot get correct pic info");
+		return -EINVAL;
+	}
+
+	if (ctx->last_decoded_picinfo.cap_fourcc != ctx->picinfo.cap_fourcc &&
+		ctx->picinfo.cap_fourcc != 0)
+		mtk_vdec_update_fmt(ctx, ctx->picinfo.cap_fourcc);
+
+	if ((ctx->last_decoded_picinfo.pic_w == ctx->picinfo.pic_w) ||
+	    (ctx->last_decoded_picinfo.pic_h == ctx->picinfo.pic_h))
+		return 0;
+
+	mtk_v4l2_debug(1,
+			"[%d]-> new(%d,%d), old(%d,%d), real(%d,%d)",
+			ctx->id, ctx->last_decoded_picinfo.pic_w,
+			ctx->last_decoded_picinfo.pic_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			ctx->last_decoded_picinfo.buf_w,
+			ctx->last_decoded_picinfo.buf_h);
+
+	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
+	if (dpbsize == 0)
+		mtk_v4l2_err("Incorrect dpb size, ret=%d", ret);
+
+	ctx->dpb_size = dpbsize;
+
+	return ret;
+}
+
+static void mtk_vdec_worker(struct work_struct *work)
+{
+	struct mtk_vcodec_ctx *ctx = container_of(work, struct mtk_vcodec_ctx,
+				decode_work);
+	struct mtk_vcodec_dev *dev = ctx->dev;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct mtk_vcodec_mem buf;
+	struct vdec_fb *pfb;
+	bool res_chg = false;
+	int ret;
+	struct mtk_video_dec_buf *dst_buf_info, *src_buf_info;
+
+	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+	if (src_buf == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] src_buf empty!!", ctx->id);
+		return;
+	}
+
+	dst_buf = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
+	if (dst_buf == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
+		return;
+	}
+
+	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+
+	pfb = &dst_buf_info->frame_buffer;
+	pfb->base_y.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
+	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	pfb->base_y.size = ctx->picinfo.fb_sz[0];
+
+	pfb->base_c.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 1);
+	pfb->base_c.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 1);
+	pfb->base_c.size = ctx->picinfo.fb_sz[1];
+	pfb->status = 0;
+	mtk_v4l2_debug(3, "===>[%d] vdec_if_decode() ===>", ctx->id);
+
+	mtk_v4l2_debug(3,
+			"id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx",
+			dst_buf->vb2_buf.index, pfb,
+			pfb->base_y.va, &pfb->base_y.dma_addr,
+			&pfb->base_c.dma_addr, pfb->base_y.size);
+
+	if (src_buf_info->lastframe) {
+		mtk_v4l2_debug(1, "Got empty flush input buffer.");
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+
+		/* update dst buf status */
+		dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
+		mutex_lock(&ctx->lock);
+		dst_buf_info->used = false;
+		mutex_unlock(&ctx->lock);
+
+		vdec_if_decode(ctx, NULL, NULL, &res_chg);
+		clean_display_buffer(ctx);
+		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, 0);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			vb2_set_plane_payload(&dst_buf->vb2_buf, 1, 0);
+		dst_buf->flags |= V4L2_BUF_FLAG_LAST;
+		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
+		clean_free_buffer(ctx);
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		return;
+	}
+	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
+	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
+	if (!buf.va) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_err("[%d] id=%d src_addr is NULL!!",
+				ctx->id, src_buf->vb2_buf.index);
+		return;
+	}
+	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
+			ctx->id, buf.va, &buf.dma_addr, buf.size, src_buf);
+	dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
+	dst_buf->timecode = src_buf->timecode;
+	mutex_lock(&ctx->lock);
+	dst_buf_info->used = true;
+	mutex_unlock(&ctx->lock);
+	src_buf_info->used = true;
+
+	ret = vdec_if_decode(ctx, &buf, pfb, &res_chg);
+
+	if (ret) {
+		mtk_v4l2_err(
+			" <===[%d], src_buf[%d] sz=0x%zx pts=%llu dst_buf[%d] vdec_if_decode() ret=%d res_chg=%d===>",
+			ctx->id,
+			src_buf->vb2_buf.index,
+			buf.size,
+			src_buf->vb2_buf.timestamp,
+			dst_buf->vb2_buf.index,
+			ret, res_chg);
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		if (ret == -EIO) {
+			mutex_lock(&ctx->lock);
+			src_buf_info->error = true;
+			mutex_unlock(&ctx->lock);
+		}
+		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+	} else if (!res_chg) {
+		/*
+		 * we only return src buffer with VB2_BUF_STATE_DONE
+		 * when decode success without resolution change
+		 */
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+	}
+
+	dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
+	clean_display_buffer(ctx);
+	clean_free_buffer(ctx);
+
+	if (!ret && res_chg) {
+		mtk_vdec_pic_info_update(ctx);
+		/*
+		 * On encountering a resolution change in the stream.
+		 * The driver must first process and decode all
+		 * remaining buffers from before the resolution change
+		 * point, so call flush decode here
+		 */
+		mtk_vdec_flush_decoder(ctx);
+		/*
+		 * After all buffers containing decoded frames from
+		 * before the resolution change point ready to be
+		 * dequeued on the CAPTURE queue, the driver sends a
+		 * V4L2_EVENT_SOURCE_CHANGE event for source change
+		 * type V4L2_EVENT_SRC_CH_RESOLUTION
+		 */
+		mtk_vdec_queue_res_chg_event(ctx);
+	}
+	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+}
+
+static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
+{
+	struct vb2_v4l2_buffer *src_buf;
+	struct mtk_vcodec_mem src_mem;
+	bool res_chg = false;
+	int ret = 0;
+	unsigned int dpbsize = 1, i = 0;
+	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
+	struct mtk_video_dec_buf *buf = NULL;
+	struct mtk_q_data *dst_q_data;
+
+	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
+			ctx->id, vb->vb2_queue->type,
+			vb->index, vb);
+	/*
+	 * check if this buffer is ready to be used after decode
+	 */
+	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
+		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
+		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
+				   m2m_buf.vb);
+		mutex_lock(&ctx->lock);
+		if (!buf->used) {
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
+			buf->queued_in_vb2 = true;
+			buf->queued_in_v4l2 = true;
+		} else {
+			buf->queued_in_vb2 = false;
+			buf->queued_in_v4l2 = true;
+		}
+		mutex_unlock(&ctx->lock);
+		return;
+	}
+
+	v4l2_m2m_buf_queue(ctx->m2m_ctx, to_vb2_v4l2_buffer(vb));
+
+	if (ctx->state != MTK_STATE_INIT) {
+		mtk_v4l2_debug(3, "[%d] already init driver %d",
+				ctx->id, ctx->state);
+		return;
+	}
+
+	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+	if (!src_buf) {
+		mtk_v4l2_err("No src buffer");
+		return;
+	}
+	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
+	if (buf->lastframe) {
+		/* This shouldn't happen. Just in case. */
+		mtk_v4l2_err("Invalid flush buffer.");
+		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		return;
+	}
+
+	src_mem.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
+	src_mem.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	src_mem.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
+	mtk_v4l2_debug(2,
+			"[%d] buf id=%d va=%p dma=%pad size=%zx",
+			ctx->id, src_buf->vb2_buf.index,
+			src_mem.va, &src_mem.dma_addr,
+			src_mem.size);
+
+	ret = vdec_if_decode(ctx, &src_mem, NULL, &res_chg);
+	if (ret || !res_chg) {
+		/*
+		 * fb == NULL means to parse SPS/PPS header or
+		 * resolution info in src_mem. Decode can fail
+		 * if there is no SPS header or picture info
+		 * in bs
+		 */
+
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		if (ret == -EIO) {
+			mtk_v4l2_err("[%d] Unrecoverable error in vdec_if_decode.",
+					ctx->id);
+			ctx->state = MTK_STATE_ABORT;
+			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+		} else {
+			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+		}
+		mtk_v4l2_debug(ret ? 0 : 1,
+			       "[%d] vdec_if_decode() src_buf=%d, size=%zu, fail=%d, res_chg=%d",
+			       ctx->id, src_buf->vb2_buf.index,
+			       src_mem.size, ret, res_chg);
+		return;
+	}
+
+	if (vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo)) {
+		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
+				ctx->id);
+		return;
+	}
+
+	ctx->last_decoded_picinfo = ctx->picinfo;
+	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
+	for (i = 0; i < dst_q_data->fmt->num_planes; i++) {
+		dst_q_data->sizeimage[i] = ctx->picinfo.fb_sz[i];
+		dst_q_data->bytesperline[i] = ctx->picinfo.buf_w;
+	}
+
+	mtk_v4l2_debug(2, "[%d] vdec_if_init() OK wxh=%dx%d pic wxh=%dx%d sz[0]=0x%x sz[1]=0x%x",
+			ctx->id,
+			ctx->picinfo.buf_w, ctx->picinfo.buf_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			dst_q_data->sizeimage[0],
+			dst_q_data->sizeimage[1]);
+
+	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
+	if (dpbsize == 0)
+		mtk_v4l2_err("[%d] GET_PARAM_DPB_SIZE fail=%d", ctx->id, ret);
+
+	ctx->dpb_size = dpbsize;
+	ctx->state = MTK_STATE_HEADER;
+	mtk_v4l2_debug(1, "[%d] dpbsize=%d", ctx->id, ctx->dpb_size);
+
+	mtk_vdec_queue_res_chg_event(ctx);
+}
+
+static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
+	.g_volatile_ctrl = mtk_vdec_g_v_ctrl,
+};
+
+static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
+{
+	struct v4l2_ctrl *ctrl;
+
+	v4l2_ctrl_handler_init(&ctx->ctrl_hdl, 1);
+
+	ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
+				&mtk_vcodec_dec_ctrl_ops,
+				V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
+				0, 32, 1, 1);
+	ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
+	v4l2_ctrl_new_std_menu(&ctx->ctrl_hdl,
+				&mtk_vcodec_dec_ctrl_ops,
+				V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
+				V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+				0, V4L2_MPEG_VIDEO_VP9_PROFILE_0);
+
+	if (ctx->ctrl_hdl.error) {
+		mtk_v4l2_err("Adding control failed %d",
+				ctx->ctrl_hdl.error);
+		return ctx->ctrl_hdl.error;
+	}
+
+	v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
+	return 0;
+}
+
+static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
+{
+}
+
+static struct vb2_ops mtk_vdec_frame_vb2_ops = {
+	.queue_setup	= vb2ops_vdec_queue_setup,
+	.buf_prepare	= vb2ops_vdec_buf_prepare,
+	.wait_prepare	= vb2_ops_wait_prepare,
+	.wait_finish	= vb2_ops_wait_finish,
+	.start_streaming	= vb2ops_vdec_start_streaming,
+
+	.buf_queue	= vb2ops_vdec_stateful_buf_queue,
+	.buf_init	= vb2ops_vdec_buf_init,
+	.buf_finish	= vb2ops_vdec_buf_finish,
+	.stop_streaming	= vb2ops_vdec_stop_streaming,
+};
+
+const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
+	.init_vdec_params = mtk_init_vdec_params,
+	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
+	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
+	.vdec_formats = mtk_video_formats,
+	.num_formats = NUM_FORMATS,
+	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
+	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
+	.vdec_framesizes = mtk_vdec_framesizes,
+	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.worker = mtk_vdec_worker,
+	.flush_decoder = mtk_vdec_flush_decoder,
+};
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 3dd010cba23e..9221c17a176b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -305,6 +305,45 @@ enum mtk_chip {
 	MTK_MT8183,
 };
 
+/**
+ * struct mtk_vcodec_dec_pdata - compatible data for each IC
+ * @init_vdec_params: init vdec params
+ * @ctrls_setup: init vcodec dec ctrls
+ * @worker: worker to start a decode job
+ * @flush_decoder: function that flushes the decoder
+ *
+ * @vdec_vb2_ops: struct vb2_ops
+ *
+ * @vdec_formats: supported video decoder formats
+ * @num_formats: count of video decoder formats
+ * @default_out_fmt: default output buffer format
+ * @default_cap_fmt: default capture buffer format
+ *
+ * @vdec_framesizes: supported video decoder frame sizes
+ * @num_framesizes: count of video decoder frame sizes
+ *
+ * @uses_stateless_api: whether the decoder uses the stateless API with requests
+ */
+
+struct mtk_vcodec_dec_pdata {
+	void (*init_vdec_params)(struct mtk_vcodec_ctx *ctx);
+	int (*ctrls_setup)(struct mtk_vcodec_ctx *ctx);
+	void (*worker)(struct work_struct *work);
+	int (*flush_decoder)(struct mtk_vcodec_ctx *ctx);
+
+	struct vb2_ops *vdec_vb2_ops;
+
+	const struct mtk_video_fmt *vdec_formats;
+	const int num_formats;
+	const struct mtk_video_fmt *default_out_fmt;
+	const struct mtk_video_fmt *default_cap_fmt;
+
+	const struct mtk_codec_framesizes *vdec_framesizes;
+	const int num_framesizes;
+
+	bool uses_stateless_api;
+};
+
 /**
  * struct mtk_vcodec_enc_pdata - compatible data for each IC
  *
@@ -348,6 +387,7 @@ struct mtk_vcodec_enc_pdata {
  * @curr_ctx: The context that is waiting for codec hardware
  *
  * @reg_base: Mapped address of MTK Vcodec registers.
+ * @vdec_pdata: Current arch private data.
  *
  * @fw_handler: used to communicate with the firmware.
  * @id_counter: used to identify current opened instance
@@ -382,6 +422,7 @@ struct mtk_vcodec_dev {
 	spinlock_t irqlock;
 	struct mtk_vcodec_ctx *curr_ctx;
 	void __iomem *reg_base[NUM_MAX_VCODEC_REG_BASE];
+	const struct mtk_vcodec_dec_pdata *vdec_pdata;
 	const struct mtk_vcodec_enc_pdata *venc_pdata;
 
 	struct mtk_vcodec_fw *fw_handler;
-- 
2.30.1.766.gb4fecdf3b7-goog



* [PATCH v3 01/15] media: mtk-vcodec: vdec: move stateful ops into their own file
@ 2021-02-26 10:01   ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Alexandre Courbot, linux-kernel, linux-mediatek, Hans Verkuil,
	Mauro Carvalho Chehab, linux-media

From: Yunfei Dong <yunfei.dong@mediatek.com>

We are planning to add support for stateless decoders to this driver.
Part of the driver will be shared between stateful and stateless
codecs, but a few ops need to be specialized for each. Extract the
stateful part of the driver and move it into its own file, accessible
through ops that the common driver parts can call.

This patch only moves code around and introduces a set of abstractions;
the behavior of the driver should not be changed in any way.
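
As a rough illustration of the approach (not code from this patch), the
sketch below shows common code dispatching through a per-variant ops
table. All demo_* names are hypothetical, simplified stand-ins for
struct mtk_vcodec_dec_pdata and the driver context; the real ops take
struct mtk_vcodec_ctx and do considerably more work.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-instance decoder context. */
enum demo_kind { DEMO_NONE, DEMO_STATEFUL, DEMO_STATELESS };

struct demo_ctx {
	enum demo_kind kind;	/* set by the variant's init op */
	int flushed;		/* number of flushes the variant performed */
};

/* Hypothetical, reduced version of the per-variant ops table. */
struct demo_dec_pdata {
	void (*init_params)(struct demo_ctx *ctx);
	int (*flush_decoder)(struct demo_ctx *ctx);
	bool uses_stateless_api;
};

/* Stateful variant: would live in its own source file. */
static void demo_stateful_init(struct demo_ctx *ctx)
{
	ctx->kind = DEMO_STATEFUL;
}

static int demo_stateful_flush(struct demo_ctx *ctx)
{
	ctx->flushed++;		/* pretend to drain buffered frames */
	return 0;
}

/* Stateless variant: assumed here to keep no frames to drain. */
static void demo_stateless_init(struct demo_ctx *ctx)
{
	ctx->kind = DEMO_STATELESS;
}

static int demo_stateless_flush(struct demo_ctx *ctx)
{
	(void)ctx;
	return 0;
}

static const struct demo_dec_pdata demo_stateful_pdata = {
	.init_params = demo_stateful_init,
	.flush_decoder = demo_stateful_flush,
	.uses_stateless_api = false,
};

static const struct demo_dec_pdata demo_stateless_pdata = {
	.init_params = demo_stateless_init,
	.flush_decoder = demo_stateless_flush,
	.uses_stateless_api = true,
};

/* Common code: never names a variant, only calls through the table. */
static void demo_common_open(struct demo_ctx *ctx,
			     const struct demo_dec_pdata *pdata)
{
	ctx->kind = DEMO_NONE;
	ctx->flushed = 0;
	pdata->init_params(ctx);
}

static int demo_common_stop(struct demo_ctx *ctx,
			    const struct demo_dec_pdata *pdata)
{
	return pdata->flush_decoder(ctx);
}
```

In the driver itself the table is selected once at probe time from the
device match data, so the common paths only ever dereference
dev->vdec_pdata.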

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 699 ++----------------
 .../platform/mtk-vcodec/mtk_vcodec_dec.h      |  19 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  10 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      | 634 ++++++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  41 +
 6 files changed, 759 insertions(+), 645 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 4618d43dbbc8..9c3cbb5b800e 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -11,6 +11,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
 		mtk_vcodec_dec.o \
+		mtk_vcodec_dec_stateful.o \
 		mtk_vcodec_dec_pm.o \
 
 mtk-vcodec-enc-y := venc/venc_vp8_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 56d86e59421e..4a91d294002b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -16,68 +16,17 @@
 #include "vdec_drv_if.h"
 #include "mtk_vcodec_dec_pm.h"
 
-#define OUT_FMT_IDX	0
-#define CAP_FMT_IDX	3
-
-#define MTK_VDEC_MIN_W	64U
-#define MTK_VDEC_MIN_H	64U
 #define DFT_CFG_WIDTH	MTK_VDEC_MIN_W
 #define DFT_CFG_HEIGHT	MTK_VDEC_MIN_H
 
-static const struct mtk_video_fmt mtk_video_formats[] = {
-	{
-		.fourcc = V4L2_PIX_FMT_H264,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP8,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP9,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_MT21C,
-		.type = MTK_FMT_FRAME,
-		.num_planes = 2,
-	},
-};
-
-static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
-	{
-		.fourcc	= V4L2_PIX_FMT_H264,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-	{
-		.fourcc	= V4L2_PIX_FMT_VP8,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_VP9,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-};
-
-#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
-#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
-
-static const struct mtk_video_fmt *mtk_vdec_find_format(struct v4l2_format *f)
+static const struct mtk_video_fmt *mtk_vdec_find_format(struct v4l2_format *f,
+				   const struct mtk_vcodec_dec_pdata *dec_pdata)
 {
 	const struct mtk_video_fmt *fmt;
 	unsigned int k;
 
-	for (k = 0; k < NUM_FORMATS; k++) {
-		fmt = &mtk_video_formats[k];
+	for (k = 0; k < dec_pdata->num_formats; k++) {
+		fmt = &dec_pdata->vdec_formats[k];
 		if (fmt->fourcc == f->fmt.pix_mp.pixelformat)
 			return fmt;
 	}
@@ -94,393 +43,6 @@ static struct mtk_q_data *mtk_vdec_get_q_data(struct mtk_vcodec_ctx *ctx,
 	return &ctx->q_data[MTK_Q_DATA_DST];
 }
 
-/*
- * This function tries to clean all display buffers, the buffers will return
- * in display order.
- * Note the buffers returned from codec driver may still be in driver's
- * reference list.
- */
-static struct vb2_buffer *get_display_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vdec_fb *disp_frame_buffer = NULL;
-	struct mtk_video_dec_buf *dstbuf;
-	struct vb2_v4l2_buffer *vb;
-
-	mtk_v4l2_debug(3, "[%d]", ctx->id);
-	if (vdec_if_get_param(ctx,
-			GET_PARAM_DISP_FRAME_BUFFER,
-			&disp_frame_buffer)) {
-		mtk_v4l2_err("[%d]Cannot get param : GET_PARAM_DISP_FRAME_BUFFER",
-			ctx->id);
-		return NULL;
-	}
-
-	if (disp_frame_buffer == NULL) {
-		mtk_v4l2_debug(3, "No display frame buffer");
-		return NULL;
-	}
-
-	dstbuf = container_of(disp_frame_buffer, struct mtk_video_dec_buf,
-				frame_buffer);
-	vb = &dstbuf->m2m_buf.vb;
-	mutex_lock(&ctx->lock);
-	if (dstbuf->used) {
-		vb2_set_plane_payload(&vb->vb2_buf, 0,
-				      ctx->picinfo.fb_sz[0]);
-		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
-			vb2_set_plane_payload(&vb->vb2_buf, 1,
-					      ctx->picinfo.fb_sz[1]);
-
-		mtk_v4l2_debug(2,
-				"[%d]status=%x queue id=%d to done_list %d",
-				ctx->id, disp_frame_buffer->status,
-				vb->vb2_buf.index,
-				dstbuf->queued_in_vb2);
-
-		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
-		ctx->decoded_frame_cnt++;
-	}
-	mutex_unlock(&ctx->lock);
-	return &vb->vb2_buf;
-}
-
-/*
- * This function tries to clean all capture buffers that are not used as
- * reference buffers by codec driver any more
- * In this case, we need re-queue buffer to vb2 buffer if user space
- * already returns this buffer to v4l2 or this buffer is just the output of
- * previous sps/pps/resolution change decode, or do nothing if user
- * space still owns this buffer
- */
-static struct vb2_buffer *get_free_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct mtk_video_dec_buf *dstbuf;
-	struct vdec_fb *free_frame_buffer = NULL;
-	struct vb2_v4l2_buffer *vb;
-
-	if (vdec_if_get_param(ctx,
-				GET_PARAM_FREE_FRAME_BUFFER,
-				&free_frame_buffer)) {
-		mtk_v4l2_err("[%d] Error!! Cannot get param", ctx->id);
-		return NULL;
-	}
-	if (free_frame_buffer == NULL) {
-		mtk_v4l2_debug(3, " No free frame buffer");
-		return NULL;
-	}
-
-	mtk_v4l2_debug(3, "[%d] tmp_frame_addr = 0x%p",
-			ctx->id, free_frame_buffer);
-
-	dstbuf = container_of(free_frame_buffer, struct mtk_video_dec_buf,
-				frame_buffer);
-	vb = &dstbuf->m2m_buf.vb;
-
-	mutex_lock(&ctx->lock);
-	if (dstbuf->used) {
-		if ((dstbuf->queued_in_vb2) &&
-		    (dstbuf->queued_in_v4l2) &&
-		    (free_frame_buffer->status == FB_ST_FREE)) {
-			/*
-			 * After decode sps/pps or non-display buffer, we don't
-			 * need to return capture buffer to user space, but
-			 * just re-queue this capture buffer to vb2 queue.
-			 * This reduce overheads that dq/q unused capture
-			 * buffer. In this case, queued_in_vb2 = true.
-			 */
-			mtk_v4l2_debug(2,
-				"[%d]status=%x queue id=%d to rdy_queue %d",
-				ctx->id, free_frame_buffer->status,
-				vb->vb2_buf.index,
-				dstbuf->queued_in_vb2);
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
-		} else if (!dstbuf->queued_in_vb2 && dstbuf->queued_in_v4l2) {
-			/*
-			 * If buffer in v4l2 driver but not in vb2 queue yet,
-			 * and we get this buffer from free_list, it means
-			 * that codec driver do not use this buffer as
-			 * reference buffer anymore. We should q buffer to vb2
-			 * queue, so later work thread could get this buffer
-			 * for decode. In this case, queued_in_vb2 = false
-			 * means this buffer is not from previous decode
-			 * output.
-			 */
-			mtk_v4l2_debug(2,
-					"[%d]status=%x queue id=%d to rdy_queue",
-					ctx->id, free_frame_buffer->status,
-					vb->vb2_buf.index);
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
-			dstbuf->queued_in_vb2 = true;
-		} else {
-			/*
-			 * Codec driver do not need to reference this capture
-			 * buffer and this buffer is not in v4l2 driver.
-			 * Then we don't need to do any thing, just add log when
-			 * we need to debug buffer flow.
-			 * When this buffer q from user space, it could
-			 * directly q to vb2 buffer
-			 */
-			mtk_v4l2_debug(3, "[%d]status=%x err queue id=%d %d %d",
-					ctx->id, free_frame_buffer->status,
-					vb->vb2_buf.index,
-					dstbuf->queued_in_vb2,
-					dstbuf->queued_in_v4l2);
-		}
-		dstbuf->used = false;
-	}
-	mutex_unlock(&ctx->lock);
-	return &vb->vb2_buf;
-}
-
-static void clean_display_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vb2_buffer *framptr;
-
-	do {
-		framptr = get_display_buffer(ctx);
-	} while (framptr);
-}
-
-static void clean_free_buffer(struct mtk_vcodec_ctx *ctx)
-{
-	struct vb2_buffer *framptr;
-
-	do {
-		framptr = get_free_buffer(ctx);
-	} while (framptr);
-}
-
-static void mtk_vdec_queue_res_chg_event(struct mtk_vcodec_ctx *ctx)
-{
-	static const struct v4l2_event ev_src_ch = {
-		.type = V4L2_EVENT_SOURCE_CHANGE,
-		.u.src_change.changes =
-		V4L2_EVENT_SRC_CH_RESOLUTION,
-	};
-
-	mtk_v4l2_debug(1, "[%d]", ctx->id);
-	v4l2_event_queue_fh(&ctx->fh, &ev_src_ch);
-}
-
-static void mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
-{
-	bool res_chg;
-	int ret = 0;
-
-	ret = vdec_if_decode(ctx, NULL, NULL, &res_chg);
-	if (ret)
-		mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
-
-	clean_display_buffer(ctx);
-	clean_free_buffer(ctx);
-}
-
-static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
-				unsigned int pixelformat)
-{
-	const struct mtk_video_fmt *fmt;
-	struct mtk_q_data *dst_q_data;
-	unsigned int k;
-
-	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
-	for (k = 0; k < NUM_FORMATS; k++) {
-		fmt = &mtk_video_formats[k];
-		if (fmt->fourcc == pixelformat) {
-			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
-				dst_q_data->fmt->fourcc, pixelformat);
-			dst_q_data->fmt = fmt;
-			return;
-		}
-	}
-
-	mtk_v4l2_err("Cannot get fourcc(%d), using init value", pixelformat);
-}
-
-static int mtk_vdec_pic_info_update(struct mtk_vcodec_ctx *ctx)
-{
-	unsigned int dpbsize = 0;
-	int ret;
-
-	if (vdec_if_get_param(ctx,
-				GET_PARAM_PIC_INFO,
-				&ctx->last_decoded_picinfo)) {
-		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
-				ctx->id);
-		return -EINVAL;
-	}
-
-	if (ctx->last_decoded_picinfo.pic_w == 0 ||
-		ctx->last_decoded_picinfo.pic_h == 0 ||
-		ctx->last_decoded_picinfo.buf_w == 0 ||
-		ctx->last_decoded_picinfo.buf_h == 0) {
-		mtk_v4l2_err("Cannot get correct pic info");
-		return -EINVAL;
-	}
-
-	if (ctx->last_decoded_picinfo.cap_fourcc != ctx->picinfo.cap_fourcc &&
-		ctx->picinfo.cap_fourcc != 0)
-		mtk_vdec_update_fmt(ctx, ctx->picinfo.cap_fourcc);
-
-	if ((ctx->last_decoded_picinfo.pic_w == ctx->picinfo.pic_w) ||
-	    (ctx->last_decoded_picinfo.pic_h == ctx->picinfo.pic_h))
-		return 0;
-
-	mtk_v4l2_debug(1,
-			"[%d]-> new(%d,%d), old(%d,%d), real(%d,%d)",
-			ctx->id, ctx->last_decoded_picinfo.pic_w,
-			ctx->last_decoded_picinfo.pic_h,
-			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
-			ctx->last_decoded_picinfo.buf_w,
-			ctx->last_decoded_picinfo.buf_h);
-
-	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
-	if (dpbsize == 0)
-		mtk_v4l2_err("Incorrect dpb size, ret=%d", ret);
-
-	ctx->dpb_size = dpbsize;
-
-	return ret;
-}
-
-static void mtk_vdec_worker(struct work_struct *work)
-{
-	struct mtk_vcodec_ctx *ctx = container_of(work, struct mtk_vcodec_ctx,
-				decode_work);
-	struct mtk_vcodec_dev *dev = ctx->dev;
-	struct vb2_v4l2_buffer *src_buf, *dst_buf;
-	struct mtk_vcodec_mem buf;
-	struct vdec_fb *pfb;
-	bool res_chg = false;
-	int ret;
-	struct mtk_video_dec_buf *dst_buf_info, *src_buf_info;
-
-	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
-	if (src_buf == NULL) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_debug(1, "[%d] src_buf empty!!", ctx->id);
-		return;
-	}
-
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
-	if (dst_buf == NULL) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
-		return;
-	}
-
-	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
-	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
-
-	pfb = &dst_buf_info->frame_buffer;
-	pfb->base_y.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
-	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
-	pfb->base_y.size = ctx->picinfo.fb_sz[0];
-
-	pfb->base_c.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 1);
-	pfb->base_c.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 1);
-	pfb->base_c.size = ctx->picinfo.fb_sz[1];
-	pfb->status = 0;
-	mtk_v4l2_debug(3, "===>[%d] vdec_if_decode() ===>", ctx->id);
-
-	mtk_v4l2_debug(3,
-			"id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx",
-			dst_buf->vb2_buf.index, pfb,
-			pfb->base_y.va, &pfb->base_y.dma_addr,
-			&pfb->base_c.dma_addr, pfb->base_y.size);
-
-	if (src_buf_info->lastframe) {
-		mtk_v4l2_debug(1, "Got empty flush input buffer.");
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-
-		/* update dst buf status */
-		dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
-		mutex_lock(&ctx->lock);
-		dst_buf_info->used = false;
-		mutex_unlock(&ctx->lock);
-
-		vdec_if_decode(ctx, NULL, NULL, &res_chg);
-		clean_display_buffer(ctx);
-		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, 0);
-		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
-			vb2_set_plane_payload(&dst_buf->vb2_buf, 1, 0);
-		dst_buf->flags |= V4L2_BUF_FLAG_LAST;
-		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
-		clean_free_buffer(ctx);
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		return;
-	}
-	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
-	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
-	if (!buf.va) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_err("[%d] id=%d src_addr is NULL!!",
-				ctx->id, src_buf->vb2_buf.index);
-		return;
-	}
-	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
-			ctx->id, buf.va, &buf.dma_addr, buf.size, src_buf);
-	dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
-	dst_buf->timecode = src_buf->timecode;
-	mutex_lock(&ctx->lock);
-	dst_buf_info->used = true;
-	mutex_unlock(&ctx->lock);
-	src_buf_info->used = true;
-
-	ret = vdec_if_decode(ctx, &buf, pfb, &res_chg);
-
-	if (ret) {
-		mtk_v4l2_err(
-			" <===[%d], src_buf[%d] sz=0x%zx pts=%llu dst_buf[%d] vdec_if_decode() ret=%d res_chg=%d===>",
-			ctx->id,
-			src_buf->vb2_buf.index,
-			buf.size,
-			src_buf->vb2_buf.timestamp,
-			dst_buf->vb2_buf.index,
-			ret, res_chg);
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		if (ret == -EIO) {
-			mutex_lock(&ctx->lock);
-			src_buf_info->error = true;
-			mutex_unlock(&ctx->lock);
-		}
-		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
-	} else if (!res_chg) {
-		/*
-		 * we only return src buffer with VB2_BUF_STATE_DONE
-		 * when decode success without resolution change
-		 */
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
-	}
-
-	dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
-	clean_display_buffer(ctx);
-	clean_free_buffer(ctx);
-
-	if (!ret && res_chg) {
-		mtk_vdec_pic_info_update(ctx);
-		/*
-		 * On encountering a resolution change in the stream.
-		 * The driver must first process and decode all
-		 * remaining buffers from before the resolution change
-		 * point, so call flush decode here
-		 */
-		mtk_vdec_flush_decoder(ctx);
-		/*
-		 * After all buffers containing decoded frames from
-		 * before the resolution change point ready to be
-		 * dequeued on the CAPTURE queue, the driver sends a
-		 * V4L2_EVENT_SOURCE_CHANGE event for source change
-		 * type V4L2_EVENT_SRC_CH_RESOLUTION
-		 */
-		mtk_vdec_queue_res_chg_event(ctx);
-	}
-	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-}
-
 static int vidioc_try_decoder_cmd(struct file *file, void *priv,
 				struct v4l2_decoder_cmd *cmd)
 {
@@ -561,10 +123,12 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 {
 	struct mtk_q_data *q_data;
 
+	ctx->dev->vdec_pdata->init_vdec_params(ctx);
+
 	ctx->m2m_ctx->q_lock = &ctx->dev->dev_mutex;
 	ctx->fh.m2m_ctx = ctx->m2m_ctx;
 	ctx->fh.ctrl_handler = &ctx->ctrl_hdl;
-	INIT_WORK(&ctx->decode_work, mtk_vdec_worker);
+	INIT_WORK(&ctx->decode_work, ctx->dev->vdec_pdata->worker);
 	ctx->colorspace = V4L2_COLORSPACE_REC709;
 	ctx->ycbcr_enc = V4L2_YCBCR_ENC_DEFAULT;
 	ctx->quantization = V4L2_QUANTIZATION_DEFAULT;
@@ -574,7 +138,7 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 	memset(q_data, 0, sizeof(struct mtk_q_data));
 	q_data->visible_width = DFT_CFG_WIDTH;
 	q_data->visible_height = DFT_CFG_HEIGHT;
-	q_data->fmt = &mtk_video_formats[OUT_FMT_IDX];
+	q_data->fmt = ctx->dev->vdec_pdata->default_out_fmt;
 	q_data->field = V4L2_FIELD_NONE;
 
 	q_data->sizeimage[0] = DFT_CFG_WIDTH * DFT_CFG_HEIGHT;
@@ -586,7 +150,7 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 	q_data->visible_height = DFT_CFG_HEIGHT;
 	q_data->coded_width = DFT_CFG_WIDTH;
 	q_data->coded_height = DFT_CFG_HEIGHT;
-	q_data->fmt = &mtk_video_formats[CAP_FMT_IDX];
+	q_data->fmt = ctx->dev->vdec_pdata->default_cap_fmt;
 	q_data->field = V4L2_FIELD_NONE;
 
 	v4l_bound_align_image(&q_data->coded_width,
@@ -722,11 +286,14 @@ static int vidioc_try_fmt_vid_cap_mplane(struct file *file, void *priv,
 				struct v4l2_format *f)
 {
 	const struct mtk_video_fmt *fmt;
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (!fmt) {
-		f->fmt.pix.pixelformat = mtk_video_formats[CAP_FMT_IDX].fourcc;
-		fmt = mtk_vdec_find_format(f);
+		f->fmt.pix.pixelformat =
+			ctx->q_data[MTK_Q_DATA_DST].fmt->fourcc;
+		fmt = mtk_vdec_find_format(f, dec_pdata);
 	}
 
 	return vidioc_try_fmt(f, fmt);
@@ -737,11 +304,14 @@ static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
 {
 	struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
 	const struct mtk_video_fmt *fmt;
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (!fmt) {
-		f->fmt.pix.pixelformat = mtk_video_formats[OUT_FMT_IDX].fourcc;
-		fmt = mtk_vdec_find_format(f);
+		f->fmt.pix.pixelformat =
+			ctx->q_data[MTK_Q_DATA_SRC].fmt->fourcc;
+		fmt = mtk_vdec_find_format(f, dec_pdata);
 	}
 
 	if (pix_fmt_mp->plane_fmt[0].sizeimage == 0) {
@@ -831,6 +401,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 	struct mtk_q_data *q_data;
 	int ret = 0;
 	const struct mtk_video_fmt *fmt;
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
 	mtk_v4l2_debug(3, "[%d]", ctx->id);
 
@@ -859,16 +430,16 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		ret = -EBUSY;
 	}
 
-	fmt = mtk_vdec_find_format(f);
+	fmt = mtk_vdec_find_format(f, dec_pdata);
 	if (fmt == NULL) {
 		if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 			f->fmt.pix.pixelformat =
-				mtk_video_formats[OUT_FMT_IDX].fourcc;
-			fmt = mtk_vdec_find_format(f);
+				dec_pdata->default_out_fmt->fourcc;
+			fmt = mtk_vdec_find_format(f, dec_pdata);
 		} else if (f->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 			f->fmt.pix.pixelformat =
-				mtk_video_formats[CAP_FMT_IDX].fourcc;
-			fmt = mtk_vdec_find_format(f);
+				dec_pdata->default_cap_fmt->fourcc;
+			fmt = mtk_vdec_find_format(f, dec_pdata);
 		}
 	}
 	if (fmt == NULL)
@@ -905,16 +476,17 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 {
 	int i = 0;
 	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 
 	if (fsize->index != 0)
 		return -EINVAL;
 
-	for (i = 0; i < NUM_SUPPORTED_FRAMESIZE; ++i) {
-		if (fsize->pixel_format != mtk_vdec_framesizes[i].fourcc)
+	for (i = 0; i < dec_pdata->num_framesizes; ++i) {
+		if (fsize->pixel_format != dec_pdata->vdec_framesizes[i].fourcc)
 			continue;
 
 		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
-		fsize->stepwise = mtk_vdec_framesizes[i].stepwise;
+		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
 		if (!(ctx->dev->dec_capability &
 				VCODEC_CAPABILITY_4K_DISABLED)) {
 			mtk_v4l2_debug(3, "4K is enabled");
@@ -937,16 +509,20 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 	return -EINVAL;
 }
 
-static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
+static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
+				bool output_queue)
 {
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
 	const struct mtk_video_fmt *fmt;
 	int i, j = 0;
 
-	for (i = 0; i < NUM_FORMATS; i++) {
-		if (output_queue && (mtk_video_formats[i].type != MTK_FMT_DEC))
+	for (i = 0; i < dec_pdata->num_formats; i++) {
+		if (output_queue &&
+			(dec_pdata->vdec_formats[i].type != MTK_FMT_DEC))
 			continue;
 		if (!output_queue &&
-			(mtk_video_formats[i].type != MTK_FMT_FRAME))
+			(dec_pdata->vdec_formats[i].type != MTK_FMT_FRAME))
 			continue;
 
 		if (j == f->index)
@@ -954,10 +530,10 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
 		++j;
 	}
 
-	if (i == NUM_FORMATS)
+	if (i == dec_pdata->num_formats)
 		return -EINVAL;
 
-	fmt = &mtk_video_formats[i];
+	fmt = &dec_pdata->vdec_formats[i];
 	f->pixelformat = fmt->fourcc;
 	f->flags = fmt->flags;
 
@@ -967,13 +543,13 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, bool output_queue)
 static int vidioc_vdec_enum_fmt_vid_cap(struct file *file, void *priv,
 					struct v4l2_fmtdesc *f)
 {
-	return vidioc_enum_fmt(f, false);
+	return vidioc_enum_fmt(f, priv, false);
 }
 
 static int vidioc_vdec_enum_fmt_vid_out(struct file *file, void *priv,
 					struct v4l2_fmtdesc *f)
 {
-	return vidioc_enum_fmt(f, true);
+	return vidioc_enum_fmt(f, priv, true);
 }
 
 static int vidioc_vdec_g_fmt(struct file *file, void *priv,
@@ -1064,7 +640,7 @@ static int vidioc_vdec_g_fmt(struct file *file, void *priv,
 	return 0;
 }
 
-static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
+int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 				unsigned int *nbuffers,
 				unsigned int *nplanes,
 				unsigned int sizes[],
@@ -1088,7 +664,7 @@ static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 		}
 	} else {
 		if (vq->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
-			*nplanes = 2;
+			*nplanes = q_data->fmt->num_planes;
 		else
 			*nplanes = 1;
 
@@ -1104,7 +680,7 @@ static int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
 	return 0;
 }
 
-static int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
+int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct mtk_q_data *q_data;
@@ -1126,128 +702,7 @@ static int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb)
 	return 0;
 }
 
-static void vb2ops_vdec_buf_queue(struct vb2_buffer *vb)
-{
-	struct vb2_v4l2_buffer *src_buf;
-	struct mtk_vcodec_mem src_mem;
-	bool res_chg = false;
-	int ret = 0;
-	unsigned int dpbsize = 1, i = 0;
-	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
-	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
-	struct mtk_video_dec_buf *buf = NULL;
-	struct mtk_q_data *dst_q_data;
-
-	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
-			ctx->id, vb->vb2_queue->type,
-			vb->index, vb);
-	/*
-	 * check if this buffer is ready to be used after decode
-	 */
-	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
-		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
-		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
-				   m2m_buf.vb);
-		mutex_lock(&ctx->lock);
-		if (!buf->used) {
-			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
-			buf->queued_in_vb2 = true;
-			buf->queued_in_v4l2 = true;
-		} else {
-			buf->queued_in_vb2 = false;
-			buf->queued_in_v4l2 = true;
-		}
-		mutex_unlock(&ctx->lock);
-		return;
-	}
-
-	v4l2_m2m_buf_queue(ctx->m2m_ctx, to_vb2_v4l2_buffer(vb));
-
-	if (ctx->state != MTK_STATE_INIT) {
-		mtk_v4l2_debug(3, "[%d] already init driver %d",
-				ctx->id, ctx->state);
-		return;
-	}
-
-	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
-	if (!src_buf) {
-		mtk_v4l2_err("No src buffer");
-		return;
-	}
-	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-	if (buf->lastframe) {
-		/* This shouldn't happen. Just in case. */
-		mtk_v4l2_err("Invalid flush buffer.");
-		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		return;
-	}
-
-	src_mem.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
-	src_mem.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-	src_mem.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
-	mtk_v4l2_debug(2,
-			"[%d] buf id=%d va=%p dma=%pad size=%zx",
-			ctx->id, src_buf->vb2_buf.index,
-			src_mem.va, &src_mem.dma_addr,
-			src_mem.size);
-
-	ret = vdec_if_decode(ctx, &src_mem, NULL, &res_chg);
-	if (ret || !res_chg) {
-		/*
-		 * fb == NULL means to parse SPS/PPS header or
-		 * resolution info in src_mem. Decode can fail
-		 * if there is no SPS header or picture info
-		 * in bs
-		 */
-
-		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-		if (ret == -EIO) {
-			mtk_v4l2_err("[%d] Unrecoverable error in vdec_if_decode.",
-					ctx->id);
-			ctx->state = MTK_STATE_ABORT;
-			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
-		} else {
-			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
-		}
-		mtk_v4l2_debug(ret ? 0 : 1,
-			       "[%d] vdec_if_decode() src_buf=%d, size=%zu, fail=%d, res_chg=%d",
-			       ctx->id, src_buf->vb2_buf.index,
-			       src_mem.size, ret, res_chg);
-		return;
-	}
-
-	if (vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo)) {
-		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
-				ctx->id);
-		return;
-	}
-
-	ctx->last_decoded_picinfo = ctx->picinfo;
-	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
-	for (i = 0; i < dst_q_data->fmt->num_planes; i++) {
-		dst_q_data->sizeimage[i] = ctx->picinfo.fb_sz[i];
-		dst_q_data->bytesperline[i] = ctx->picinfo.buf_w;
-	}
-
-	mtk_v4l2_debug(2, "[%d] vdec_if_init() OK wxh=%dx%d pic wxh=%dx%d sz[0]=0x%x sz[1]=0x%x",
-			ctx->id,
-			ctx->picinfo.buf_w, ctx->picinfo.buf_h,
-			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
-			dst_q_data->sizeimage[0],
-			dst_q_data->sizeimage[1]);
-
-	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
-	if (dpbsize == 0)
-		mtk_v4l2_err("[%d] GET_PARAM_DPB_SIZE fail=%d", ctx->id, ret);
-
-	ctx->dpb_size = dpbsize;
-	ctx->state = MTK_STATE_HEADER;
-	mtk_v4l2_debug(1, "[%d] dpbsize=%d", ctx->id, ctx->dpb_size);
-
-	mtk_vdec_queue_res_chg_event(ctx);
-}
-
-static void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
+void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vb2_v4l2;
@@ -1270,7 +725,7 @@ static void vb2ops_vdec_buf_finish(struct vb2_buffer *vb)
 	}
 }
 
-static int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
+int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 {
 	struct vb2_v4l2_buffer *vb2_v4l2 = container_of(vb,
 					struct vb2_v4l2_buffer, vb2_buf);
@@ -1287,7 +742,7 @@ static int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 	return 0;
 }
 
-static int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
+int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
 {
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(q);
 
@@ -1297,10 +752,11 @@ static int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count)
 	return 0;
 }
 
-static void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
+void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 {
 	struct vb2_v4l2_buffer *src_buf = NULL, *dst_buf = NULL;
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(q);
+	int ret;
 
 	mtk_v4l2_debug(3, "[%d] (%d) state=(%x) ctx->decoded_frame_cnt=%d",
 			ctx->id, q->type, ctx->state, ctx->decoded_frame_cnt);
@@ -1334,7 +790,9 @@ static void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 				ctx->last_decoded_picinfo.buf_w,
 				ctx->last_decoded_picinfo.buf_h);
 
-		mtk_vdec_flush_decoder(ctx);
+		ret = ctx->dev->vdec_pdata->flush_decoder(ctx);
+		if (ret)
+			mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
 	}
 	ctx->state = MTK_STATE_FLUSH;
 
@@ -1381,7 +839,7 @@ static void m2mops_vdec_job_abort(void *priv)
 	ctx->state = MTK_STATE_ABORT;
 }
 
-static int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
+int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
 {
 	struct mtk_vcodec_ctx *ctx = ctrl_to_ctx(ctrl);
 	int ret = 0;
@@ -1401,55 +859,12 @@ static int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl)
 	return ret;
 }
 
-static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
-	.g_volatile_ctrl = mtk_vdec_g_v_ctrl,
-};
-
-int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
-{
-	struct v4l2_ctrl *ctrl;
-
-	v4l2_ctrl_handler_init(&ctx->ctrl_hdl, 1);
-
-	ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
-				&mtk_vcodec_dec_ctrl_ops,
-				V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
-				0, 32, 1, 1);
-	ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
-	v4l2_ctrl_new_std_menu(&ctx->ctrl_hdl,
-				&mtk_vcodec_dec_ctrl_ops,
-				V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
-				V4L2_MPEG_VIDEO_VP9_PROFILE_0,
-				0, V4L2_MPEG_VIDEO_VP9_PROFILE_0);
-
-	if (ctx->ctrl_hdl.error) {
-		mtk_v4l2_err("Adding control failed %d",
-				ctx->ctrl_hdl.error);
-		return ctx->ctrl_hdl.error;
-	}
-
-	v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
-	return 0;
-}
-
 const struct v4l2_m2m_ops mtk_vdec_m2m_ops = {
 	.device_run	= m2mops_vdec_device_run,
 	.job_ready	= m2mops_vdec_job_ready,
 	.job_abort	= m2mops_vdec_job_abort,
 };
 
-static const struct vb2_ops mtk_vdec_vb2_ops = {
-	.queue_setup	= vb2ops_vdec_queue_setup,
-	.buf_prepare	= vb2ops_vdec_buf_prepare,
-	.buf_queue	= vb2ops_vdec_buf_queue,
-	.wait_prepare	= vb2_ops_wait_prepare,
-	.wait_finish	= vb2_ops_wait_finish,
-	.buf_init	= vb2ops_vdec_buf_init,
-	.buf_finish	= vb2ops_vdec_buf_finish,
-	.start_streaming	= vb2ops_vdec_start_streaming,
-	.stop_streaming	= vb2ops_vdec_stop_streaming,
-};
-
 const struct v4l2_ioctl_ops mtk_vdec_ioctl_ops = {
 	.vidioc_streamon	= v4l2_m2m_ioctl_streamon,
 	.vidioc_streamoff	= v4l2_m2m_ioctl_streamoff,
@@ -1496,7 +911,7 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 	src_vq->io_modes	= VB2_DMABUF | VB2_MMAP;
 	src_vq->drv_priv	= ctx;
 	src_vq->buf_struct_size = sizeof(struct mtk_video_dec_buf);
-	src_vq->ops		= &mtk_vdec_vb2_ops;
+	src_vq->ops		= ctx->dev->vdec_pdata->vdec_vb2_ops;
 	src_vq->mem_ops		= &vb2_dma_contig_memops;
 	src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	src_vq->lock		= &ctx->dev->dev_mutex;
@@ -1511,7 +926,7 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 	dst_vq->io_modes	= VB2_DMABUF | VB2_MMAP;
 	dst_vq->drv_priv	= ctx;
 	dst_vq->buf_struct_size = sizeof(struct mtk_video_dec_buf);
-	dst_vq->ops		= &mtk_vdec_vb2_ops;
+	dst_vq->ops		= ctx->dev->vdec_pdata->vdec_vb2_ops;
 	dst_vq->mem_ops		= &vb2_dma_contig_memops;
 	dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
 	dst_vq->lock		= &ctx->dev->dev_mutex;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
index cf26b6c1486a..97a8304f6600 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
@@ -16,6 +16,8 @@
 #define VCODEC_DEC_4K_CODED_HEIGHT	2304U
 #define MTK_VDEC_MAX_W	2048U
 #define MTK_VDEC_MAX_H	1088U
+#define MTK_VDEC_MIN_W	64U
+#define MTK_VDEC_MIN_H	64U
 
 #define MTK_VDEC_IRQ_STATUS_DEC_SUCCESS        0x10000
 
@@ -73,7 +75,22 @@ int mtk_vcodec_dec_queue_init(void *priv, struct vb2_queue *src_vq,
 			   struct vb2_queue *dst_vq);
 void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx);
 void mtk_vcodec_dec_release(struct mtk_vcodec_ctx *ctx);
-int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx);
+
+int mtk_vdec_g_v_ctrl(struct v4l2_ctrl *ctrl);
+
+/*
+ * VB2 ops
+ */
+int vb2ops_vdec_queue_setup(struct vb2_queue *vq,
+				unsigned int *nbuffers,
+				unsigned int *nplanes,
+				unsigned int sizes[],
+				struct device *alloc_devs[]);
+int vb2ops_vdec_buf_prepare(struct vb2_buffer *vb);
+void vb2ops_vdec_buf_finish(struct vb2_buffer *vb);
+int vb2ops_vdec_buf_init(struct vb2_buffer *vb);
+int vb2ops_vdec_start_streaming(struct vb2_queue *q, unsigned int count);
+void vb2ops_vdec_stop_streaming(struct vb2_queue *q);
 
 
 #endif /* _MTK_VCODEC_DEC_H_ */
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 147dfef1638d..533781d4680a 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -106,7 +106,7 @@ static int fops_vcodec_open(struct file *file)
 	mutex_init(&ctx->lock);
 
 	ctx->type = MTK_INST_DECODER;
-	ret = mtk_vcodec_dec_ctrls_setup(ctx);
+	ret = dev->vdec_pdata->ctrls_setup(ctx);
 	if (ret) {
 		mtk_v4l2_err("Failed to setup mt vcodec controls");
 		goto err_ctrls_setup;
@@ -222,6 +222,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	INIT_LIST_HEAD(&dev->ctx_list);
 	dev->plat_dev = pdev;
 
+	dev->vdec_pdata = of_device_get_match_data(&pdev->dev);
 	if (!of_property_read_u32(pdev->dev.of_node, "mediatek,vpu",
 				  &rproc_phandle)) {
 		fw_type = VPU;
@@ -349,8 +350,13 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	return ret;
 }
 
+extern const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata;
+
 static const struct of_device_id mtk_vcodec_match[] = {
-	{.compatible = "mediatek,mt8173-vcodec-dec",},
+	{
+		.compatible = "mediatek,mt8173-vcodec-dec",
+		.data = &mtk_vdec_8173_pdata,
+	},
 	{},
 };
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
new file mode 100644
index 000000000000..48b7524bc8fb
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -0,0 +1,634 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <media/v4l2-event.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "mtk_vcodec_drv.h"
+#include "mtk_vcodec_dec.h"
+#include "mtk_vcodec_intr.h"
+#include "mtk_vcodec_util.h"
+#include "vdec_drv_if.h"
+#include "mtk_vcodec_dec_pm.h"
+
+static const struct mtk_video_fmt mtk_video_formats[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_H264,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP8,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+		.flags = V4L2_FMT_FLAG_DYN_RESOLUTION,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_MT21C,
+		.type = MTK_FMT_FRAME,
+		.num_planes = 2,
+	},
+};
+
+#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
+#define DEFAULT_OUT_FMT_IDX	0
+#define DEFAULT_CAP_FMT_IDX	3
+
+static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
+	{
+		.fourcc	= V4L2_PIX_FMT_H264,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+	{
+		.fourcc	= V4L2_PIX_FMT_VP8,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_VP9,
+		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
+	},
+};
+
+#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
+
+/*
+ * This function tries to clean all display buffers, the buffers will return
+ * in display order.
+ * Note the buffers returned from codec driver may still be in driver's
+ * reference list.
+ */
+static struct vb2_buffer *get_display_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_fb *disp_frame_buffer = NULL;
+	struct mtk_video_dec_buf *dstbuf;
+	struct vb2_v4l2_buffer *vb;
+
+	mtk_v4l2_debug(3, "[%d]", ctx->id);
+	if (vdec_if_get_param(ctx,
+			GET_PARAM_DISP_FRAME_BUFFER,
+			&disp_frame_buffer)) {
+		mtk_v4l2_err("[%d]Cannot get param : GET_PARAM_DISP_FRAME_BUFFER",
+			ctx->id);
+		return NULL;
+	}
+
+	if (disp_frame_buffer == NULL) {
+		mtk_v4l2_debug(3, "No display frame buffer");
+		return NULL;
+	}
+
+	dstbuf = container_of(disp_frame_buffer, struct mtk_video_dec_buf,
+				frame_buffer);
+	vb = &dstbuf->m2m_buf.vb;
+	mutex_lock(&ctx->lock);
+	if (dstbuf->used) {
+		vb2_set_plane_payload(&vb->vb2_buf, 0,
+				      ctx->picinfo.fb_sz[0]);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			vb2_set_plane_payload(&vb->vb2_buf, 1,
+					      ctx->picinfo.fb_sz[1]);
+
+		mtk_v4l2_debug(2,
+				"[%d]status=%x queue id=%d to done_list %d",
+				ctx->id, disp_frame_buffer->status,
+				vb->vb2_buf.index,
+				dstbuf->queued_in_vb2);
+
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
+		ctx->decoded_frame_cnt++;
+	}
+	mutex_unlock(&ctx->lock);
+	return &vb->vb2_buf;
+}
+
+/*
+ * This function tries to clean all capture buffers that are not used as
+ * reference buffers by codec driver any more
+ * In this case, we need re-queue buffer to vb2 buffer if user space
+ * already returns this buffer to v4l2 or this buffer is just the output of
+ * previous sps/pps/resolution change decode, or do nothing if user
+ * space still owns this buffer
+ */
+static struct vb2_buffer *get_free_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct mtk_video_dec_buf *dstbuf;
+	struct vdec_fb *free_frame_buffer = NULL;
+	struct vb2_v4l2_buffer *vb;
+
+	if (vdec_if_get_param(ctx,
+				GET_PARAM_FREE_FRAME_BUFFER,
+				&free_frame_buffer)) {
+		mtk_v4l2_err("[%d] Error!! Cannot get param", ctx->id);
+		return NULL;
+	}
+	if (free_frame_buffer == NULL) {
+		mtk_v4l2_debug(3, " No free frame buffer");
+		return NULL;
+	}
+
+	mtk_v4l2_debug(3, "[%d] tmp_frame_addr = 0x%p",
+			ctx->id, free_frame_buffer);
+
+	dstbuf = container_of(free_frame_buffer, struct mtk_video_dec_buf,
+				frame_buffer);
+	vb = &dstbuf->m2m_buf.vb;
+
+	mutex_lock(&ctx->lock);
+	if (dstbuf->used) {
+		if ((dstbuf->queued_in_vb2) &&
+		    (dstbuf->queued_in_v4l2) &&
+		    (free_frame_buffer->status == FB_ST_FREE)) {
+			/*
+			 * After decode sps/pps or non-display buffer, we don't
+			 * need to return capture buffer to user space, but
+			 * just re-queue this capture buffer to vb2 queue.
+			 * This reduce overheads that dq/q unused capture
+			 * buffer. In this case, queued_in_vb2 = true.
+			 */
+			mtk_v4l2_debug(2,
+				"[%d]status=%x queue id=%d to rdy_queue %d",
+				ctx->id, free_frame_buffer->status,
+				vb->vb2_buf.index,
+				dstbuf->queued_in_vb2);
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
+		} else if (!dstbuf->queued_in_vb2 && dstbuf->queued_in_v4l2) {
+			/*
+			 * If buffer in v4l2 driver but not in vb2 queue yet,
+			 * and we get this buffer from free_list, it means
+			 * that codec driver do not use this buffer as
+			 * reference buffer anymore. We should q buffer to vb2
+			 * queue, so later work thread could get this buffer
+			 * for decode. In this case, queued_in_vb2 = false
+			 * means this buffer is not from previous decode
+			 * output.
+			 */
+			mtk_v4l2_debug(2,
+					"[%d]status=%x queue id=%d to rdy_queue",
+					ctx->id, free_frame_buffer->status,
+					vb->vb2_buf.index);
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb);
+			dstbuf->queued_in_vb2 = true;
+		} else {
+			/*
+			 * Codec driver do not need to reference this capture
+			 * buffer and this buffer is not in v4l2 driver.
+			 * Then we don't need to do any thing, just add log when
+			 * we need to debug buffer flow.
+			 * When this buffer q from user space, it could
+			 * directly q to vb2 buffer
+			 */
+			mtk_v4l2_debug(3, "[%d]status=%x err queue id=%d %d %d",
+					ctx->id, free_frame_buffer->status,
+					vb->vb2_buf.index,
+					dstbuf->queued_in_vb2,
+					dstbuf->queued_in_v4l2);
+		}
+		dstbuf->used = false;
+	}
+	mutex_unlock(&ctx->lock);
+	return &vb->vb2_buf;
+}
+
+static void clean_display_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vb2_buffer *framptr;
+
+	do {
+		framptr = get_display_buffer(ctx);
+	} while (framptr);
+}
+
+static void clean_free_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct vb2_buffer *framptr;
+
+	do {
+		framptr = get_free_buffer(ctx);
+	} while (framptr);
+}
+
+static void mtk_vdec_queue_res_chg_event(struct mtk_vcodec_ctx *ctx)
+{
+	static const struct v4l2_event ev_src_ch = {
+		.type = V4L2_EVENT_SOURCE_CHANGE,
+		.u.src_change.changes =
+		V4L2_EVENT_SRC_CH_RESOLUTION,
+	};
+
+	mtk_v4l2_debug(1, "[%d]", ctx->id);
+	v4l2_event_queue_fh(&ctx->fh, &ev_src_ch);
+}
+
+static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
+{
+	bool res_chg;
+	int ret = 0;
+
+	ret = vdec_if_decode(ctx, NULL, NULL, &res_chg);
+	if (ret)
+		mtk_v4l2_err("DecodeFinal failed, ret=%d", ret);
+
+	clean_display_buffer(ctx);
+	clean_free_buffer(ctx);
+
+	return 0;
+}
+
+static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
+				unsigned int pixelformat)
+{
+	const struct mtk_video_fmt *fmt;
+	struct mtk_q_data *dst_q_data;
+	unsigned int k;
+
+	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
+	for (k = 0; k < NUM_FORMATS; k++) {
+		fmt = &mtk_video_formats[k];
+		if (fmt->fourcc == pixelformat) {
+			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
+				dst_q_data->fmt->fourcc, pixelformat);
+			dst_q_data->fmt = fmt;
+			return;
+		}
+	}
+
+	mtk_v4l2_err("Cannot get fourcc(%d), using init value", pixelformat);
+}
+
+static int mtk_vdec_pic_info_update(struct mtk_vcodec_ctx *ctx)
+{
+	unsigned int dpbsize = 0;
+	int ret;
+
+	if (vdec_if_get_param(ctx,
+				GET_PARAM_PIC_INFO,
+				&ctx->last_decoded_picinfo)) {
+		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
+				ctx->id);
+		return -EINVAL;
+	}
+
+	if (ctx->last_decoded_picinfo.pic_w == 0 ||
+		ctx->last_decoded_picinfo.pic_h == 0 ||
+		ctx->last_decoded_picinfo.buf_w == 0 ||
+		ctx->last_decoded_picinfo.buf_h == 0) {
+		mtk_v4l2_err("Cannot get correct pic info");
+		return -EINVAL;
+	}
+
+	if (ctx->last_decoded_picinfo.cap_fourcc != ctx->picinfo.cap_fourcc &&
+		ctx->picinfo.cap_fourcc != 0)
+		mtk_vdec_update_fmt(ctx, ctx->picinfo.cap_fourcc);
+
+	if ((ctx->last_decoded_picinfo.pic_w == ctx->picinfo.pic_w) ||
+	    (ctx->last_decoded_picinfo.pic_h == ctx->picinfo.pic_h))
+		return 0;
+
+	mtk_v4l2_debug(1,
+			"[%d]-> new(%d,%d), old(%d,%d), real(%d,%d)",
+			ctx->id, ctx->last_decoded_picinfo.pic_w,
+			ctx->last_decoded_picinfo.pic_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			ctx->last_decoded_picinfo.buf_w,
+			ctx->last_decoded_picinfo.buf_h);
+
+	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
+	if (dpbsize == 0)
+		mtk_v4l2_err("Incorrect dpb size, ret=%d", ret);
+
+	ctx->dpb_size = dpbsize;
+
+	return ret;
+}
+
+static void mtk_vdec_worker(struct work_struct *work)
+{
+	struct mtk_vcodec_ctx *ctx = container_of(work, struct mtk_vcodec_ctx,
+				decode_work);
+	struct mtk_vcodec_dev *dev = ctx->dev;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct mtk_vcodec_mem buf;
+	struct vdec_fb *pfb;
+	bool res_chg = false;
+	int ret;
+	struct mtk_video_dec_buf *dst_buf_info, *src_buf_info;
+
+	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+	if (src_buf == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] src_buf empty!!", ctx->id);
+		return;
+	}
+
+	dst_buf = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
+	if (dst_buf == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
+		return;
+	}
+
+	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+
+	pfb = &dst_buf_info->frame_buffer;
+	pfb->base_y.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
+	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	pfb->base_y.size = ctx->picinfo.fb_sz[0];
+
+	pfb->base_c.va = vb2_plane_vaddr(&dst_buf->vb2_buf, 1);
+	pfb->base_c.dma_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 1);
+	pfb->base_c.size = ctx->picinfo.fb_sz[1];
+	pfb->status = 0;
+	mtk_v4l2_debug(3, "===>[%d] vdec_if_decode() ===>", ctx->id);
+
+	mtk_v4l2_debug(3,
+			"id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx",
+			dst_buf->vb2_buf.index, pfb,
+			pfb->base_y.va, &pfb->base_y.dma_addr,
+			&pfb->base_c.dma_addr, pfb->base_y.size);
+
+	if (src_buf_info->lastframe) {
+		mtk_v4l2_debug(1, "Got empty flush input buffer.");
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+
+		/* update dst buf status */
+		dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
+		mutex_lock(&ctx->lock);
+		dst_buf_info->used = false;
+		mutex_unlock(&ctx->lock);
+
+		vdec_if_decode(ctx, NULL, NULL, &res_chg);
+		clean_display_buffer(ctx);
+		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, 0);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			vb2_set_plane_payload(&dst_buf->vb2_buf, 1, 0);
+		dst_buf->flags |= V4L2_BUF_FLAG_LAST;
+		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
+		clean_free_buffer(ctx);
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		return;
+	}
+	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
+	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
+	if (!buf.va) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_err("[%d] id=%d src_addr is NULL!!",
+				ctx->id, src_buf->vb2_buf.index);
+		return;
+	}
+	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
+			ctx->id, buf.va, &buf.dma_addr, buf.size, src_buf);
+	dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
+	dst_buf->timecode = src_buf->timecode;
+	mutex_lock(&ctx->lock);
+	dst_buf_info->used = true;
+	mutex_unlock(&ctx->lock);
+	src_buf_info->used = true;
+
+	ret = vdec_if_decode(ctx, &buf, pfb, &res_chg);
+
+	if (ret) {
+		mtk_v4l2_err(
+			" <===[%d], src_buf[%d] sz=0x%zx pts=%llu dst_buf[%d] vdec_if_decode() ret=%d res_chg=%d===>",
+			ctx->id,
+			src_buf->vb2_buf.index,
+			buf.size,
+			src_buf->vb2_buf.timestamp,
+			dst_buf->vb2_buf.index,
+			ret, res_chg);
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		if (ret == -EIO) {
+			mutex_lock(&ctx->lock);
+			src_buf_info->error = true;
+			mutex_unlock(&ctx->lock);
+		}
+		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+	} else if (!res_chg) {
+		/*
+		 * we only return src buffer with VB2_BUF_STATE_DONE
+		 * when decode success without resolution change
+		 */
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+	}
+
+	dst_buf = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
+	clean_display_buffer(ctx);
+	clean_free_buffer(ctx);
+
+	if (!ret && res_chg) {
+		mtk_vdec_pic_info_update(ctx);
+		/*
+		 * On encountering a resolution change in the stream.
+		 * The driver must first process and decode all
+		 * remaining buffers from before the resolution change
+		 * point, so call flush decode here
+		 */
+		mtk_vdec_flush_decoder(ctx);
+		/*
+		 * After all buffers containing decoded frames from
+		 * before the resolution change point ready to be
+		 * dequeued on the CAPTURE queue, the driver sends a
+		 * V4L2_EVENT_SOURCE_CHANGE event for source change
+		 * type V4L2_EVENT_SRC_CH_RESOLUTION
+		 */
+		mtk_vdec_queue_res_chg_event(ctx);
+	}
+	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+}
+
+static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
+{
+	struct vb2_v4l2_buffer *src_buf;
+	struct mtk_vcodec_mem src_mem;
+	bool res_chg = false;
+	int ret = 0;
+	unsigned int dpbsize = 1, i = 0;
+	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
+	struct mtk_video_dec_buf *buf = NULL;
+	struct mtk_q_data *dst_q_data;
+
+	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
+			ctx->id, vb->vb2_queue->type,
+			vb->index, vb);
+	/*
+	 * check if this buffer is ready to be used after decode
+	 */
+	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
+		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
+		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
+				   m2m_buf.vb);
+		mutex_lock(&ctx->lock);
+		if (!buf->used) {
+			v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
+			buf->queued_in_vb2 = true;
+			buf->queued_in_v4l2 = true;
+		} else {
+			buf->queued_in_vb2 = false;
+			buf->queued_in_v4l2 = true;
+		}
+		mutex_unlock(&ctx->lock);
+		return;
+	}
+
+	v4l2_m2m_buf_queue(ctx->m2m_ctx, to_vb2_v4l2_buffer(vb));
+
+	if (ctx->state != MTK_STATE_INIT) {
+		mtk_v4l2_debug(3, "[%d] already init driver %d",
+				ctx->id, ctx->state);
+		return;
+	}
+
+	src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+	if (!src_buf) {
+		mtk_v4l2_err("No src buffer");
+		return;
+	}
+	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
+	if (buf->lastframe) {
+		/* This shouldn't happen. Just in case. */
+		mtk_v4l2_err("Invalid flush buffer.");
+		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		return;
+	}
+
+	src_mem.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
+	src_mem.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	src_mem.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
+	mtk_v4l2_debug(2,
+			"[%d] buf id=%d va=%p dma=%pad size=%zx",
+			ctx->id, src_buf->vb2_buf.index,
+			src_mem.va, &src_mem.dma_addr,
+			src_mem.size);
+
+	ret = vdec_if_decode(ctx, &src_mem, NULL, &res_chg);
+	if (ret || !res_chg) {
+		/*
+		 * fb == NULL means to parse SPS/PPS header or
+		 * resolution info in src_mem. Decode can fail
+		 * if there is no SPS header or picture info
+		 * in bs
+		 */
+
+		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		if (ret == -EIO) {
+			mtk_v4l2_err("[%d] Unrecoverable error in vdec_if_decode.",
+					ctx->id);
+			ctx->state = MTK_STATE_ABORT;
+			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+		} else {
+			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+		}
+		mtk_v4l2_debug(ret ? 0 : 1,
+			       "[%d] vdec_if_decode() src_buf=%d, size=%zu, fail=%d, res_chg=%d",
+			       ctx->id, src_buf->vb2_buf.index,
+			       src_mem.size, ret, res_chg);
+		return;
+	}
+
+	if (vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo)) {
+		mtk_v4l2_err("[%d]Error!! Cannot get param : GET_PARAM_PICTURE_INFO ERR",
+				ctx->id);
+		return;
+	}
+
+	ctx->last_decoded_picinfo = ctx->picinfo;
+	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
+	for (i = 0; i < dst_q_data->fmt->num_planes; i++) {
+		dst_q_data->sizeimage[i] = ctx->picinfo.fb_sz[i];
+		dst_q_data->bytesperline[i] = ctx->picinfo.buf_w;
+	}
+
+	mtk_v4l2_debug(2, "[%d] vdec_if_init() OK wxh=%dx%d pic wxh=%dx%d sz[0]=0x%x sz[1]=0x%x",
+			ctx->id,
+			ctx->picinfo.buf_w, ctx->picinfo.buf_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			dst_q_data->sizeimage[0],
+			dst_q_data->sizeimage[1]);
+
+	ret = vdec_if_get_param(ctx, GET_PARAM_DPB_SIZE, &dpbsize);
+	if (dpbsize == 0)
+		mtk_v4l2_err("[%d] GET_PARAM_DPB_SIZE fail=%d", ctx->id, ret);
+
+	ctx->dpb_size = dpbsize;
+	ctx->state = MTK_STATE_HEADER;
+	mtk_v4l2_debug(1, "[%d] dpbsize=%d", ctx->id, ctx->dpb_size);
+
+	mtk_vdec_queue_res_chg_event(ctx);
+}
+
+static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
+	.g_volatile_ctrl = mtk_vdec_g_v_ctrl,
+};
+
+static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
+{
+	struct v4l2_ctrl *ctrl;
+
+	v4l2_ctrl_handler_init(&ctx->ctrl_hdl, 1);
+
+	ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
+				&mtk_vcodec_dec_ctrl_ops,
+				V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
+				0, 32, 1, 1);
+	ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
+	v4l2_ctrl_new_std_menu(&ctx->ctrl_hdl,
+				&mtk_vcodec_dec_ctrl_ops,
+				V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
+				V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+				0, V4L2_MPEG_VIDEO_VP9_PROFILE_0);
+
+	if (ctx->ctrl_hdl.error) {
+		mtk_v4l2_err("Adding control failed %d",
+				ctx->ctrl_hdl.error);
+		return ctx->ctrl_hdl.error;
+	}
+
+	v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
+	return 0;
+}
+
+static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
+{
+}
+
+static struct vb2_ops mtk_vdec_frame_vb2_ops = {
+	.queue_setup	= vb2ops_vdec_queue_setup,
+	.buf_prepare	= vb2ops_vdec_buf_prepare,
+	.wait_prepare	= vb2_ops_wait_prepare,
+	.wait_finish	= vb2_ops_wait_finish,
+	.start_streaming	= vb2ops_vdec_start_streaming,
+
+	.buf_queue	= vb2ops_vdec_stateful_buf_queue,
+	.buf_init	= vb2ops_vdec_buf_init,
+	.buf_finish	= vb2ops_vdec_buf_finish,
+	.stop_streaming	= vb2ops_vdec_stop_streaming,
+};
+
+const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
+	.init_vdec_params = mtk_init_vdec_params,
+	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
+	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
+	.vdec_formats = mtk_video_formats,
+	.num_formats = NUM_FORMATS,
+	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
+	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
+	.vdec_framesizes = mtk_vdec_framesizes,
+	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.worker = mtk_vdec_worker,
+	.flush_decoder = mtk_vdec_flush_decoder,
+};
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 3dd010cba23e..9221c17a176b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -305,6 +305,45 @@ enum mtk_chip {
 	MTK_MT8183,
 };
 
+/**
+ * struct mtk_vcodec_dec_pdata - compatible data for each IC
+ * @init_vdec_params: init vdec params
+ * @ctrls_setup: init vcodec dec ctrls
+ * @worker: worker to start a decode job
+ * @flush_decoder: function that flushes the decoder
+ *
+ * @vdec_vb2_ops: struct vb2_ops
+ *
+ * @vdec_formats: supported video decoder formats
+ * @num_formats: count of video decoder formats
+ * @default_out_fmt: default output buffer format
+ * @default_cap_fmt: default capture buffer format
+ *
+ * @vdec_framesizes: supported video decoder frame sizes
+ * @num_framesizes: count of video decoder frame sizes
+ *
+ * @uses_stateless_api: whether the decoder uses the stateless API with requests
+ */
+
+struct mtk_vcodec_dec_pdata {
+	void (*init_vdec_params)(struct mtk_vcodec_ctx *ctx);
+	int (*ctrls_setup)(struct mtk_vcodec_ctx *ctx);
+	void (*worker)(struct work_struct *work);
+	int (*flush_decoder)(struct mtk_vcodec_ctx *ctx);
+
+	struct vb2_ops *vdec_vb2_ops;
+
+	const struct mtk_video_fmt *vdec_formats;
+	const int num_formats;
+	const struct mtk_video_fmt *default_out_fmt;
+	const struct mtk_video_fmt *default_cap_fmt;
+
+	const struct mtk_codec_framesizes *vdec_framesizes;
+	const int num_framesizes;
+
+	bool uses_stateless_api;
+};
+
 /**
  * struct mtk_vcodec_enc_pdata - compatible data for each IC
  *
@@ -348,6 +387,7 @@ struct mtk_vcodec_enc_pdata {
  * @curr_ctx: The context that is waiting for codec hardware
  *
  * @reg_base: Mapped address of MTK Vcodec registers.
+ * @vdec_pdata: Current arch private data.
  *
  * @fw_handler: used to communicate with the firmware.
  * @id_counter: used to identify current opened instance
@@ -382,6 +422,7 @@ struct mtk_vcodec_dev {
 	spinlock_t irqlock;
 	struct mtk_vcodec_ctx *curr_ctx;
 	void __iomem *reg_base[NUM_MAX_VCODEC_REG_BASE];
+	const struct mtk_vcodec_dec_pdata *vdec_pdata;
 	const struct mtk_vcodec_enc_pdata *venc_pdata;
 
 	struct mtk_vcodec_fw *fw_handler;
-- 
2.30.1.766.gb4fecdf3b7-goog


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 02/15] media: mtk-vcodec: vdec: handle firmware version field
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Firmware for decoders newer than MT8173 includes an ABI version number
in its initialization ack message. Add the ability to read it, and make
initialization fail if the firmware reports an ABI version that we don't
support. So that the init ack handler can report this failure, set
vpu->failure and vpu->signaled before dispatching to the per-message
handlers instead of after.

On MT8173 this ABI version field does not exist, so ignore it on this
chip. There should only be one firmware version available for it
anyway.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |  1 +
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 ++++
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |  5 +++++
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 21 +++++++++++++++++--
 4 files changed, 29 insertions(+), 2 deletions(-)
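
The version check described above can be sketched in user-space C as
follows. This is only an illustration, not the driver code: `enum chip`,
`struct init_ack`, `struct vpu_state`, `check_fw_abi()` and `handle_ack()`
are hypothetical stand-ins for the kernel types and functions touched by
this patch. It assumes, as the patch does, that ABI version 1 is the only
accepted value and that MT8173 firmware never fills in the field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's types, not the kernel definitions. */
enum chip { CHIP_MT8173, CHIP_MT8183 };

struct init_ack {
	uint32_t msg_id;
	int32_t status;
	uint64_t ap_inst_addr;
	uint32_t vpu_inst_addr;
	uint32_t vdec_abi_version;	/* not valid on MT8173 */
};

struct vpu_state {
	int failure;
	int signaled;
};

/* Returns 0 if the firmware ABI is acceptable, 1 otherwise. */
static int check_fw_abi(enum chip chip, const struct init_ack *ack)
{
	assert(ack != NULL);

	/* The version field does not exist on MT8173: never read it there. */
	if (chip == CHIP_MT8173)
		return 0;

	switch (ack->vdec_abi_version) {
	case 1:
		return 0;
	default:
		fprintf(stderr, "unhandled firmware version 0x%x\n",
			(unsigned int)ack->vdec_abi_version);
		return 1;
	}
}

/*
 * Record the firmware status first, so that the version check can
 * override it afterwards. This mirrors why the patch moves the
 * failure/signaled assignments ahead of the message dispatch.
 */
static void handle_ack(enum chip chip, const struct init_ack *ack,
		       struct vpu_state *vpu)
{
	vpu->failure = ack->status;
	vpu->signaled = 1;

	if (ack->status == 0 && check_fw_abi(chip, ack))
		vpu->failure = 1;
}
```

With this ordering, an ack that reports status 0 but carries an unknown ABI
version still ends up flagged as a failure, while the same ack on MT8173 is
accepted because the field is never inspected.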

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 48b7524bc8fb..f9db7ef19c28 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -620,6 +620,7 @@ static struct vb2_ops mtk_vdec_frame_vb2_ops = {
 };
 
 const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
+	.chip = MTK_MT8173,
 	.init_vdec_params = mtk_init_vdec_params,
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 9221c17a176b..60bc39efa20d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -322,6 +322,8 @@ enum mtk_chip {
  * @vdec_framesizes: supported video decoder frame sizes
  * @num_framesizes: count of video decoder frame sizes
  *
+ * @chip: chip this decoder is compatible with
+ *
  * @uses_stateless_api: whether the decoder uses the stateless API with requests
  */
 
@@ -341,6 +343,8 @@ struct mtk_vcodec_dec_pdata {
 	const struct mtk_codec_framesizes *vdec_framesizes;
 	const int num_framesizes;
 
+	enum mtk_chip chip;
+
 	bool uses_stateless_api;
 };
 
diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
index 47a1c1c0fd04..eb66729fda63 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
@@ -83,12 +83,17 @@ struct vdec_ap_ipi_dec_start {
  * @status	: VPU exeuction result
  * @ap_inst_addr	: AP vcodec_vpu_inst instance address
  * @vpu_inst_addr	: VPU decoder instance address
+ * @vdec_abi_version:	ABI version of the firmware. Kernel can use it to
+ *			ensure that it is compatible with the firmware.
+ *			This field is not valid for MT8173 and must not be
+ *			accessed for this chip.
  */
 struct vdec_vpu_ipi_init_ack {
 	uint32_t msg_id;
 	int32_t status;
 	uint64_t ap_inst_addr;
 	uint32_t vpu_inst_addr;
+	uint32_t vdec_abi_version;
 };
 
 #endif
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index 58b0e6fa8fd2..203089213e67 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -24,6 +24,22 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
 	vpu->inst_addr = msg->vpu_inst_addr;
 
 	mtk_vcodec_debug(vpu, "- vpu_inst_addr = 0x%x", vpu->inst_addr);
+
+	/* Firmware version field does not exist on MT8173. */
+	if (vpu->ctx->dev->vdec_pdata->chip == MTK_MT8173)
+		return;
+
+	/* Check firmware version. */
+	mtk_vcodec_debug(vpu, "firmware version 0x%x\n", msg->vdec_abi_version);
+	switch (msg->vdec_abi_version) {
+	case 1:
+		break;
+	default:
+		mtk_vcodec_err(vpu, "unhandled firmware version 0x%x\n",
+			       msg->vdec_abi_version);
+		vpu->failure = 1;
+		break;
+	}
 }
 
 /*
@@ -44,6 +60,9 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 
 	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
 
+	vpu->failure = msg->status;
+	vpu->signaled = 1;
+
 	if (msg->status == 0) {
 		switch (msg->msg_id) {
 		case VPU_IPIMSG_DEC_INIT_ACK:
@@ -63,8 +82,6 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 	}
 
 	mtk_vcodec_debug(vpu, "- id=%X", msg->msg_id);
-	vpu->failure = msg->status;
-	vpu->signaled = 1;
 }
 
 static int vcodec_vpu_send_msg(struct vdec_vpu_inst *vpu, void *msg, int len)
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 02/15] media: mtk-vcodec: vdec: handle firmware version field
@ 2021-02-26 10:01   ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Alexandre Courbot, linux-kernel, linux-mediatek, Hans Verkuil,
	Mauro Carvalho Chehab, linux-media

Firmwares for decoders newer than MT8173 will include an ABI version
number in their initialization ack message. Add the capacity to manage
it and make initialization fail if the firmware ABI is of a version that
we don't support.

For MT8173, this ABI version field does not exist ; thus ignore it on
this chip. There should only be one firmware version available for it
anyway.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |  1 +
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 ++++
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |  5 +++++
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 21 +++++++++++++++++--
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 48b7524bc8fb..f9db7ef19c28 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -620,6 +620,7 @@ static struct vb2_ops mtk_vdec_frame_vb2_ops = {
 };
 
 const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
+	.chip = MTK_MT8173,
 	.init_vdec_params = mtk_init_vdec_params,
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 9221c17a176b..60bc39efa20d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -322,6 +322,8 @@ enum mtk_chip {
  * @vdec_framesizes: supported video decoder frame sizes
  * @num_framesizes: count of video decoder frame sizes
  *
+ * @chip: chip this decoder is compatible with
+ *
  * @uses_stateless_api: whether the decoder uses the stateless API with requests
  */
 
@@ -341,6 +343,8 @@ struct mtk_vcodec_dec_pdata {
 	const struct mtk_codec_framesizes *vdec_framesizes;
 	const int num_framesizes;
 
+	enum mtk_chip chip;
+
 	bool uses_stateless_api;
 };
 
diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
index 47a1c1c0fd04..eb66729fda63 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
@@ -83,12 +83,17 @@ struct vdec_ap_ipi_dec_start {
  * @status	: VPU exeuction result
  * @ap_inst_addr	: AP vcodec_vpu_inst instance address
  * @vpu_inst_addr	: VPU decoder instance address
+ * @vdec_abi_version:	ABI version of the firmware. Kernel can use it to
+ *			ensure that it is compatible with the firmware.
+ *			This field is not valid for MT8173 and must not be
+ *			accessed for this chip.
  */
 struct vdec_vpu_ipi_init_ack {
 	uint32_t msg_id;
 	int32_t status;
 	uint64_t ap_inst_addr;
 	uint32_t vpu_inst_addr;
+	uint32_t vdec_abi_version;
 };
 
 #endif
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index 58b0e6fa8fd2..203089213e67 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -24,6 +24,22 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
 	vpu->inst_addr = msg->vpu_inst_addr;
 
 	mtk_vcodec_debug(vpu, "- vpu_inst_addr = 0x%x", vpu->inst_addr);
+
+	/* Firmware version field does not exist on MT8173. */
+	if (vpu->ctx->dev->vdec_pdata->chip == MTK_MT8173)
+		return;
+
+	/* Check firmware version. */
+	mtk_vcodec_debug(vpu, "firmware version 0x%x\n", msg->vdec_abi_version);
+	switch (msg->vdec_abi_version) {
+	case 1:
+		break;
+	default:
+		mtk_vcodec_err(vpu, "unhandled firmware version 0x%x\n",
+			       msg->vdec_abi_version);
+		vpu->failure = 1;
+		break;
+	}
 }
 
 /*
@@ -44,6 +60,9 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 
 	mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
 
+	vpu->failure = msg->status;
+	vpu->signaled = 1;
+
 	if (msg->status == 0) {
 		switch (msg->msg_id) {
 		case VPU_IPIMSG_DEC_INIT_ACK:
@@ -63,8 +82,6 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 	}
 
 	mtk_vcodec_debug(vpu, "- id=%X", msg->msg_id);
-	vpu->failure = msg->status;
-	vpu->signaled = 1;
 }
 
 static int vcodec_vpu_send_msg(struct vdec_vpu_inst *vpu, void *msg, int len)
-- 
2.30.1.766.gb4fecdf3b7-goog
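
Reader's note, not part of the patch: the version gate added to
handle_init_ack_msg() above can be sketched as a standalone function. This
is only an illustration under stated assumptions. The names demo_chip and
demo_check_fw_abi() are hypothetical and do not exist in the driver, and
the return value stands in for the driver setting vpu->failure = 1.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's enum mtk_chip. */
enum demo_chip { DEMO_MT8173, DEMO_MT8183 };

/*
 * Returns 0 when the firmware is acceptable, 1 otherwise.
 * MT8173 firmware has no version field, so the field is never read
 * for that chip and the check always passes.
 */
static inline int demo_check_fw_abi(enum demo_chip chip, uint32_t abi_version)
{
	if (chip == DEMO_MT8173)
		return 0;

	switch (abi_version) {
	case 1:
		return 0;
	default:
		/* Unknown ABI: refuse rather than guess the message layout. */
		return 1;
	}
}
```

With only this patch applied, version 2 is still rejected; the next patch
in the series adds that case.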


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 03/15] media: mtk-vcodec: support version 2 of decoder firmware ABI
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Add support for decoder firmware version 2, which makes the kernel
responsible for managing the VSI context and is used for stateless
codecs.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
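Reader's note, not part of the commit: the addressing change can be
sketched outside the kernel roughly as follows. This is a minimal
illustration, not driver code. demo_cmd, demo_inst and demo_fill_cmd() are
hypothetical names that mirror struct vdec_ap_ipi_cmd, a subset of struct
vdec_vpu_inst, and the selection logic in vcodec_send_ap_ipi().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct vdec_ap_ipi_cmd after this patch. */
struct demo_cmd {
	uint32_t msg_id;
	union {
		uint32_t vpu_inst_addr;	/* used if ABI version < 2 */
		uint32_t inst_id;	/* used if ABI version >= 2 */
	};
};

/* Hypothetical subset of struct vdec_vpu_inst. */
struct demo_inst {
	uint32_t inst_addr;
	uint32_t fw_abi_version;
	uint32_t inst_id;
};

/* Fill a command, picking the instance handle the firmware ABI expects. */
static inline void demo_fill_cmd(struct demo_cmd *msg,
				 const struct demo_inst *vpu,
				 uint32_t msg_id)
{
	memset(msg, 0, sizeof(*msg));
	msg->msg_id = msg_id;
	if (vpu->fw_abi_version < 2)
		msg->vpu_inst_addr = vpu->inst_addr;
	else
		msg->inst_id = vpu->inst_id;
}
```

Because both handles share one union slot, the message size and field
offsets stay the same for either ABI version.
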
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  | 18 +++++++++---
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 28 +++++++++++++++----
 .../media/platform/mtk-vcodec/vdec_vpu_if.h   |  5 ++++
 3 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
index eb66729fda63..a0e773ae3ab3 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
@@ -29,11 +29,15 @@ enum vdec_ipi_msgid {
 /**
  * struct vdec_ap_ipi_cmd - generic AP to VPU ipi command format
  * @msg_id	: vdec_ipi_msgid
- * @vpu_inst_addr	: VPU decoder instance address
+ * @vpu_inst_addr : VPU decoder instance address. Used if ABI version < 2.
+ * @inst_id     : instance ID. Used if the ABI version >= 2.
  */
 struct vdec_ap_ipi_cmd {
 	uint32_t msg_id;
-	uint32_t vpu_inst_addr;
+	union {
+		uint32_t vpu_inst_addr;
+		uint32_t inst_id;
+	};
 };
 
 /**
@@ -63,7 +67,8 @@ struct vdec_ap_ipi_init {
 /**
  * struct vdec_ap_ipi_dec_start - for AP_IPIMSG_DEC_START
  * @msg_id	: AP_IPIMSG_DEC_START
- * @vpu_inst_addr	: VPU decoder instance address
+ * @vpu_inst_addr : VPU decoder instance address. Used if ABI version < 2.
+ * @inst_id     : instance ID. Used if the ABI version >= 2.
  * @data	: Header info
  *	H264 decoder [0]:buf_sz [1]:nal_start
  *	VP8 decoder  [0]:width/height
@@ -72,7 +77,10 @@ struct vdec_ap_ipi_init {
  */
 struct vdec_ap_ipi_dec_start {
 	uint32_t msg_id;
-	uint32_t vpu_inst_addr;
+	union {
+		uint32_t vpu_inst_addr;
+		uint32_t inst_id;
+	};
 	uint32_t data[3];
 	uint32_t reserved;
 };
@@ -87,6 +95,7 @@ struct vdec_ap_ipi_dec_start {
  *			ensure that it is compatible with the firmware.
  *			This field is not valid for MT8173 and must not be
  *			accessed for this chip.
+ * @inst_id     : instance ID. Valid only if the ABI version >= 2.
  */
 struct vdec_vpu_ipi_init_ack {
 	uint32_t msg_id;
@@ -94,6 +103,7 @@ struct vdec_vpu_ipi_init_ack {
 	uint64_t ap_inst_addr;
 	uint32_t vpu_inst_addr;
 	uint32_t vdec_abi_version;
+	uint32_t inst_id;
 };
 
 #endif
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index 203089213e67..5dffc459a33d 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -25,18 +25,30 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
 
 	mtk_vcodec_debug(vpu, "- vpu_inst_addr = 0x%x", vpu->inst_addr);
 
+	/* Set default ABI version if dealing with unversioned firmware. */
+	vpu->fw_abi_version = 0;
+	/*
+	 * Instance ID is only used if ABI version >= 2. Initialize it with
+	 * garbage by default.
+	 */
+	vpu->inst_id = 0xdeadbeef;
+
 	/* Firmware version field does not exist on MT8173. */
 	if (vpu->ctx->dev->vdec_pdata->chip == MTK_MT8173)
 		return;
 
 	/* Check firmware version. */
-	mtk_vcodec_debug(vpu, "firmware version 0x%x\n", msg->vdec_abi_version);
-	switch (msg->vdec_abi_version) {
+	vpu->fw_abi_version = msg->vdec_abi_version;
+	mtk_vcodec_debug(vpu, "firmware version 0x%x\n", vpu->fw_abi_version);
+	switch (vpu->fw_abi_version) {
 	case 1:
 		break;
+	case 2:
+		vpu->inst_id = msg->inst_id;
+		break;
 	default:
 		mtk_vcodec_err(vpu, "unhandled firmware version 0x%x\n",
-			       msg->vdec_abi_version);
+			       vpu->fw_abi_version);
 		vpu->failure = 1;
 		break;
 	}
@@ -113,7 +125,10 @@ static int vcodec_send_ap_ipi(struct vdec_vpu_inst *vpu, unsigned int msg_id)
 
 	memset(&msg, 0, sizeof(msg));
 	msg.msg_id = msg_id;
-	msg.vpu_inst_addr = vpu->inst_addr;
+	if (vpu->fw_abi_version < 2)
+		msg.vpu_inst_addr = vpu->inst_addr;
+	else
+		msg.inst_id = vpu->inst_id;
 
 	err = vcodec_vpu_send_msg(vpu, &msg, sizeof(msg));
 	mtk_vcodec_debug(vpu, "- id=%X ret=%d", msg_id, err);
@@ -163,7 +178,10 @@ int vpu_dec_start(struct vdec_vpu_inst *vpu, uint32_t *data, unsigned int len)
 
 	memset(&msg, 0, sizeof(msg));
 	msg.msg_id = AP_IPIMSG_DEC_START;
-	msg.vpu_inst_addr = vpu->inst_addr;
+	if (vpu->fw_abi_version < 2)
+		msg.vpu_inst_addr = vpu->inst_addr;
+	else
+		msg.inst_id = vpu->inst_id;
 
 	for (i = 0; i < len; i++)
 		msg.data[i] = data[i];
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
index 85224eb7e34b..c2ed5b6cab8b 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
@@ -18,6 +18,9 @@ struct mtk_vcodec_ctx;
  *                for control and info share
  * @failure     : VPU execution result status, 0: success, others: fail
  * @inst_addr	: VPU decoder instance address
+ * @fw_abi_version : ABI version of the firmware.
+ * @inst_id	: if fw_abi_version >= 2, contains the instance ID to be given
+ *                in place of inst_addr in messages.
  * @signaled    : 1 - Host has received ack message from VPU, 0 - not received
  * @ctx         : context for v4l2 layer integration
  * @dev		: platform device of VPU
@@ -29,6 +32,8 @@ struct vdec_vpu_inst {
 	void *vsi;
 	int32_t failure;
 	uint32_t inst_addr;
+	uint32_t fw_abi_version;
+	uint32_t inst_id;
 	unsigned int signaled;
 	struct mtk_vcodec_ctx *ctx;
 	wait_queue_head_t wq;
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 04/15] media: add Mediatek's MM21 format
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Add Mediatek's non-compressed 8-bit block video mode. This format is
produced by the MT8183 codec and can be converted to a non-proprietary
format by the MDP3 component.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
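Reader's note, not part of the commit: a small standalone sketch of how
the 'MM21' code is packed. DEMO_FOURCC uses the same byte packing as the
v4l2_fourcc() macro in include/uapi/linux/videodev2.h; the DEMO_* names
and demo_fourcc_to_str() are hypothetical helpers for illustration only.

```c
#include <stdint.h>

/* Same packing as v4l2_fourcc(): first character in the lowest byte. */
#define DEMO_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define DEMO_PIX_FMT_MT21C DEMO_FOURCC('M', 'T', '2', '1')
#define DEMO_PIX_FMT_MM21  DEMO_FOURCC('M', 'M', '2', '1')

/* Unpack a fourcc into a NUL-terminated four-character string. */
static inline void demo_fourcc_to_str(uint32_t fourcc, char out[5])
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((fourcc >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

The two Mediatek codes differ only in their second character, so the
packed values differ only in the second-lowest byte.
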
 Documentation/userspace-api/media/v4l/pixfmt-reserved.rst | 7 +++++++
 drivers/media/v4l2-core/v4l2-ioctl.c                      | 1 +
 include/uapi/linux/videodev2.h                            | 1 +
 3 files changed, 9 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst b/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
index c9231e18859b..187ea89f7a25 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
@@ -242,6 +242,13 @@ please make a proposal on the linux-media mailing list.
 	It is an opaque intermediate format and the MDP hardware must be
 	used to convert ``V4L2_PIX_FMT_MT21C`` to ``V4L2_PIX_FMT_NV12M``,
 	``V4L2_PIX_FMT_YUV420M`` or ``V4L2_PIX_FMT_YVU420``.
+    * .. _V4L2-PIX-FMT-MM21:
+
+      - ``V4L2_PIX_FMT_MM21``
+      - 'MM21'
+      - Non-compressed, tiled two-planar format used by Mediatek MT8183.
+	This is an opaque intermediate format and the MDP3 hardware can be
+	used to convert it to other formats.
     * .. _V4L2-PIX-FMT-SUNXI-TILED-NV12:
 
       - ``V4L2_PIX_FMT_SUNXI_TILED_NV12``
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index 31d1342e61e8..0b85b2bbc628 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1384,6 +1384,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
 	case V4L2_PIX_FMT_TM6000:	descr = "A/V + VBI Mux Packet"; break;
 	case V4L2_PIX_FMT_CIT_YYVYUY:	descr = "GSPCA CIT YYVYUY"; break;
 	case V4L2_PIX_FMT_KONICA420:	descr = "GSPCA KONICA420"; break;
+	case V4L2_PIX_FMT_MM21:		descr = "Mediatek 8-bit block format"; break;
 	case V4L2_PIX_FMT_HSV24:	descr = "24-bit HSV 8-8-8"; break;
 	case V4L2_PIX_FMT_HSV32:	descr = "32-bit XHSV 8-8-8-8"; break;
 	case V4L2_SDR_FMT_CU8:		descr = "Complex U8"; break;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 79dbde3bcf8d..e6890dae76ec 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -731,6 +731,7 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_Y12I     v4l2_fourcc('Y', '1', '2', 'I') /* Greyscale 12-bit L/R interleaved */
 #define V4L2_PIX_FMT_Z16      v4l2_fourcc('Z', '1', '6', ' ') /* Depth data 16-bit */
 #define V4L2_PIX_FMT_MT21C    v4l2_fourcc('M', 'T', '2', '1') /* Mediatek compressed block mode  */
+#define V4L2_PIX_FMT_MM21     v4l2_fourcc('M', 'M', '2', '1') /* Mediatek 8-bit block mode, two non-contiguous planes */
 #define V4L2_PIX_FMT_INZI     v4l2_fourcc('I', 'N', 'Z', 'I') /* Intel Planar Greyscale 10-bit and Depth 16-bit */
 #define V4L2_PIX_FMT_SUNXI_TILED_NV12 v4l2_fourcc('S', 'T', '1', '2') /* Sunxi Tiled NV12 Format */
 #define V4L2_PIX_FMT_CNF4     v4l2_fourcc('C', 'N', 'F', '4') /* Intel 4-bit packed depth confidence information */
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

From: Yunfei Dong <yunfei.dong@mediatek.com>

Support the stateless codec API that will be used by MT8183.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
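Reader's note, not part of the commit: the CAPTURE plane sizing done in
vidioc_vdec_s_fmt() for the stateless case can be sketched standalone as
below. This is an illustration under assumptions. demo_picinfo, demo_q_data
and demo_set_capture_sizes() are hypothetical names covering only the
fields used here, and zeroing the unused second plane is a choice made by
this sketch, not something the driver does.

```c
#include <stdint.h>

/* Hypothetical subset of the driver's picture info. */
struct demo_picinfo {
	uint32_t buf_w;		/* aligned buffer width */
	uint32_t fb_sz[2];	/* [0]: luma size, [1]: chroma size */
};

/* Hypothetical subset of the per-queue data. */
struct demo_q_data {
	uint32_t sizeimage[2];
	uint32_t bytesperline[2];
};

/*
 * One plane: luma and chroma share plane 0, so their sizes are summed.
 * Two planes: each component gets its own plane and its own size.
 */
static inline void demo_set_capture_sizes(struct demo_q_data *q,
					  const struct demo_picinfo *pic,
					  unsigned int num_planes)
{
	q->bytesperline[0] = pic->buf_w;
	if (num_planes == 1) {
		q->sizeimage[0] = pic->fb_sz[0] + pic->fb_sz[1];
		q->sizeimage[1] = 0;
		q->bytesperline[1] = 0;
	} else {
		q->sizeimage[0] = pic->fb_sz[0];
		q->sizeimage[1] = pic->fb_sz[1];
		q->bytesperline[1] = pic->buf_w;
	}
}
```

MM21 is declared with two planes in this patch, so only the second branch
is exercised by the format table added here.
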
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
 .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
 5 files changed, 503 insertions(+), 3 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 9c3cbb5b800e..4ba93d838ab6 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -12,6 +12,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec_vpu_if.o \
 		mtk_vcodec_dec.o \
 		mtk_vcodec_dec_stateful.o \
+		mtk_vcodec_dec_stateless.o \
 		mtk_vcodec_dec_pm.o \
 
 mtk-vcodec-enc-y := venc/venc_vp8_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 4a91d294002b..c286cc0f239f 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -46,6 +46,13 @@ static struct mtk_q_data *mtk_vdec_get_q_data(struct mtk_vcodec_ctx *ctx,
 static int vidioc_try_decoder_cmd(struct file *file, void *priv,
 				struct v4l2_decoder_cmd *cmd)
 {
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+
+	/* Use M2M stateless helper if relevant */
+	if (ctx->dev->vdec_pdata->uses_stateless_api)
+		return v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv,
+								cmd);
+
 	switch (cmd->cmd) {
 	case V4L2_DEC_CMD_STOP:
 	case V4L2_DEC_CMD_START:
@@ -72,6 +79,10 @@ static int vidioc_decoder_cmd(struct file *file, void *priv,
 	if (ret)
 		return ret;
 
+	/* Use M2M stateless helper if relevant */
+	if (ctx->dev->vdec_pdata->uses_stateless_api)
+		return v4l2_m2m_ioctl_stateless_decoder_cmd(file, priv, cmd);
+
 	mtk_v4l2_debug(1, "decoder cmd=%u", cmd->cmd);
 	dst_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
 				V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
@@ -414,7 +425,8 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 	 * Setting OUTPUT format after OUTPUT buffers are allocated is invalid
 	 * if using the stateful API.
 	 */
-	if ((f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) &&
+	if (!dec_pdata->uses_stateless_api &&
+	    (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) &&
 	    vb2_is_busy(&ctx->m2m_ctx->out_q_ctx.q)) {
 		mtk_v4l2_err("out_q_ctx buffers already requested");
 		ret = -EBUSY;
@@ -457,6 +469,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		ctx->quantization = pix_mp->quantization;
 		ctx->xfer_func = pix_mp->xfer_func;
 
+		ctx->current_codec = fmt->fourcc;
 		if (ctx->state == MTK_STATE_FREE) {
 			ret = vdec_if_init(ctx, q_data->fmt->fourcc);
 			if (ret) {
@@ -468,6 +481,49 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		}
 	}
 
+	/*
+	 * If using the stateless API, S_FMT should have the effect of setting
+	 * the CAPTURE queue resolution no matter which queue it was called on.
+	 */
+	if (dec_pdata->uses_stateless_api) {
+		ctx->picinfo.pic_w = pix_mp->width;
+		ctx->picinfo.pic_h = pix_mp->height;
+
+		ret = vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo);
+		if (ret) {
+			mtk_v4l2_err("[%d] Error: GET_PARAM_PIC_INFO failed",
+				ctx->id);
+			return -EINVAL;
+		}
+
+		ctx->last_decoded_picinfo = ctx->picinfo;
+
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1) {
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[0] =
+				ctx->picinfo.fb_sz[0] +
+				ctx->picinfo.fb_sz[1];
+			ctx->q_data[MTK_Q_DATA_DST].bytesperline[0] =
+				ctx->picinfo.buf_w;
+		} else {
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[0] =
+				ctx->picinfo.fb_sz[0];
+			ctx->q_data[MTK_Q_DATA_DST].bytesperline[0] =
+				ctx->picinfo.buf_w;
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[1] =
+				ctx->picinfo.fb_sz[1];
+			ctx->q_data[MTK_Q_DATA_DST].bytesperline[1] =
+				ctx->picinfo.buf_w;
+		}
+
+		ctx->q_data[MTK_Q_DATA_DST].coded_width = ctx->picinfo.buf_w;
+		ctx->q_data[MTK_Q_DATA_DST].coded_height = ctx->picinfo.buf_h;
+		mtk_v4l2_debug(2, "[%d] vdec_if_init() num_plane = %d wxh=%dx%d pic wxh=%dx%d sz[0]=0x%x sz[1]=0x%x",
+			ctx->id, pix_mp->num_planes,
+			ctx->picinfo.buf_w, ctx->picinfo.buf_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[0],
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[1]);
+	}
 	return 0;
 }
 
@@ -765,9 +821,15 @@ void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 		while ((src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx))) {
 			struct mtk_video_dec_buf *buf_info = container_of(
 				 src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-			if (!buf_info->lastframe)
+			if (!buf_info->lastframe) {
+				struct media_request *req =
+					src_buf->vb2_buf.req_obj.req;
 				v4l2_m2m_buf_done(src_buf,
 						VB2_BUF_STATE_ERROR);
+				if (req)
+					v4l2_ctrl_request_complete(req,
+								&ctx->ctrl_hdl);
+			}
 		}
 		return;
 	}
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
index 97a8304f6600..a2949e1bc7fe 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
@@ -45,6 +45,7 @@ struct vdec_fb {
  * @lastframe:		Intput buffer is last buffer - EOS
  * @error:		An unrecoverable error occurs on this buffer.
  * @frame_buffer:	Decode status, and buffer information of Capture buffer
+ * @bs_buffer:	Output buffer info
  *
  * Note : These status information help us track and debug buffer state
  */
@@ -55,12 +56,18 @@ struct mtk_video_dec_buf {
 	bool	queued_in_vb2;
 	bool	queued_in_v4l2;
 	bool	lastframe;
+
 	bool	error;
-	struct vdec_fb	frame_buffer;
+
+	union {
+		struct vdec_fb	frame_buffer;
+		struct mtk_vcodec_mem	bs_buffer;
+	};
 };
 
 extern const struct v4l2_ioctl_ops mtk_vdec_ioctl_ops;
 extern const struct v4l2_m2m_ops mtk_vdec_m2m_ops;
+extern const struct media_device_ops mtk_vcodec_media_ops;
 
 
 /*
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
new file mode 100644
index 000000000000..e2e08f54109b
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -0,0 +1,427 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "media/videobuf2-v4l2.h"
+#include <media/videobuf2-dma-contig.h>
+#include <media/v4l2-event.h>
+#include <media/v4l2-mem2mem.h>
+#include <linux/module.h>
+
+#include "mtk_vcodec_drv.h"
+#include "mtk_vcodec_dec.h"
+#include "mtk_vcodec_intr.h"
+#include "mtk_vcodec_util.h"
+#include "vdec_drv_if.h"
+#include "mtk_vcodec_dec_pm.h"
+
+/**
+ * struct mtk_stateless_control - CID control type
+ * @cfg: Control configuration
+ * @codec_type: codec type (V4L2 pixel format) for CID control type
+ * @needed_in_request: whether the control must be present with each request
+ */
+struct mtk_stateless_control {
+	struct v4l2_ctrl_config cfg;
+	int codec_type;
+	bool needed_in_request;
+};
+
+static const struct mtk_stateless_control mtk_stateless_controls[] = {
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_H264_SPS,
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+		.needed_in_request = true,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_H264_PPS,
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+		.needed_in_request = true,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+		.needed_in_request = true,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+		.needed_in_request = true,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
+			.def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
+			.max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
+			.menu_skip_mask =
+				BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
+				BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_H264_DECODE_MODE,
+			.min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
+			.def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
+			.max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
+		},
+		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+	},
+};
+#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
+
+static const struct mtk_video_fmt mtk_video_formats[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_H264_SLICE,
+		.type = MTK_FMT_DEC,
+		.num_planes = 1,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_MM21,
+		.type = MTK_FMT_FRAME,
+		.num_planes = 2,
+	},
+};
+#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
+#define DEFAULT_OUT_FMT_IDX    0
+#define DEFAULT_CAP_FMT_IDX    1
+
+static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
+	{
+		.fourcc	= V4L2_PIX_FMT_H264_SLICE,
+		.stepwise = {
+			MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
+			MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
+		},
+	},
+};
+
+#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
+
+static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
+					       struct vdec_fb *fb)
+{
+	struct mtk_video_dec_buf *vdec_frame_buf =
+		container_of(fb, struct mtk_video_dec_buf, frame_buffer);
+	struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
+	unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
+
+	vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
+	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
+		unsigned int cap_c_size =
+			ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
+
+		vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
+	}
+}
+
+static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
+					   struct vb2_v4l2_buffer *vb2_v4l2)
+{
+	struct mtk_video_dec_buf *framebuf =
+		container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
+	struct vdec_fb *pfb = &framebuf->frame_buffer;
+	struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
+
+	pfb = &framebuf->frame_buffer;
+	pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
+	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
+	pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
+
+	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
+		pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
+		pfb->base_c.dma_addr =
+			vb2_dma_contig_plane_dma_addr(dst_buf, 1);
+		pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
+	}
+	mtk_v4l2_debug(1,
+		"id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
+		dst_buf->index, pfb,
+		pfb->base_y.va, &pfb->base_y.dma_addr,
+		&pfb->base_c.dma_addr, pfb->base_y.size,
+		ctx->decoded_frame_cnt);
+
+	return pfb;
+}
+
+static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
+{
+	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+
+	v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
+}
+
+static int fops_media_request_validate(struct media_request *mreq)
+{
+	const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
+	struct mtk_vcodec_ctx *ctx = NULL;
+	struct media_request_object *req_obj;
+	struct v4l2_ctrl_handler *parent_hdl, *hdl;
+	struct v4l2_ctrl *ctrl;
+	unsigned int i;
+
+	switch (buffer_cnt) {
+	case 1:
+		/* We expect exactly one buffer with the request */
+		break;
+	case 0:
+		mtk_v4l2_err("No buffer provided with the request");
+		return -ENOENT;
+	default:
+		mtk_v4l2_err("Too many buffers (%d) provided with the request",
+			     buffer_cnt);
+		return -EINVAL;
+	}
+
+	list_for_each_entry(req_obj, &mreq->objects, list) {
+		struct vb2_buffer *vb;
+
+		if (vb2_request_object_is_buffer(req_obj)) {
+			vb = container_of(req_obj, struct vb2_buffer, req_obj);
+			ctx = vb2_get_drv_priv(vb->vb2_queue);
+			break;
+		}
+	}
+
+	if (!ctx) {
+		mtk_v4l2_err("Cannot find buffer for request");
+		return -ENOENT;
+	}
+
+	parent_hdl = &ctx->ctrl_hdl;
+
+	hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
+	if (!hdl) {
+		mtk_v4l2_err("Cannot find control handler for request");
+		return -ENOENT;
+	}
+
+	for (i = 0; i < NUM_CTRLS; i++) {
+		if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
+			continue;
+		if (!mtk_stateless_controls[i].needed_in_request)
+			continue;
+
+		ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
+					  mtk_stateless_controls[i].cfg.id);
+		if (!ctrl) {
+			mtk_v4l2_err("Missing required codec control");
+			return -ENOENT;
+		}
+	}
+
+	v4l2_ctrl_request_hdl_put(hdl);
+
+	return vb2_request_validate(mreq);
+}
+
+static void mtk_vdec_worker(struct work_struct *work)
+{
+	struct mtk_vcodec_ctx *ctx =
+		container_of(work, struct mtk_vcodec_ctx, decode_work);
+	struct mtk_vcodec_dev *dev = ctx->dev;
+	struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
+	struct vb2_buffer *vb2_src;
+	struct mtk_vcodec_mem *bs_src;
+	struct mtk_video_dec_buf *dec_buf_src;
+	struct media_request *src_buf_req;
+	struct vdec_fb *dst_buf;
+	bool res_chg = false;
+	int ret;
+
+	vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+	if (vb2_v4l2_src == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
+		return;
+	}
+
+	vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
+	if (vb2_v4l2_dst == NULL) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
+		return;
+	}
+
+	vb2_src = &vb2_v4l2_src->vb2_buf;
+	dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
+				   m2m_buf.vb);
+	bs_src = &dec_buf_src->bs_buffer;
+
+	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
+			ctx->id, vb2_src->vb2_queue->type,
+			vb2_src->index, vb2_src, dec_buf_src);
+
+	bs_src->va = vb2_plane_vaddr(vb2_src, 0);
+	bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
+	bs_src->size = (size_t)vb2_src->planes[0].bytesused;
+	if (!bs_src->va) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
+			     vb2_src->index);
+		return;
+	}
+
+	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
+			ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size, vb2_src);
+	/* Apply request controls. */
+	src_buf_req = vb2_src->req_obj.req;
+	if (src_buf_req)
+		v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
+	else
+		mtk_v4l2_err("vb2 buffer media request is NULL");
+
+	dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
+	v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
+	ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
+	if (ret) {
+		mtk_v4l2_err(
+			" <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
+			ctx->id, vb2_src->index, bs_src->size,
+			vb2_src->timestamp, ret, res_chg);
+		if (ret == -EIO) {
+			mutex_lock(&ctx->lock);
+			dec_buf_src->error = true;
+			mutex_unlock(&ctx->lock);
+		}
+	}
+
+	mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
+
+	v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
+		ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
+
+	v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
+}
+
+static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
+{
+	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+	struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
+
+	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
+			ctx->id, vb->vb2_queue->type,
+			vb->index, vb);
+
+	mutex_lock(&ctx->lock);
+	v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
+	mutex_unlock(&ctx->lock);
+	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
+		return;
+
+	mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
+		vb->vb2_queue->type, vb->index, vb2_v4l2);
+
+	/* If an OUTPUT buffer, we may need to update the state */
+	if (ctx->state == MTK_STATE_INIT) {
+		ctx->state = MTK_STATE_HEADER;
+		mtk_v4l2_debug(1, "Init driver from init to header.");
+	} else {
+		mtk_v4l2_debug(3, "[%d] already init driver %d",
+				ctx->id, ctx->state);
+	}
+}
+
+static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
+{
+	bool res_chg;
+
+	return vdec_if_decode(ctx, NULL, NULL, &res_chg);
+}
+
+static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
+	.g_volatile_ctrl = mtk_vdec_g_v_ctrl,
+};
+
+static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
+{
+	struct v4l2_ctrl *ctrl;
+	unsigned int i;
+
+	v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
+	if (ctx->ctrl_hdl.error) {
+		mtk_v4l2_err("v4l2_ctrl_handler_init failed");
+		return ctx->ctrl_hdl.error;
+	}
+
+	ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
+				&mtk_vcodec_dec_ctrl_ops,
+				V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
+				0, 32, 1, 1);
+	ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
+
+	for (i = 0; i < NUM_CTRLS; i++) {
+		struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
+
+		v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
+		if (ctx->ctrl_hdl.error) {
+			mtk_v4l2_err("Adding control %d failed %d",
+					i, ctx->ctrl_hdl.error);
+			return ctx->ctrl_hdl.error;
+		}
+	}
+
+	v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
+
+	return 0;
+}
+
+const struct media_device_ops mtk_vcodec_media_ops = {
+	.req_validate	= fops_media_request_validate,
+	.req_queue	= v4l2_m2m_request_queue,
+};
+
+static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
+{
+	struct vb2_queue *src_vq;
+
+	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
+				 V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+
+	/* Support request api for output plane */
+	src_vq->supports_requests = true;
+	src_vq->requires_requests = true;
+}
+
+static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
+{
+	return 0;
+}
+
+static struct vb2_ops mtk_vdec_request_vb2_ops = {
+	.queue_setup	= vb2ops_vdec_queue_setup,
+	.buf_prepare	= vb2ops_vdec_buf_prepare,
+	.wait_prepare	= vb2_ops_wait_prepare,
+	.wait_finish	= vb2_ops_wait_finish,
+	.start_streaming	= vb2ops_vdec_start_streaming,
+
+	.buf_queue	= vb2ops_vdec_stateless_buf_queue,
+	.buf_out_validate = vb2ops_vdec_out_buf_validate,
+	.buf_init	= vb2ops_vdec_buf_init,
+	.buf_finish	= vb2ops_vdec_buf_finish,
+	.stop_streaming	= vb2ops_vdec_stop_streaming,
+	.buf_request_complete = vb2ops_vdec_buf_request_complete,
+};
+
+const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
+	.chip = MTK_MT8183,
+	.init_vdec_params = mtk_init_vdec_params,
+	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
+	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
+	.vdec_formats = mtk_video_formats,
+	.num_formats = NUM_FORMATS,
+	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
+	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
+	.vdec_framesizes = mtk_vdec_framesizes,
+	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.uses_stateless_api = true,
+	.worker = mtk_vdec_worker,
+	.flush_decoder = mtk_vdec_flush_decoder,
+};
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 60bc39efa20d..3b884a321883 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -251,6 +251,7 @@ struct vdec_pic_info {
  * @encode_work: worker for the encoding
  * @last_decoded_picinfo: pic information get from latest decode
  * @empty_flush_buf: a fake size-0 capture buffer that indicates flush
+ * @current_codec: currently set input codec, as a V4L2 pixel format
  *
  * @colorspace: enum v4l2_colorspace; supplemental to pixelformat
  * @ycbcr_enc: enum v4l2_ycbcr_encoding, Y'CbCr encoding
@@ -290,6 +291,8 @@ struct mtk_vcodec_ctx {
 	struct vdec_pic_info last_decoded_picinfo;
 	struct mtk_video_dec_buf *empty_flush_buf;
 
+	u32 current_codec;
+
 	enum v4l2_colorspace colorspace;
 	enum v4l2_ycbcr_encoding ycbcr_enc;
 	enum v4l2_quantization quantization;
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

From: Yunfei Dong <yunfei.dong@mediatek.com>

Add support for H.264 decoding using the stateless API, as supported by
MT8183. This support takes advantage of the V4L2 H.264 reference list
builders.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/Kconfig                |   1 +
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
 5 files changed, 813 insertions(+)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index fd1831e97b22..c27db5643712 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
 	select V4L2_MEM2MEM_DEV
 	select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
 	select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
+	select V4L2_H264
 	help
 	  Mediatek video codec driver provides HW capability to
 	  encode and decode in a range of video formats on MT8173
diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 4ba93d838ab6..ca8e9e7a9c4e 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
 mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp8_if.o \
 		vdec/vdec_vp9_if.o \
+		vdec/vdec_h264_req_if.o \
 		mtk_vcodec_dec_drv.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
new file mode 100644
index 000000000000..2fbbfbbcfbec
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
@@ -0,0 +1,807 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/v4l2-h264.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "../vdec_drv_if.h"
+#include "../mtk_vcodec_util.h"
+#include "../mtk_vcodec_dec.h"
+#include "../mtk_vcodec_intr.h"
+#include "../vdec_vpu_if.h"
+#include "../vdec_drv_base.h"
+
+#define NAL_NON_IDR_SLICE			0x01
+#define NAL_IDR_SLICE				0x05
+#define NAL_H264_PPS				0x08
+#define NAL_TYPE(value)				((value) & 0x1F)
+
+#define BUF_PREDICTION_SZ			(64 * 4096)
+#define MB_UNIT_LEN				16
+
+/* get used parameters for sps/pps */
+#define GET_MTK_VDEC_FLAG(cond, flag) \
+	{ dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
+#define GET_MTK_VDEC_PARAM(param) \
+	{ dst_param->param = src_param->param; }
+/* motion vector size (bytes) for every macro block */
+#define HW_MB_STORE_SZ				64
+
+#define H264_MAX_FB_NUM				17
+#define H264_MAX_MV_NUM				32
+#define HDR_PARSING_BUF_SZ			1024
+
+/**
+ * struct mtk_h264_dpb_info  - h264 dpb information
+ * @y_dma_addr: Y bitstream physical address
+ * @c_dma_addr: CbCr bitstream physical address
+ * @reference_flag: reference picture flag (short/long term reference picture)
+ * @field: field picture flag
+ */
+struct mtk_h264_dpb_info {
+	dma_addr_t y_dma_addr;
+	dma_addr_t c_dma_addr;
+	int reference_flag;
+	int field;
+};
+
+/**
+ * struct mtk_h264_sps_param  - parameters for sps
+ */
+struct mtk_h264_sps_param {
+	unsigned char chroma_format_idc;
+	unsigned char bit_depth_luma_minus8;
+	unsigned char bit_depth_chroma_minus8;
+	unsigned char log2_max_frame_num_minus4;
+	unsigned char pic_order_cnt_type;
+	unsigned char log2_max_pic_order_cnt_lsb_minus4;
+	unsigned char max_num_ref_frames;
+	unsigned char separate_colour_plane_flag;
+	unsigned short pic_width_in_mbs_minus1;
+	unsigned short pic_height_in_map_units_minus1;
+	unsigned int max_frame_nums;
+	unsigned char qpprime_y_zero_transform_bypass_flag;
+	unsigned char delta_pic_order_always_zero_flag;
+	unsigned char frame_mbs_only_flag;
+	unsigned char mb_adaptive_frame_field_flag;
+	unsigned char direct_8x8_inference_flag;
+	unsigned char reserved[3];
+};
+
+/**
+ * struct mtk_h264_pps_param  - parameters for pps
+ */
+struct mtk_h264_pps_param {
+	unsigned char num_ref_idx_l0_default_active_minus1;
+	unsigned char num_ref_idx_l1_default_active_minus1;
+	unsigned char weighted_bipred_idc;
+	char pic_init_qp_minus26;
+	char chroma_qp_index_offset;
+	char second_chroma_qp_index_offset;
+	unsigned char entropy_coding_mode_flag;
+	unsigned char pic_order_present_flag;
+	unsigned char deblocking_filter_control_present_flag;
+	unsigned char constrained_intra_pred_flag;
+	unsigned char weighted_pred_flag;
+	unsigned char redundant_pic_cnt_present_flag;
+	unsigned char transform_8x8_mode_flag;
+	unsigned char scaling_matrix_present_flag;
+	unsigned char reserved[2];
+};
+
+struct slice_api_h264_scaling_matrix {
+	unsigned char scaling_list_4x4[6][16];
+	unsigned char scaling_list_8x8[6][64];
+};
+
+struct slice_h264_dpb_entry {
+	unsigned long long reference_ts;
+	unsigned short frame_num;
+	unsigned short pic_num;
+	/* Note that field is indicated by v4l2_buffer.field */
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
+};
+
+/**
+ * struct slice_api_h264_decode_param - parameters for decode.
+ */
+struct slice_api_h264_decode_param {
+	struct slice_h264_dpb_entry dpb[16];
+	unsigned short num_slices;
+	unsigned short nal_ref_idc;
+	unsigned char ref_pic_list_p0[32];
+	unsigned char ref_pic_list_b0[32];
+	unsigned char ref_pic_list_b1[32];
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
+};
+
+/**
+ * struct mtk_h264_dec_slice_param  - parameters for decode current frame
+ */
+struct mtk_h264_dec_slice_param {
+	struct mtk_h264_sps_param			sps;
+	struct mtk_h264_pps_param			pps;
+	struct slice_api_h264_scaling_matrix		scaling_matrix;
+	struct slice_api_h264_decode_param		decode_params;
+	struct mtk_h264_dpb_info h264_dpb_info[16];
+};
+
+/**
+ * struct h264_fb - h264 decode frame buffer information
+ * @vdec_fb_va  : virtual address of struct vdec_fb
+ * @y_fb_dma    : dma address of Y frame buffer (luma)
+ * @c_fb_dma    : dma address of C frame buffer (chroma)
+ * @poc         : picture order count of frame buffer
+ * @reserved    : for 8 bytes alignment
+ */
+struct h264_fb {
+	uint64_t vdec_fb_va;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+	int32_t poc;
+	uint32_t reserved;
+};
+
+/**
+ * struct vdec_h264_dec_info - decode information
+ * @dpb_sz		: decoding picture buffer size
+ * @resolution_changed  : set when a resolution change has happened
+ * @realloc_mv_buf	: flag to notify driver to re-allocate mv buffer
+ * @cap_num_planes	: number of planes of the capture buffer
+ * @bs_dma		: Input bit-stream buffer dma address
+ * @y_fb_dma		: Y frame buffer dma address
+ * @c_fb_dma		: C frame buffer dma address
+ * @vdec_fb_va		: VDEC frame buffer struct virtual address
+ */
+struct vdec_h264_dec_info {
+	uint32_t dpb_sz;
+	uint32_t resolution_changed;
+	uint32_t realloc_mv_buf;
+	uint32_t cap_num_planes;
+	uint64_t bs_dma;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+	uint64_t vdec_fb_va;
+};
+
+/**
+ * struct vdec_h264_vsi - shared memory for decode information exchange
+ *                        between VPU and Host.
+ *                        The memory is allocated by VPU, then mapped to Host
+ *                        in vpu_dec_init() and freed in vpu_dec_deinit()
+ *                        by VPU.
+ *                        AP-W/R : AP is writer/reader on this item
+ *                        VPU-W/R: VPU is writer/reader on this item
+ * @pred_buf_dma : HW working prediction buffer dma address (AP-W, VPU-R)
+ * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
+ * @dec          : decode information (AP-R, VPU-W)
+ * @pic          : picture information (AP-R, VPU-W)
+ * @crop         : crop information (AP-R, VPU-W)
+ */
+struct vdec_h264_vsi {
+	uint64_t pred_buf_dma;
+	uint64_t mv_buf_dma[H264_MAX_MV_NUM];
+	struct vdec_h264_dec_info dec;
+	struct vdec_pic_info pic;
+	struct v4l2_rect crop;
+	struct mtk_h264_dec_slice_param h264_slice_params;
+};
+
+/**
+ * struct vdec_h264_slice_inst - h264 decoder instance
+ * @num_nalu : number of NALUs decoded so far
+ * @ctx      : pointer to mtk_vcodec_ctx
+ * @pred_buf : HW working prediction buffer
+ * @mv_buf   : HW working motion vector buffer
+ * @vpu      : VPU instance
+ * @vsi_ctx  : Local VSI data for this decoding context
+ */
+struct vdec_h264_slice_inst {
+	unsigned int num_nalu;
+	struct mtk_vcodec_ctx *ctx;
+	struct mtk_vcodec_mem pred_buf;
+	struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
+	struct vdec_vpu_inst vpu;
+	struct vdec_h264_vsi vsi_ctx;
+	struct mtk_h264_dec_slice_param h264_slice_param;
+
+	struct v4l2_h264_dpb_entry dpb[16];
+};
+
+static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
+				 int id)
+{
+	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
+
+	return ctrl->p_cur.p;
+}
+
+static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
+			      struct mtk_h264_dec_slice_param *slice_param)
+{
+	struct vb2_queue *vq;
+	struct vb2_buffer *vb;
+	struct vb2_v4l2_buffer *vb2_v4l2;
+	u64 index;
+
+	vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
+		V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+
+	for (index = 0; index < 16; index++) {
+		const struct slice_h264_dpb_entry *dpb;
+		int vb2_index;
+
+		dpb = &slice_param->decode_params.dpb[index];
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
+			slice_param->h264_dpb_info[index].reference_flag = 0;
+			continue;
+		}
+
+		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
+		if (vb2_index < 0) {
+			mtk_vcodec_err(inst, "Reference invalid: dpb_index(%llu) reference_ts(%llu)",
+				index, dpb->reference_ts);
+			continue;
+		}
+		/* 1 for short term reference, 2 for long term reference */
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			slice_param->h264_dpb_info[index].reference_flag = 1;
+		else
+			slice_param->h264_dpb_info[index].reference_flag = 2;
+
+		vb = vq->bufs[vb2_index];
+		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
+		slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
+
+		slice_param->h264_dpb_info[index].y_dma_addr =
+			vb2_dma_contig_plane_dma_addr(vb, 0);
+		if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
+			slice_param->h264_dpb_info[index].c_dma_addr =
+				vb2_dma_contig_plane_dma_addr(vb, 1);
+		}
+	}
+}
+
+static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
+	const struct v4l2_ctrl_h264_sps *src_param)
+{
+	GET_MTK_VDEC_PARAM(chroma_format_idc);
+	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
+	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
+	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
+	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
+	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
+	GET_MTK_VDEC_PARAM(max_num_ref_frames);
+	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
+	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
+
+	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
+		V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
+	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
+		V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
+	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
+		V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
+	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
+		V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
+	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
+		V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
+	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
+		V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
+}
+
+static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
+	const struct v4l2_ctrl_h264_pps *src_param)
+{
+	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
+	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
+	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
+	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
+	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
+	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
+
+	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
+		V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
+	GET_MTK_VDEC_FLAG(pic_order_present_flag,
+		V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
+	GET_MTK_VDEC_FLAG(weighted_pred_flag,
+		V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
+	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
+		V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
+	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
+		V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
+	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
+		V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
+	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
+		V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
+	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
+		V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
+}
+
+static void
+get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
+			const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
+{
+	memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
+	       sizeof(dst_matrix->scaling_list_4x4));
+
+	memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
+	       sizeof(dst_matrix->scaling_list_8x8));
+}
+
+static void get_h264_decode_parameters(
+	struct slice_api_h264_decode_param *dst_params,
+	const struct v4l2_ctrl_h264_decode_params *src_params,
+	const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
+		struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
+		const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
+
+		dst_entry->reference_ts = src_entry->reference_ts;
+		dst_entry->frame_num = src_entry->frame_num;
+		dst_entry->pic_num = src_entry->pic_num;
+		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
+		dst_entry->bottom_field_order_cnt =
+			src_entry->bottom_field_order_cnt;
+		dst_entry->flags = src_entry->flags;
+	}
+
+	// num_slices is a leftover from the old H.264 support and is ignored
+	// by the firmware.
+	dst_params->num_slices = 0;
+	dst_params->nal_ref_idc = src_params->nal_ref_idc;
+	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
+	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
+	dst_params->flags = src_params->flags;
+}
+
+static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
+			    const struct v4l2_h264_dpb_entry *b)
+{
+	return a->top_field_order_cnt == b->top_field_order_cnt &&
+	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+}
+
+/*
+ * Move DPB entries of dec_param that refer to a frame already existing in dpb
+ * into the already existing slot in dpb, and move other entries into new slots.
+ *
+ * This function is an adaptation of the similarly-named function in
+ * hantro_h264.c.
+ */
+static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
+		       struct v4l2_h264_dpb_entry *dpb)
+{
+	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	unsigned int i, j;
+
+	/* Disable all entries by default, and mark the ones in use. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+			set_bit(i, in_use);
+		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+	}
+
+	/* Try to match new DPB entries with existing ones by their POCs. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		/*
+		 * To cut off some comparisons, iterate only on target DPB
+		 * entries that were already used.
+		 */
+		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
+			struct v4l2_h264_dpb_entry *cdpb;
+
+			cdpb = &dpb[j];
+			if (!dpb_entry_match(cdpb, ndpb))
+				continue;
+
+			*cdpb = *ndpb;
+			set_bit(j, used);
+			/* Don't reiterate on this one. */
+			clear_bit(j, in_use);
+			break;
+		}
+
+		if (j == ARRAY_SIZE(dec_param->dpb))
+			set_bit(i, new);
+	}
+
+	/* For entries that could not be matched, use remaining free slots. */
+	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+		struct v4l2_h264_dpb_entry *cdpb;
+
+		/*
+		 * Both arrays are of the same sizes, so there is no way
+		 * we can end up with no space in target array, unless
+		 * something is buggy.
+		 */
+		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
+		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
+			return;
+
+		cdpb = &dpb[j];
+		*cdpb = *ndpb;
+		set_bit(j, used);
+	}
+}
+
+/*
+ * The firmware expects unused reflist entries to have the value 0x20.
+ */
+static void fixup_ref_list(u8 *ref_list, size_t num_valid)
+{
+	memset(&ref_list[num_valid], 0x20, 32 - num_valid);
+}
+
+static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_params =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
+	const struct v4l2_ctrl_h264_sps *sps =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
+	const struct v4l2_ctrl_h264_pps *pps =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
+	struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
+	struct v4l2_h264_reflist_builder reflist_builder;
+	enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
+	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
+	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
+	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
+	int i;
+
+	update_dpb(dec_params, inst->dpb);
+
+	get_h264_sps_parameters(&slice_param->sps, sps);
+	get_h264_pps_parameters(&slice_param->pps, pps);
+	get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
+	get_h264_decode_parameters(&slice_param->decode_params, dec_params,
+				   inst->dpb);
+	get_h264_dpb_list(inst, slice_param);
+
+	/* Prepare the fields for our reference lists */
+	for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
+		dpb_fields[i] = slice_param->h264_dpb_info[i].field;
+	/* Build the reference lists */
+	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
+				       inst->dpb);
+	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
+	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
+	/* Adapt the built lists to the firmware's expectations */
+	fixup_ref_list(p0_reflist, reflist_builder.num_valid);
+	fixup_ref_list(b0_reflist, reflist_builder.num_valid);
+	fixup_ref_list(b1_reflist, reflist_builder.num_valid);
+
+	memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
+	       sizeof(inst->vsi_ctx.h264_slice_params));
+}
+
+static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
+{
+	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
+
+	return HW_MB_STORE_SZ * unit_size;
+}
+
+static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
+{
+	int err = 0;
+
+	inst->pred_buf.size = BUF_PREDICTION_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
+	if (err) {
+		mtk_vcodec_err(inst, "failed to allocate ppl buf");
+		return err;
+	}
+
+	inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
+	return 0;
+}
+
+static void free_predication_buf(struct vdec_h264_slice_inst *inst)
+{
+	struct mtk_vcodec_mem *mem = NULL;
+
+	mtk_vcodec_debug_enter(inst);
+
+	inst->vsi_ctx.pred_buf_dma = 0;
+	mem = &inst->pred_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+}
+
+static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
+	struct vdec_pic_info *pic)
+{
+	int i;
+	int err;
+	struct mtk_vcodec_mem *mem = NULL;
+	unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
+
+	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+		mem->size = buf_sz;
+		err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+		if (err) {
+			mtk_vcodec_err(inst, "failed to allocate mv buf");
+			return err;
+		}
+		inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
+	}
+
+	return 0;
+}
+
+static void free_mv_buf(struct vdec_h264_slice_inst *inst)
+{
+	int i;
+	struct mtk_vcodec_mem *mem = NULL;
+
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		inst->vsi_ctx.mv_buf_dma[i] = 0;
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+	}
+}
+
+static void get_pic_info(struct vdec_h264_slice_inst *inst,
+			 struct vdec_pic_info *pic)
+{
+	struct mtk_vcodec_ctx *ctx = inst->ctx;
+
+	ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
+	ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
+	ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
+	ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
+	inst->vsi_ctx.dec.cap_num_planes =
+		ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
+
+	*pic = ctx->picinfo;
+	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
+			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
+	mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
+		ctx->picinfo.fb_sz[1]);
+
+	if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
+		(ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
+		inst->vsi_ctx.dec.resolution_changed = true;
+		if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
+			(ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
+			inst->vsi_ctx.dec.realloc_mv_buf = true;
+
+		mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
+			inst->vsi_ctx.dec.resolution_changed,
+			inst->vsi_ctx.dec.realloc_mv_buf,
+			ctx->last_decoded_picinfo.pic_w,
+			ctx->last_decoded_picinfo.pic_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h);
+	}
+}
+
+static void get_crop_info(struct vdec_h264_slice_inst *inst,
+	struct v4l2_rect *cr)
+{
+	cr->left = inst->vsi_ctx.crop.left;
+	cr->top = inst->vsi_ctx.crop.top;
+	cr->width = inst->vsi_ctx.crop.width;
+	cr->height = inst->vsi_ctx.crop.height;
+
+	mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
+			 cr->left, cr->top, cr->width, cr->height);
+}
+
+static void get_dpb_size(struct vdec_h264_slice_inst *inst,
+	unsigned int *dpb_sz)
+{
+	*dpb_sz = inst->vsi_ctx.dec.dpb_sz;
+	mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
+}
+
+static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_h264_slice_inst *inst = NULL;
+	int err;
+
+	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+
+	inst->ctx = ctx;
+
+	inst->vpu.id = SCP_IPI_VDEC_H264;
+	inst->vpu.ctx = ctx;
+
+	err = vpu_dec_init(&inst->vpu);
+	if (err) {
+		mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
+		goto error_free_inst;
+	}
+
+	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
+	inst->vsi_ctx.dec.resolution_changed = true;
+	inst->vsi_ctx.dec.realloc_mv_buf = true;
+
+	err = allocate_predication_buf(inst);
+	if (err)
+		goto error_deinit;
+
+	mtk_vcodec_debug(inst, "struct size = %zu,%zu,%zu,%zu\n",
+		sizeof(struct mtk_h264_sps_param),
+		sizeof(struct mtk_h264_pps_param),
+		sizeof(struct mtk_h264_dec_slice_param),
+		sizeof(struct mtk_h264_dpb_info));
+
+	mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
+
+	ctx->drv_handle = inst;
+	return 0;
+
+error_deinit:
+	vpu_dec_deinit(&inst->vpu);
+
+error_free_inst:
+	kfree(inst);
+	return err;
+}
+
+static void vdec_h264_slice_deinit(void *h_vdec)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+
+	mtk_vcodec_debug_enter(inst);
+
+	vpu_dec_deinit(&inst->vpu);
+	free_predication_buf(inst);
+	free_mv_buf(inst);
+
+	kfree(inst);
+}
+
+static int find_start_code(unsigned char *data, unsigned int data_sz)
+{
+	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
+		return 3;
+
+	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
+	    data[3] == 1)
+		return 4;
+
+	return -1;
+}
+
+static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
+				  struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+	struct vdec_vpu_inst *vpu = &inst->vpu;
+	struct mtk_video_dec_buf *src_buf_info;
+	int nal_start_idx = 0, err = 0;
+	uint32_t nal_type, data[2];
+	unsigned char *buf;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+
+	/* bs NULL means flush decoder */
+	if (bs == NULL)
+		return vpu_dec_reset(vpu);
+
+	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+
+	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
+	c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
+
+	mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
+			 ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
+
+	buf = (unsigned char *)bs->va;
+	nal_start_idx = find_start_code(buf, bs->size);
+	if (nal_start_idx < 0)
+		return -EINVAL;
+
+	data[0] = bs->size;
+	data[1] = buf[nal_start_idx];
+	nal_type = NAL_TYPE(buf[nal_start_idx]);
+	mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
+			 nal_type);
+
+	inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
+	inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
+	inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
+	inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
+
+	get_vdec_decode_parameters(inst);
+	*res_chg = inst->vsi_ctx.dec.resolution_changed;
+	if (*res_chg) {
+		mtk_vcodec_debug(inst, "- resolution changed -");
+		if (inst->vsi_ctx.dec.realloc_mv_buf) {
+			err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
+			inst->vsi_ctx.dec.realloc_mv_buf = false;
+			if (err)
+				goto err_free_fb_out;
+		}
+		*res_chg = false;
+	}
+
+	memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
+	err = vpu_dec_start(vpu, data, 2);
+	if (err)
+		goto err_free_fb_out;
+
+	if (nal_type == NAL_NON_IDR_SLICE || nal_type == NAL_IDR_SLICE) {
+		/* wait decoder done interrupt */
+		err = mtk_vcodec_wait_for_done_ctx(inst->ctx,
+						   MTK_INST_IRQ_RECEIVED,
+						   WAIT_INTR_TIMEOUT_MS);
+		if (err)
+			goto err_free_fb_out;
+
+		vpu_dec_end(vpu);
+	}
+
+	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
+	mtk_vcodec_debug(inst, "\n - NALU[%d] type=%d -\n", inst->num_nalu,
+			 nal_type);
+	return 0;
+
+err_free_fb_out:
+	mtk_vcodec_err(inst, "\n - NALU[%d] err=%d -\n", inst->num_nalu, err);
+	return err;
+}
+
+static int vdec_h264_slice_get_param(void *h_vdec,
+			       enum vdec_get_param_type type, void *out)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+
+	switch (type) {
+	case GET_PARAM_PIC_INFO:
+		get_pic_info(inst, out);
+		break;
+
+	case GET_PARAM_DPB_SIZE:
+		get_dpb_size(inst, out);
+		break;
+
+	case GET_PARAM_CROP_INFO:
+		get_crop_info(inst, out);
+		break;
+
+	default:
+		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct vdec_common_if vdec_h264_slice_if = {
+	.init		= vdec_h264_slice_init,
+	.decode		= vdec_h264_slice_decode,
+	.get_param	= vdec_h264_slice_get_param,
+	.deinit		= vdec_h264_slice_deinit,
+};
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index b18743b906ea..42008243ceac 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -19,6 +19,9 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 	int ret = 0;
 
 	switch (fourcc) {
+	case V4L2_PIX_FMT_H264_SLICE:
+		ctx->dec_if = &vdec_h264_slice_if;
+		break;
 	case V4L2_PIX_FMT_H264:
 		ctx->dec_if = &vdec_h264_if;
 		break;
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index 270d8dc9984b..961b2b6072b5 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -55,6 +55,7 @@ struct vdec_fb_node {
 };
 
 extern const struct vdec_common_if vdec_h264_if;
+extern const struct vdec_common_if vdec_h264_slice_if;
 extern const struct vdec_common_if vdec_vp8_if;
 extern const struct vdec_common_if vdec_vp9_if;
 
-- 
2.30.1.766.gb4fecdf3b7-goog
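
As a worked example of the buffer arithmetic in this patch (get_pic_info(),
get_mv_buf_size() and fixup_ref_list()), the formulas can be tried in user
space. This is only a sketch: align_up(), mv_buf_size(), pad_ref_list(),
REF_LIST_LEN and REF_UNUSED are hypothetical names chosen for illustration,
while the constants are copied from the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MB_UNIT_LEN    16   /* macroblock edge, in pixels */
#define HW_MB_STORE_SZ 64   /* motion vector bytes per macroblock */
#define REF_LIST_LEN   32   /* entries in each firmware reference list */
#define REF_UNUSED     0x20 /* value expected in unused reflist entries */

/* Round v up to a multiple of a, where a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Same formula as get_mv_buf_size(): per-macroblock storage plus slack. */
static uint32_t mv_buf_size(uint32_t buf_w, uint32_t buf_h)
{
	uint32_t units = (buf_w / MB_UNIT_LEN) * (buf_h / MB_UNIT_LEN) + 8;

	return HW_MB_STORE_SZ * units;
}

/* Same idea as fixup_ref_list(): pad the tail of a list with REF_UNUSED. */
static void pad_ref_list(uint8_t *list, size_t num_valid)
{
	assert(num_valid <= REF_LIST_LEN);
	memset(&list[num_valid], REF_UNUSED, REF_LIST_LEN - num_valid);
}
```

For a 1920x1080 picture, the width is aligned to 16 and stays 1920, the
height is aligned to 32 and becomes 1088, the luma plane is 1920 * 1088 =
2088960 bytes, and each motion vector buffer is 64 * (120 * 68 + 8) = 522752
bytes.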


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-02-26 10:01   ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Alexandre Courbot, linux-kernel, linux-mediatek, Hans Verkuil,
	Mauro Carvalho Chehab, linux-media

From: Yunfei Dong <yunfei.dong@mediatek.com>

Add support for H.264 decoding using the stateless API, as supported by
MT8183. This support takes advantage of the V4L2 H.264 reference list
builders.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/Kconfig                |   1 +
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
 5 files changed, 813 insertions(+)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index fd1831e97b22..c27db5643712 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
 	select V4L2_MEM2MEM_DEV
 	select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
 	select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
+	select V4L2_H264
 	help
 	  Mediatek video codec driver provides HW capability to
 	  encode and decode in a range of video formats on MT8173
diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 4ba93d838ab6..ca8e9e7a9c4e 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
 mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp8_if.o \
 		vdec/vdec_vp9_if.o \
+		vdec/vdec_h264_req_if.o \
 		mtk_vcodec_dec_drv.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
new file mode 100644
index 000000000000..2fbbfbbcfbec
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
@@ -0,0 +1,807 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/v4l2-h264.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "../vdec_drv_if.h"
+#include "../mtk_vcodec_util.h"
+#include "../mtk_vcodec_dec.h"
+#include "../mtk_vcodec_intr.h"
+#include "../vdec_vpu_if.h"
+#include "../vdec_drv_base.h"
+
+#define NAL_NON_IDR_SLICE			0x01
+#define NAL_IDR_SLICE				0x05
+#define NAL_H264_PPS				0x08
+#define NAL_TYPE(value)				((value) & 0x1F)
+
+#define BUF_PREDICTION_SZ			(64 * 4096)
+#define MB_UNIT_LEN				16
+
+/* get used parameters for sps/pps */
+#define GET_MTK_VDEC_FLAG(cond, flag) \
+	{ dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
+#define GET_MTK_VDEC_PARAM(param) \
+	{ dst_param->param = src_param->param; }
+/* motion vector size (bytes) for every macro block */
+#define HW_MB_STORE_SZ				64
+
+#define H264_MAX_FB_NUM				17
+#define H264_MAX_MV_NUM				32
+#define HDR_PARSING_BUF_SZ			1024
+
+/**
+ * struct mtk_h264_dpb_info  - h264 dpb information
+ * @y_dma_addr: Y bitstream physical address
+ * @c_dma_addr: CbCr bitstream physical address
+ * @reference_flag: reference picture flag (short/long term reference picture)
+ * @field: field picture flag
+ */
+struct mtk_h264_dpb_info {
+	dma_addr_t y_dma_addr;
+	dma_addr_t c_dma_addr;
+	int reference_flag;
+	int field;
+};
+
+/**
+ * struct mtk_h264_sps_param  - parameters for sps
+ */
+struct mtk_h264_sps_param {
+	unsigned char chroma_format_idc;
+	unsigned char bit_depth_luma_minus8;
+	unsigned char bit_depth_chroma_minus8;
+	unsigned char log2_max_frame_num_minus4;
+	unsigned char pic_order_cnt_type;
+	unsigned char log2_max_pic_order_cnt_lsb_minus4;
+	unsigned char max_num_ref_frames;
+	unsigned char separate_colour_plane_flag;
+	unsigned short pic_width_in_mbs_minus1;
+	unsigned short pic_height_in_map_units_minus1;
+	unsigned int max_frame_nums;
+	unsigned char qpprime_y_zero_transform_bypass_flag;
+	unsigned char delta_pic_order_always_zero_flag;
+	unsigned char frame_mbs_only_flag;
+	unsigned char mb_adaptive_frame_field_flag;
+	unsigned char direct_8x8_inference_flag;
+	unsigned char reserved[3];
+};
+
+/**
+ * struct mtk_h264_pps_param  - parameters for pps
+ */
+struct mtk_h264_pps_param {
+	unsigned char num_ref_idx_l0_default_active_minus1;
+	unsigned char num_ref_idx_l1_default_active_minus1;
+	unsigned char weighted_bipred_idc;
+	char pic_init_qp_minus26;
+	char chroma_qp_index_offset;
+	char second_chroma_qp_index_offset;
+	unsigned char entropy_coding_mode_flag;
+	unsigned char pic_order_present_flag;
+	unsigned char deblocking_filter_control_present_flag;
+	unsigned char constrained_intra_pred_flag;
+	unsigned char weighted_pred_flag;
+	unsigned char redundant_pic_cnt_present_flag;
+	unsigned char transform_8x8_mode_flag;
+	unsigned char scaling_matrix_present_flag;
+	unsigned char reserved[2];
+};
+
+struct slice_api_h264_scaling_matrix {
+	unsigned char scaling_list_4x4[6][16];
+	unsigned char scaling_list_8x8[6][64];
+};
+
+struct slice_h264_dpb_entry {
+	unsigned long long reference_ts;
+	unsigned short frame_num;
+	unsigned short pic_num;
+	/* Note that field is indicated by v4l2_buffer.field */
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
+};
+
+/**
+ * struct slice_api_h264_decode_param - parameters for decode.
+ */
+struct slice_api_h264_decode_param {
+	struct slice_h264_dpb_entry dpb[16];
+	unsigned short num_slices;
+	unsigned short nal_ref_idc;
+	unsigned char ref_pic_list_p0[32];
+	unsigned char ref_pic_list_b0[32];
+	unsigned char ref_pic_list_b1[32];
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
+};
+
+/**
+ * struct mtk_h264_dec_slice_param  - parameters for decode current frame
+ */
+struct mtk_h264_dec_slice_param {
+	struct mtk_h264_sps_param			sps;
+	struct mtk_h264_pps_param			pps;
+	struct slice_api_h264_scaling_matrix		scaling_matrix;
+	struct slice_api_h264_decode_param		decode_params;
+	struct mtk_h264_dpb_info h264_dpb_info[16];
+};
+
+/**
+ * struct h264_fb - h264 decode frame buffer information
+ * @vdec_fb_va  : virtual address of struct vdec_fb
+ * @y_fb_dma    : dma address of Y frame buffer (luma)
+ * @c_fb_dma    : dma address of C frame buffer (chroma)
+ * @poc         : picture order count of frame buffer
+ * @reserved    : for 8 bytes alignment
+ */
+struct h264_fb {
+	uint64_t vdec_fb_va;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+	int32_t poc;
+	uint32_t reserved;
+};
+
+/**
+ * struct vdec_h264_dec_info - decode information
+ * @dpb_sz		: decoding picture buffer size
+ * @resolution_changed  : resolution change happened
+ * @realloc_mv_buf	: flag to notify driver to re-allocate mv buffer
+ * @cap_num_planes	: number planes of capture buffer
+ * @bs_dma		: Input bit-stream buffer dma address
+ * @y_fb_dma		: Y frame buffer dma address
+ * @c_fb_dma		: C frame buffer dma address
+ * @vdec_fb_va		: VDEC frame buffer struct virtual address
+ */
+struct vdec_h264_dec_info {
+	uint32_t dpb_sz;
+	uint32_t resolution_changed;
+	uint32_t realloc_mv_buf;
+	uint32_t cap_num_planes;
+	uint64_t bs_dma;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+	uint64_t vdec_fb_va;
+};
+
+/**
+ * struct vdec_h264_vsi - shared memory for decode information exchange
+ *                        between VPU and Host.
+ *                        The memory is allocated by VPU then mapped to Host
+ *                        in vpu_dec_init() and freed in vpu_dec_deinit()
+ *                        by VPU.
+ *                        AP-W/R : AP is writer/reader on this item
+ *                        VPU-W/R: VPU is writer/reader on this item
+ * @pred_buf_dma : HW working prediction buffer dma address (AP-W, VPU-R)
+ * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
+ * @dec          : decode information (AP-R, VPU-W)
+ * @pic          : picture information (AP-R, VPU-W)
+ * @crop         : crop information (AP-R, VPU-W)
+ */
+struct vdec_h264_vsi {
+	uint64_t pred_buf_dma;
+	uint64_t mv_buf_dma[H264_MAX_MV_NUM];
+	struct vdec_h264_dec_info dec;
+	struct vdec_pic_info pic;
+	struct v4l2_rect crop;
+	struct mtk_h264_dec_slice_param h264_slice_params;
+};
+
+/**
+ * struct vdec_h264_slice_inst - h264 decoder instance
+ * @num_nalu : number of NALUs decoded so far
+ * @ctx      : pointer to mtk_vcodec_ctx
+ * @pred_buf : HW working prediction buffer
+ * @mv_buf   : HW working motion vector buffer
+ * @vpu      : VPU instance
+ * @vsi_ctx  : Local VSI data for this decoding context
+ */
+struct vdec_h264_slice_inst {
+	unsigned int num_nalu;
+	struct mtk_vcodec_ctx *ctx;
+	struct mtk_vcodec_mem pred_buf;
+	struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
+	struct vdec_vpu_inst vpu;
+	struct vdec_h264_vsi vsi_ctx;
+	struct mtk_h264_dec_slice_param h264_slice_param;
+
+	struct v4l2_h264_dpb_entry dpb[16];
+};
+
+static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
+				 int id)
+{
+	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
+
+	return ctrl->p_cur.p;
+}
+
+static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
+			      struct mtk_h264_dec_slice_param *slice_param)
+{
+	struct vb2_queue *vq;
+	struct vb2_buffer *vb;
+	struct vb2_v4l2_buffer *vb2_v4l2;
+	u64 index;
+
+	vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
+		V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+
+	for (index = 0; index < 16; index++) {
+		const struct slice_h264_dpb_entry *dpb;
+		int vb2_index;
+
+		dpb = &slice_param->decode_params.dpb[index];
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
+			slice_param->h264_dpb_info[index].reference_flag = 0;
+			continue;
+		}
+
+		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
+		if (vb2_index < 0) {
+			mtk_vcodec_err(inst, "Reference invalid: dpb_index(%llu) reference_ts(%llu)",
+				index, dpb->reference_ts);
+			continue;
+		}
+		/* 1 for short term reference, 2 for long term reference */
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			slice_param->h264_dpb_info[index].reference_flag = 1;
+		else
+			slice_param->h264_dpb_info[index].reference_flag = 2;
+
+		vb = vq->bufs[vb2_index];
+		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
+		slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
+
+		slice_param->h264_dpb_info[index].y_dma_addr =
+			vb2_dma_contig_plane_dma_addr(vb, 0);
+		if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
+			slice_param->h264_dpb_info[index].c_dma_addr =
+				vb2_dma_contig_plane_dma_addr(vb, 1);
+		}
+	}
+}
+
+static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
+	const struct v4l2_ctrl_h264_sps *src_param)
+{
+	GET_MTK_VDEC_PARAM(chroma_format_idc);
+	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
+	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
+	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
+	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
+	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
+	GET_MTK_VDEC_PARAM(max_num_ref_frames);
+	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
+	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
+
+	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
+		V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
+	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
+		V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
+	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
+		V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
+	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
+		V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
+	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
+		V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
+	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
+		V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
+}
+
+static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
+	const struct v4l2_ctrl_h264_pps *src_param)
+{
+	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
+	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
+	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
+	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
+	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
+	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
+
+	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
+		V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
+	GET_MTK_VDEC_FLAG(pic_order_present_flag,
+		V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
+	GET_MTK_VDEC_FLAG(weighted_pred_flag,
+		V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
+	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
+		V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
+	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
+		V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
+	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
+		V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
+	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
+		V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
+	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
+		V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
+}
+
+static void
+get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
+			const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
+{
+	memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
+	       sizeof(dst_matrix->scaling_list_4x4));
+
+	memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
+	       sizeof(dst_matrix->scaling_list_8x8));
+}
+
+static void get_h264_decode_parameters(
+	struct slice_api_h264_decode_param *dst_params,
+	const struct v4l2_ctrl_h264_decode_params *src_params,
+	const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
+		struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
+		const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
+
+		dst_entry->reference_ts = src_entry->reference_ts;
+		dst_entry->frame_num = src_entry->frame_num;
+		dst_entry->pic_num = src_entry->pic_num;
+		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
+		dst_entry->bottom_field_order_cnt =
+			src_entry->bottom_field_order_cnt;
+		dst_entry->flags = src_entry->flags;
+	}
+
+	// num_slices is a leftover from the old H.264 support and is ignored
+	// by the firmware.
+	dst_params->num_slices = 0;
+	dst_params->nal_ref_idc = src_params->nal_ref_idc;
+	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
+	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
+	dst_params->flags = src_params->flags;
+}
+
+static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
+			    const struct v4l2_h264_dpb_entry *b)
+{
+	return a->top_field_order_cnt == b->top_field_order_cnt &&
+	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+}
+
+/*
+ * Move DPB entries of dec_param that refer to a frame already existing in dpb
+ * into the already existing slot in dpb, and move other entries into new slots.
+ *
+ * This function is an adaptation of the similarly-named function in
+ * hantro_h264.c.
+ */
+static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
+		       struct v4l2_h264_dpb_entry *dpb)
+{
+	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	unsigned int i, j;
+
+	/* Disable all entries by default, and mark the ones in use. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+			set_bit(i, in_use);
+		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+	}
+
+	/* Try to match new DPB entries with existing ones by their POCs. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		/*
+		 * To cut down on comparisons, iterate only over target DPB
+		 * entries that were already in use.
+		 */
+		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
+			struct v4l2_h264_dpb_entry *cdpb;
+
+			cdpb = &dpb[j];
+			if (!dpb_entry_match(cdpb, ndpb))
+				continue;
+
+			*cdpb = *ndpb;
+			set_bit(j, used);
+			/* Don't reiterate on this one. */
+			clear_bit(j, in_use);
+			break;
+		}
+
+		if (j == ARRAY_SIZE(dec_param->dpb))
+			set_bit(i, new);
+	}
+
+	/* For entries that could not be matched, use remaining free slots. */
+	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+		struct v4l2_h264_dpb_entry *cdpb;
+
+		/*
+		 * Both arrays are the same size, so there is no way
+		 * we can end up with no space in target array, unless
+		 * something is buggy.
+		 */
+		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
+		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
+			return;
+
+		cdpb = &dpb[j];
+		*cdpb = *ndpb;
+		set_bit(j, used);
+	}
+}
+
+/*
+ * The firmware expects unused reflist entries to have the value 0x20.
+ */
+static void fixup_ref_list(u8 *ref_list, size_t num_valid)
+{
+	memset(&ref_list[num_valid], 0x20, 32 - num_valid);
+}
+
+static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_params =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
+	const struct v4l2_ctrl_h264_sps *sps =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
+	const struct v4l2_ctrl_h264_pps *pps =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
+		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
+	struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
+	struct v4l2_h264_reflist_builder reflist_builder;
+	enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
+	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
+	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
+	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
+	int i;
+
+	update_dpb(dec_params, inst->dpb);
+
+	get_h264_sps_parameters(&slice_param->sps, sps);
+	get_h264_pps_parameters(&slice_param->pps, pps);
+	get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
+	get_h264_decode_parameters(&slice_param->decode_params, dec_params,
+				   inst->dpb);
+	get_h264_dpb_list(inst, slice_param);
+
+	/* Prepare the fields for our reference lists */
+	for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
+		dpb_fields[i] = slice_param->h264_dpb_info[i].field;
+	/* Build the reference lists */
+	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
+				       inst->dpb);
+	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
+	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
+	/* Adapt the built lists to the firmware's expectations */
+	fixup_ref_list(p0_reflist, reflist_builder.num_valid);
+	fixup_ref_list(b0_reflist, reflist_builder.num_valid);
+	fixup_ref_list(b1_reflist, reflist_builder.num_valid);
+
+	memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
+	       sizeof(inst->vsi_ctx.h264_slice_params));
+}
+
+static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
+{
+	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
+
+	return HW_MB_STORE_SZ * unit_size;
+}
+
+static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
+{
+	int err = 0;
+
+	inst->pred_buf.size = BUF_PREDICTION_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
+	if (err) {
+		mtk_vcodec_err(inst, "failed to allocate prediction buf");
+		return err;
+	}
+
+	inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
+	return 0;
+}
+
+static void free_predication_buf(struct vdec_h264_slice_inst *inst)
+{
+	struct mtk_vcodec_mem *mem = NULL;
+
+	mtk_vcodec_debug_enter(inst);
+
+	inst->vsi_ctx.pred_buf_dma = 0;
+	mem = &inst->pred_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+}
+
+static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
+	struct vdec_pic_info *pic)
+{
+	int i;
+	int err;
+	struct mtk_vcodec_mem *mem = NULL;
+	unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
+
+	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+		mem->size = buf_sz;
+		err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+		if (err) {
+			mtk_vcodec_err(inst, "failed to allocate mv buf");
+			return err;
+		}
+		inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
+	}
+
+	return 0;
+}
+
+static void free_mv_buf(struct vdec_h264_slice_inst *inst)
+{
+	int i;
+	struct mtk_vcodec_mem *mem = NULL;
+
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		inst->vsi_ctx.mv_buf_dma[i] = 0;
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+	}
+}
+
+static void get_pic_info(struct vdec_h264_slice_inst *inst,
+			 struct vdec_pic_info *pic)
+{
+	struct mtk_vcodec_ctx *ctx = inst->ctx;
+
+	ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
+	ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
+	ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
+	ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
+	inst->vsi_ctx.dec.cap_num_planes =
+		ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
+
+	*pic = ctx->picinfo;
+	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
+			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
+	mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
+		ctx->picinfo.fb_sz[1]);
+
+	if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
+		(ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
+		inst->vsi_ctx.dec.resolution_changed = true;
+		if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
+			(ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
+			inst->vsi_ctx.dec.realloc_mv_buf = true;
+
+		mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
+			inst->vsi_ctx.dec.resolution_changed,
+			inst->vsi_ctx.dec.realloc_mv_buf,
+			ctx->last_decoded_picinfo.pic_w,
+			ctx->last_decoded_picinfo.pic_h,
+			ctx->picinfo.pic_w, ctx->picinfo.pic_h);
+	}
+}
+
+static void get_crop_info(struct vdec_h264_slice_inst *inst,
+	struct v4l2_rect *cr)
+{
+	cr->left = inst->vsi_ctx.crop.left;
+	cr->top = inst->vsi_ctx.crop.top;
+	cr->width = inst->vsi_ctx.crop.width;
+	cr->height = inst->vsi_ctx.crop.height;
+
+	mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
+			 cr->left, cr->top, cr->width, cr->height);
+}
+
+static void get_dpb_size(struct vdec_h264_slice_inst *inst,
+	unsigned int *dpb_sz)
+{
+	*dpb_sz = inst->vsi_ctx.dec.dpb_sz;
+	mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
+}
+
+static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_h264_slice_inst *inst = NULL;
+	int err;
+
+	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+
+	inst->ctx = ctx;
+
+	inst->vpu.id = SCP_IPI_VDEC_H264;
+	inst->vpu.ctx = ctx;
+
+	err = vpu_dec_init(&inst->vpu);
+	if (err) {
+		mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
+		goto error_free_inst;
+	}
+
+	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
+	inst->vsi_ctx.dec.resolution_changed = true;
+	inst->vsi_ctx.dec.realloc_mv_buf = true;
+
+	err = allocate_predication_buf(inst);
+	if (err)
+		goto error_deinit;
+
+	mtk_vcodec_debug(inst, "struct size = %zu,%zu,%zu,%zu\n",
+		sizeof(struct mtk_h264_sps_param),
+		sizeof(struct mtk_h264_pps_param),
+		sizeof(struct mtk_h264_dec_slice_param),
+		sizeof(struct mtk_h264_dpb_info));
+
+	mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
+
+	ctx->drv_handle = inst;
+	return 0;
+
+error_deinit:
+	vpu_dec_deinit(&inst->vpu);
+
+error_free_inst:
+	kfree(inst);
+	return err;
+}
+
+static void vdec_h264_slice_deinit(void *h_vdec)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+
+	mtk_vcodec_debug_enter(inst);
+
+	vpu_dec_deinit(&inst->vpu);
+	free_predication_buf(inst);
+	free_mv_buf(inst);
+
+	kfree(inst);
+}
+
+static int find_start_code(unsigned char *data, unsigned int data_sz)
+{
+	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
+		return 3;
+
+	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
+	    data[3] == 1)
+		return 4;
+
+	return -1;
+}
+
+static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
+				  struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+	struct vdec_vpu_inst *vpu = &inst->vpu;
+	struct mtk_video_dec_buf *src_buf_info;
+	int nal_start_idx = 0, err = 0;
+	uint32_t nal_type, data[2];
+	unsigned char *buf;
+	uint64_t y_fb_dma;
+	uint64_t c_fb_dma;
+
+	/* bs NULL means flush decoder */
+	if (bs == NULL)
+		return vpu_dec_reset(vpu);
+
+	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+
+	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
+	c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
+
+	mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
+			 ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
+
+	buf = (unsigned char *)bs->va;
+	nal_start_idx = find_start_code(buf, bs->size);
+	if (nal_start_idx < 0)
+		goto err_free_fb_out;
+
+	data[0] = bs->size;
+	data[1] = buf[nal_start_idx];
+	nal_type = NAL_TYPE(buf[nal_start_idx]);
+	mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
+			 nal_type);
+
+	inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
+	inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
+	inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
+	inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
+
+	get_vdec_decode_parameters(inst);
+	*res_chg = inst->vsi_ctx.dec.resolution_changed;
+	if (*res_chg) {
+		mtk_vcodec_debug(inst, "- resolution changed -");
+		if (inst->vsi_ctx.dec.realloc_mv_buf) {
+			err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
+			inst->vsi_ctx.dec.realloc_mv_buf = false;
+			if (err)
+				goto err_free_fb_out;
+		}
+		*res_chg = false;
+	}
+
+	memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
+	err = vpu_dec_start(vpu, data, 2);
+	if (err)
+		goto err_free_fb_out;
+
+	if (nal_type == NAL_NON_IDR_SLICE || nal_type == NAL_IDR_SLICE) {
+		/* wait decoder done interrupt */
+		err = mtk_vcodec_wait_for_done_ctx(inst->ctx,
+						   MTK_INST_IRQ_RECEIVED,
+						   WAIT_INTR_TIMEOUT_MS);
+		if (err)
+			goto err_free_fb_out;
+
+		vpu_dec_end(vpu);
+	}
+
+	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
+	mtk_vcodec_debug(inst, "\n - NALU[%d] type=%d -\n", inst->num_nalu,
+			 nal_type);
+	return 0;
+
+err_free_fb_out:
+	mtk_vcodec_err(inst, "\n - NALU[%d] err=%d -\n", inst->num_nalu, err);
+	return err;
+}
+
+static int vdec_h264_slice_get_param(void *h_vdec,
+			       enum vdec_get_param_type type, void *out)
+{
+	struct vdec_h264_slice_inst *inst =
+		(struct vdec_h264_slice_inst *)h_vdec;
+
+	switch (type) {
+	case GET_PARAM_PIC_INFO:
+		get_pic_info(inst, out);
+		break;
+
+	case GET_PARAM_DPB_SIZE:
+		get_dpb_size(inst, out);
+		break;
+
+	case GET_PARAM_CROP_INFO:
+		get_crop_info(inst, out);
+		break;
+
+	default:
+		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct vdec_common_if vdec_h264_slice_if = {
+	.init		= vdec_h264_slice_init,
+	.decode		= vdec_h264_slice_decode,
+	.get_param	= vdec_h264_slice_get_param,
+	.deinit		= vdec_h264_slice_deinit,
+};
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index b18743b906ea..42008243ceac 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -19,6 +19,9 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 	int ret = 0;
 
 	switch (fourcc) {
+	case V4L2_PIX_FMT_H264_SLICE:
+		ctx->dec_if = &vdec_h264_slice_if;
+		break;
 	case V4L2_PIX_FMT_H264:
 		ctx->dec_if = &vdec_h264_if;
 		break;
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index 270d8dc9984b..961b2b6072b5 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -55,6 +55,7 @@ struct vdec_fb_node {
 };
 
 extern const struct vdec_common_if vdec_h264_if;
+extern const struct vdec_common_if vdec_h264_slice_if;
 extern const struct vdec_common_if vdec_vp8_if;
 extern const struct vdec_common_if vdec_vp9_if;
 
-- 
2.30.1.766.gb4fecdf3b7-goog

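For readers following update_dpb() in the patch above, the slot-reuse idea
can be sketched in plain userspace C. This is only an illustration under
stated assumptions: struct sketch_dpb_entry, SKETCH_FLAG_ACTIVE and
sketch_update_dpb() are hypothetical stand-ins for v4l2_h264_dpb_entry,
V4L2_H264_DPB_ENTRY_FLAG_ACTIVE and the kernel function, and plain boolean
arrays replace the kernel bitmap helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_DPB		16
#define SKETCH_FLAG_ACTIVE	0x01	/* stand-in for the V4L2 ACTIVE flag */

/* Hypothetical, trimmed-down stand-in for struct v4l2_h264_dpb_entry. */
struct sketch_dpb_entry {
	unsigned long long reference_ts;
	int top_field_order_cnt;
	int bottom_field_order_cnt;
	unsigned int flags;
};

/* Two entries describe the same picture when both POCs agree. */
static bool sketch_entry_match(const struct sketch_dpb_entry *a,
			       const struct sketch_dpb_entry *b)
{
	return a->top_field_order_cnt == b->top_field_order_cnt &&
	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
}

/*
 * Keep pictures that are already tracked in their current slot and place
 * unmatched pictures into the first slot not claimed during this call.
 */
static void sketch_update_dpb(const struct sketch_dpb_entry *incoming,
			      struct sketch_dpb_entry *dpb)
{
	bool in_use[SKETCH_NUM_DPB], used[SKETCH_NUM_DPB] = { false };
	bool is_new[SKETCH_NUM_DPB] = { false };
	size_t i, j;

	/* Remember which slots were active, then deactivate all of them. */
	for (i = 0; i < SKETCH_NUM_DPB; i++) {
		in_use[i] = dpb[i].flags & SKETCH_FLAG_ACTIVE;
		dpb[i].flags &= ~SKETCH_FLAG_ACTIVE;
	}

	/* Match incoming active entries against previously active slots. */
	for (i = 0; i < SKETCH_NUM_DPB; i++) {
		if (!(incoming[i].flags & SKETCH_FLAG_ACTIVE))
			continue;
		for (j = 0; j < SKETCH_NUM_DPB; j++) {
			if (!in_use[j] ||
			    !sketch_entry_match(&dpb[j], &incoming[i]))
				continue;
			dpb[j] = incoming[i];
			used[j] = true;
			in_use[j] = false;	/* do not match this slot twice */
			break;
		}
		if (j == SKETCH_NUM_DPB)
			is_new[i] = true;
	}

	/* Unmatched entries take the first slot not claimed above. */
	for (i = 0; i < SKETCH_NUM_DPB; i++) {
		if (!is_new[i])
			continue;
		for (j = 0; j < SKETCH_NUM_DPB && used[j]; j++)
			;
		/* Both arrays have the same size, so a slot always exists. */
		assert(j < SKETCH_NUM_DPB);
		dpb[j] = incoming[i];
		used[j] = true;
	}
}
```

The point of the matching pass is that a reference picture keeps the same
slot index from one decode call to the next, even when userspace reorders
its own DPB array between requests.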

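The two byte-level helpers in the patch above, find_start_code() and
fixup_ref_list(), are easy to try outside the kernel. The sketch below
mirrors their logic in userspace; the sketch_ prefixed names and the
SKETCH_REFLIST_* constants are hypothetical, while the 0x20 marker, the
32-entry list length and the 0x1F NAL type mask are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAL_TYPE(value)	((value) & 0x1F)
#define SKETCH_REFLIST_LEN	32
#define SKETCH_REFLIST_UNUSED	0x20

/* Return the start-code length (3 or 4), or -1 if the buffer has none. */
static int sketch_find_start_code(const unsigned char *data, size_t size)
{
	if (size > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (size > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/* Fill every entry past num_valid with the marker the firmware expects. */
static void sketch_fixup_ref_list(unsigned char *ref_list, size_t num_valid)
{
	assert(num_valid <= SKETCH_REFLIST_LEN);
	memset(&ref_list[num_valid], SKETCH_REFLIST_UNUSED,
	       SKETCH_REFLIST_LEN - num_valid);
}
```

The byte right after the start code is the NAL header, so applying
SKETCH_NAL_TYPE() to it yields the NAL unit type that the decode path
compares against NAL_IDR_SLICE and NAL_NON_IDR_SLICE.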
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

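As a worked example of the buffer arithmetic in get_pic_info() and
get_mv_buf_size() from the patch above, here is a small userspace sketch.
struct sketch_pic_info and the sketch_ functions are hypothetical stand-ins
for vdec_pic_info and the driver helpers; the alignment masks and the
64-byte per-macroblock store size are the ones used in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MB_UNIT_LEN	16	/* macroblock edge, in pixels */
#define SKETCH_HW_MB_STORE_SZ	64	/* motion vector bytes per macroblock */

/* Hypothetical stand-in for the fields of struct vdec_pic_info used here. */
struct sketch_pic_info {
	uint32_t pic_w, pic_h;	/* visible size */
	uint32_t buf_w, buf_h;	/* aligned buffer size */
	uint32_t fb_sz[2];	/* luma and chroma plane sizes, in bytes */
};

/* Align width to 16 and height to 32, then derive the plane sizes. */
static void sketch_fill_buf_sizes(struct sketch_pic_info *pic)
{
	assert(pic->pic_w > 0 && pic->pic_h > 0);

	pic->buf_w = (pic->pic_w + 15) & ~15u;
	pic->buf_h = (pic->pic_h + 31) & ~31u;
	pic->fb_sz[0] = pic->buf_w * pic->buf_h;
	pic->fb_sz[1] = pic->fb_sz[0] >> 1;
}

/* One 64-byte record per macroblock, plus 8 spare records. */
static uint32_t sketch_mv_buf_size(uint32_t buf_w, uint32_t buf_h)
{
	uint32_t units = (buf_w / SKETCH_MB_UNIT_LEN) *
			 (buf_h / SKETCH_MB_UNIT_LEN) + 8;

	return SKETCH_HW_MB_STORE_SZ * units;
}
```

For a 1920x1080 stream this gives a 1920x1088 buffer, a 2088960-byte luma
plane, a 1044480-byte chroma plane, and 522752 bytes for each of the
H264_MAX_MV_NUM motion vector buffers.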

* [PATCH v3 07/15] media: mtk-vcodec: vdec: add media device if using stateless api
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

From: Yunfei Dong <yunfei.dong@mediatek.com>

The stateless API requires a media device for issuing requests. Add one
if we are being instantiated as a stateless decoder.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
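Note on the error path: the probe changes below unwind in the reverse order
of registration. The userspace mock here only illustrates that ordering; it
is not driver code. sketch_probe(), sketch_note() and the fail_at parameter
are hypothetical, and each letter stands for the teardown call named in the
adjacent comment.

```c
#include <assert.h>
#include <string.h>

static char sketch_log[64];	/* records the order of teardown calls */

static void sketch_note(const char *step)
{
	strcat(sketch_log, step);
}

/* fail_at: 0 = success, 1 = controller, 2 = media device, 3 = video device */
static int sketch_probe(int fail_at)
{
	int ret = -1;

	sketch_log[0] = '\0';

	if (fail_at == 1)	/* v4l2_m2m_register_media_controller() failed */
		goto err_reg_cont;
	if (fail_at == 2)	/* media_device_register() failed */
		goto err_media_reg;
	if (fail_at == 3)	/* video_register_device() failed */
		goto err_dec_reg;

	return 0;

err_dec_reg:
	sketch_note("M");	/* media_device_unregister() */
err_media_reg:
	sketch_note("C");	/* v4l2_m2m_unregister_media_controller() */
err_reg_cont:
	sketch_note("W");	/* destroy_workqueue() */
	return ret;
}
```

A failure at the last step therefore logs "MCW", while a failure at the
first step only logs "W", since nothing media-related has been registered
at that point.
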
 drivers/media/platform/Kconfig                |  1 +
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  | 39 +++++++++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  2 +
 3 files changed, 42 insertions(+)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index c27db5643712..9d83b4223ecc 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -296,6 +296,7 @@ config VIDEO_MEDIATEK_VCODEC
 	select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
 	select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
 	select V4L2_H264
+	select MEDIA_CONTROLLER
 	help
 	  Mediatek video codec driver provides HW capability to
 	  encode and decode in a range of video formats on MT8173
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 533781d4680a..e942e28f96fe 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -14,6 +14,7 @@
 #include <media/v4l2-event.h>
 #include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-dma-contig.h>
+#include <media/v4l2-device.h>
 
 #include "mtk_vcodec_drv.h"
 #include "mtk_vcodec_dec.h"
@@ -324,6 +325,31 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 		goto err_event_workq;
 	}
 
+	if (dev->vdec_pdata->uses_stateless_api) {
+		dev->mdev_dec.dev = &pdev->dev;
+		strscpy(dev->mdev_dec.model, MTK_VCODEC_DEC_NAME,
+				sizeof(dev->mdev_dec.model));
+
+		media_device_init(&dev->mdev_dec);
+		dev->mdev_dec.ops = &mtk_vcodec_media_ops;
+		dev->v4l2_dev.mdev = &dev->mdev_dec;
+
+		ret = v4l2_m2m_register_media_controller(dev->m2m_dev_dec,
+			dev->vfd_dec, MEDIA_ENT_F_PROC_VIDEO_DECODER);
+		if (ret) {
+			mtk_v4l2_err("Failed to register media controller");
+			goto err_reg_cont;
+		}
+
+		ret = media_device_register(&dev->mdev_dec);
+		if (ret) {
+			mtk_v4l2_err("Failed to register media device");
+			goto err_media_reg;
+		}
+
+		mtk_v4l2_debug(0, "media registered as /dev/media%d",
+			vfd_dec->num);
+	}
 	ret = video_register_device(vfd_dec, VFL_TYPE_VIDEO, 0);
 	if (ret) {
 		mtk_v4l2_err("Failed to register video device");
@@ -336,6 +362,12 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	return 0;
 
 err_dec_reg:
+	if (dev->vdec_pdata->uses_stateless_api)
+		media_device_unregister(&dev->mdev_dec);
+err_media_reg:
+	if (dev->vdec_pdata->uses_stateless_api)
+		v4l2_m2m_unregister_media_controller(dev->m2m_dev_dec);
+err_reg_cont:
 	destroy_workqueue(dev->decode_workqueue);
 err_event_workq:
 	v4l2_m2m_release(dev->m2m_dev_dec);
@@ -368,6 +400,13 @@ static int mtk_vcodec_dec_remove(struct platform_device *pdev)
 
 	flush_workqueue(dev->decode_workqueue);
 	destroy_workqueue(dev->decode_workqueue);
+
+	if (media_devnode_is_registered(dev->mdev_dec.devnode)) {
+		media_device_unregister(&dev->mdev_dec);
+		v4l2_m2m_unregister_media_controller(dev->m2m_dev_dec);
+		media_device_cleanup(&dev->mdev_dec);
+	}
+
 	if (dev->m2m_dev_dec)
 		v4l2_m2m_release(dev->m2m_dev_dec);
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 3b884a321883..79d6a1e6c916 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -384,6 +384,7 @@ struct mtk_vcodec_enc_pdata {
  * struct mtk_vcodec_dev - driver data
  * @v4l2_dev: V4L2 device to register video devices for.
  * @vfd_dec: Video device for decoder
+ * @mdev_dec: Media device for decoder
  * @vfd_enc: Video device for encoder.
  *
  * @m2m_dev_dec: m2m device for decoder
@@ -420,6 +421,7 @@ struct mtk_vcodec_enc_pdata {
 struct mtk_vcodec_dev {
 	struct v4l2_device v4l2_dev;
 	struct video_device *vfd_dec;
+	struct media_device mdev_dec;
 	struct video_device *vfd_enc;
 
 	struct v4l2_m2m_dev *m2m_dev_dec;
-- 
2.30.1.766.gb4fecdf3b7-goog



* [PATCH v3 07/15] media: mtk-vcodec: vdec: add media device if using stateless api
@ 2021-02-26 10:01   ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Alexandre Courbot, linux-kernel, linux-mediatek, Hans Verkuil,
	Mauro Carvalho Chehab, linux-media

From: Yunfei Dong <yunfei.dong@mediatek.com>

The stateless API requires a media device for issuing requests. Add one
if we are being instantiated as a stateless decoder.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 drivers/media/platform/Kconfig                |  1 +
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  | 39 +++++++++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  2 +
 3 files changed, 42 insertions(+)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index c27db5643712..9d83b4223ecc 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -296,6 +296,7 @@ config VIDEO_MEDIATEK_VCODEC
 	select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
 	select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
 	select V4L2_H264
+	select MEDIA_CONTROLLER
 	help
 	  Mediatek video codec driver provides HW capability to
 	  encode and decode in a range of video formats on MT8173
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 533781d4680a..e942e28f96fe 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -14,6 +14,7 @@
 #include <media/v4l2-event.h>
 #include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-dma-contig.h>
+#include <media/v4l2-device.h>
 
 #include "mtk_vcodec_drv.h"
 #include "mtk_vcodec_dec.h"
@@ -324,6 +325,31 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 		goto err_event_workq;
 	}
 
+	if (dev->vdec_pdata->uses_stateless_api) {
+		dev->mdev_dec.dev = &pdev->dev;
+		strscpy(dev->mdev_dec.model, MTK_VCODEC_DEC_NAME,
+				sizeof(dev->mdev_dec.model));
+
+		media_device_init(&dev->mdev_dec);
+		dev->mdev_dec.ops = &mtk_vcodec_media_ops;
+		dev->v4l2_dev.mdev = &dev->mdev_dec;
+
+		ret = v4l2_m2m_register_media_controller(dev->m2m_dev_dec,
+			dev->vfd_dec, MEDIA_ENT_F_PROC_VIDEO_DECODER);
+		if (ret) {
+			mtk_v4l2_err("Failed to register media controller");
+			goto err_reg_cont;
+		}
+
+		ret = media_device_register(&dev->mdev_dec);
+		if (ret) {
+			mtk_v4l2_err("Failed to register media device");
+			goto err_media_reg;
+		}
+
+		mtk_v4l2_debug(0, "media registered as /dev/media%d",
+			vfd_dec->num);
+	}
 	ret = video_register_device(vfd_dec, VFL_TYPE_VIDEO, 0);
 	if (ret) {
 		mtk_v4l2_err("Failed to register video device");
@@ -336,6 +362,12 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	return 0;
 
 err_dec_reg:
+	if (dev->vdec_pdata->uses_stateless_api)
+		media_device_unregister(&dev->mdev_dec);
+err_media_reg:
+	if (dev->vdec_pdata->uses_stateless_api)
+		v4l2_m2m_unregister_media_controller(dev->m2m_dev_dec);
+err_reg_cont:
 	destroy_workqueue(dev->decode_workqueue);
 err_event_workq:
 	v4l2_m2m_release(dev->m2m_dev_dec);
@@ -368,6 +400,13 @@ static int mtk_vcodec_dec_remove(struct platform_device *pdev)
 
 	flush_workqueue(dev->decode_workqueue);
 	destroy_workqueue(dev->decode_workqueue);
+
+	if (media_devnode_is_registered(dev->mdev_dec.devnode)) {
+		media_device_unregister(&dev->mdev_dec);
+		v4l2_m2m_unregister_media_controller(dev->m2m_dev_dec);
+		media_device_cleanup(&dev->mdev_dec);
+	}
+
 	if (dev->m2m_dev_dec)
 		v4l2_m2m_release(dev->m2m_dev_dec);
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 3b884a321883..79d6a1e6c916 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -384,6 +384,7 @@ struct mtk_vcodec_enc_pdata {
  * struct mtk_vcodec_dev - driver data
  * @v4l2_dev: V4L2 device to register video devices for.
  * @vfd_dec: Video device for decoder
+ * @mdev_dec: Media device for decoder
  * @vfd_enc: Video device for encoder.
  *
  * @m2m_dev_dec: m2m device for decoder
@@ -420,6 +421,7 @@ struct mtk_vcodec_enc_pdata {
 struct mtk_vcodec_dev {
 	struct v4l2_device v4l2_dev;
 	struct video_device *vfd_dec;
+	struct media_device mdev_dec;
 	struct video_device *vfd_enc;
 
 	struct v4l2_m2m_dev *m2m_dev_dec;
-- 
2.30.1.766.gb4fecdf3b7-goog


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 08/15] dt-bindings: media: document mediatek,mt8183-vcodec-dec
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

MT8183's decoder is instantiated similarly to MT8173's.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 Documentation/devicetree/bindings/media/mediatek-vcodec.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/media/mediatek-vcodec.txt b/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
index 8217424fd4bd..30c5888bf2d6 100644
--- a/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
+++ b/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
@@ -7,6 +7,7 @@ Required properties:
 - compatible : "mediatek,mt8173-vcodec-enc" for MT8173 encoder
   "mediatek,mt8183-vcodec-enc" for MT8183 encoder.
   "mediatek,mt8173-vcodec-dec" for MT8173 decoder.
+  "mediatek,mt8183-vcodec-dec" for MT8183 decoder.
 - reg : Physical base address of the video codec registers and length of
   memory mapped region.
 - interrupts : interrupt number to the cpu.
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 09/15] media: mtk-vcodec: enable MT8183 decoder
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

From: Yunfei Dong <yunfei.dong@mediatek.com>

Now that all the supporting blocks are present, enable the decoder for
MT8183.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
[acourbot: refactor, cleanup and split]
Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
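Note (not part of the commit): the match table below selects per-SoC
platform data by compatible string. A minimal userspace sketch of that
lookup, assuming hypothetical model_* types and only the
uses_stateless_api field (MT8173 stateful, MT8183 stateless, as described
in the cover letter):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced platform data: only the API flavour. */
struct model_pdata {
	bool uses_stateless_api;
};

/* Hypothetical stand-in for a match table entry. */
struct model_match {
	const char *compatible;
	const struct model_pdata *data;
};

static const struct model_pdata model_8173_pdata = {
	.uses_stateless_api = false,
};

static const struct model_pdata model_8183_pdata = {
	.uses_stateless_api = true,
};

/* The table ends with an empty sentinel entry, as the real one does. */
static const struct model_match model_table[] = {
	{ "mediatek,mt8173-vcodec-dec", &model_8173_pdata },
	{ "mediatek,mt8183-vcodec-dec", &model_8183_pdata },
	{ NULL, NULL },
};

/* Returns the platform data for a compatible string, or NULL. */
static const struct model_pdata *model_lookup(const char *compatible)
{
	const struct model_match *m;

	for (m = model_table; m->compatible; m++) {
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	}
	return NULL;
}
```

An unknown compatible string yields NULL, which is the case the driver
never sees because the device would not have been probed at all.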
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index e942e28f96fe..e0526c0900c8 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -383,12 +383,17 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 }
 
 extern const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata;
+extern const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata;
 
 static const struct of_device_id mtk_vcodec_match[] = {
 	{
 		.compatible = "mediatek,mt8173-vcodec-dec",
 		.data = &mtk_vdec_8173_pdata,
 	},
+	{
+		.compatible = "mediatek,mt8183-vcodec-dec",
+		.data = &mtk_vdec_8183_pdata,
+	},
 	{},
 };
 
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 10/15] media: mtk-vcodec: vdec: use helpers in VIDIOC_(TRY_)DECODER_CMD
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Let's use the dedicated helpers to make sure we get the expected
behavior on stateful decoders as well.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
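Note (not part of the commit): a rough userspace sketch of the behavioral
difference between the open-coded check removed below and a helper-style
check. The model_* types and constants are hypothetical. That the generic
helper clears the flags instead of rejecting them is my understanding of
the m2m helper and should be read as an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical command codes; not the UAPI definitions. */
enum model_dec_cmd {
	MODEL_DEC_CMD_START = 0,
	MODEL_DEC_CMD_STOP = 1,
	MODEL_DEC_CMD_PAUSE = 2,
};

struct model_decoder_cmd {
	uint32_t cmd;
	uint32_t flags;
};

/* Mirrors the removed switch: START/STOP only, and flags must be 0. */
static int old_try_decoder_cmd(const struct model_decoder_cmd *cmd)
{
	switch (cmd->cmd) {
	case MODEL_DEC_CMD_STOP:
	case MODEL_DEC_CMD_START:
		if (cmd->flags != 0)
			return -EINVAL;
		return 0;
	default:
		return -EINVAL;
	}
}

/*
 * Helper-style variant (assumed behavior): START/STOP only, but
 * unsupported flags are cleared so the caller learns what was accepted.
 */
static int helper_like_try_decoder_cmd(struct model_decoder_cmd *cmd)
{
	if (cmd->cmd != MODEL_DEC_CMD_STOP && cmd->cmd != MODEL_DEC_CMD_START)
		return -EINVAL;

	cmd->flags = 0;
	return 0;
}
```

Under that assumption, a STOP command with a stray flag fails with the old
check and succeeds, with flags zeroed, with the helper-style one.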
 .../media/platform/mtk-vcodec/mtk_vcodec_dec.c   | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index c286cc0f239f..8bcff0b3626e 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -52,22 +52,10 @@ static int vidioc_try_decoder_cmd(struct file *file, void *priv,
 	if (ctx->dev->vdec_pdata->uses_stateless_api)
 		return v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv,
 								cmd);
-
-	switch (cmd->cmd) {
-	case V4L2_DEC_CMD_STOP:
-	case V4L2_DEC_CMD_START:
-		if (cmd->flags != 0) {
-			mtk_v4l2_err("cmd->flags=%u", cmd->flags);
-			return -EINVAL;
-		}
-		break;
-	default:
-		return -EINVAL;
-	}
-	return 0;
+	else
+		return v4l2_m2m_ioctl_try_decoder_cmd(file, priv, cmd);
 }
 
-
 static int vidioc_decoder_cmd(struct file *file, void *priv,
 				struct v4l2_decoder_cmd *cmd)
 {
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 11/15] media: mtk-vcodec: vdec: Support H264 profile control
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Hirokazu Honda, Alexandre Courbot

From: Hirokazu Honda <hiroh@chromium.org>

Add H264 profiles supported by the MediaTek 8173 decoder.

Signed-off-by: Hirokazu Honda <hiroh@chromium.org>
[acourbot: fix commit log a bit]
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
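Note (not part of the commit): the menu control added below takes a
maximum item and a bitmask of items to skip. A small userspace sketch of
how such a mask decides which profiles stay selectable. The model_* enum
mirrors what I believe is the UAPI ordering of the first five H.264
profiles; treat those values as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to follow the UAPI enum order; illustrative only. */
enum model_h264_profile {
	MODEL_H264_PROFILE_BASELINE = 0,
	MODEL_H264_PROFILE_CONSTRAINED_BASELINE = 1,
	MODEL_H264_PROFILE_MAIN = 2,
	MODEL_H264_PROFILE_EXTENDED = 3,
	MODEL_H264_PROFILE_HIGH = 4,
};

#define MODEL_BIT(n) (UINT64_C(1) << (n))

/* Skip mask equivalent to the one passed in the patch. */
#define MODEL_SKIP_MASK \
	(MODEL_BIT(MODEL_H264_PROFILE_BASELINE) | \
	 MODEL_BIT(MODEL_H264_PROFILE_EXTENDED))

/* An item is selectable if it is within range and not masked out. */
static bool menu_item_selectable(unsigned int max, uint64_t skip_mask,
				 unsigned int item)
{
	if (item > max)
		return false;
	return !(skip_mask & MODEL_BIT(item));
}
```

With HIGH as the maximum and this mask, Constrained Baseline, Main and
High remain selectable while Baseline and Extended do not.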
 .../platform/mtk-vcodec/mtk_vcodec_dec_stateful.c     | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index f9db7ef19c28..3666c7e73bff 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -591,7 +591,16 @@ static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
 				V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
 				V4L2_MPEG_VIDEO_VP9_PROFILE_0,
 				0, V4L2_MPEG_VIDEO_VP9_PROFILE_0);
-
+	/*
+	 * H264. Baseline / Extended decoding is not supported.
+	 */
+	v4l2_ctrl_new_std_menu(&ctx->ctrl_hdl,
+			&mtk_vcodec_dec_ctrl_ops,
+			V4L2_CID_MPEG_VIDEO_H264_PROFILE,
+			V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
+			BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
+			BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
+			V4L2_MPEG_VIDEO_H264_PROFILE_MAIN);
 	if (ctx->ctrl_hdl.error) {
 		mtk_v4l2_err("Adding control failed %d",
 				ctx->ctrl_hdl.error);
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 12/15] media: mtk-vcodec: vdec: clamp OUTPUT resolution to hardware limits
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

Calling S_FMT or TRY_FMT on the OUTPUT queue should adjust the
resolution to the limits supported by the hardware. Until now this was
only done on the CAPTURE queue, which could make clients believe that
unsupported resolutions can be used when they set the coded size on the
OUTPUT queue.

In the case of the stateless decoder, the problem was even bigger since
subsequently calling G_FMT on the CAPTURE queue would return the
unclamped resolution, misleading the client even further.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
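Note (not part of the commit): a userspace sketch of the effect of moving
the clamp ahead of the queue-type branch, so that it applies to both
queues. The MODEL_* limits are placeholder values chosen for
illustration; the real MTK_VDEC_* limits live in the driver headers and
are not shown in this patch.

```c
#include <stdint.h>

/* Placeholder limits, not the driver's actual values. */
#define MODEL_MIN_W 64u
#define MODEL_MAX_W 2048u
#define MODEL_MIN_H 64u
#define MODEL_MAX_H 1088u

struct model_pix_fmt {
	uint32_t width;
	uint32_t height;
};

static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Queue-independent part of try_fmt: clamp first, then let the
 * per-queue code (not modelled here) do its own adjustments.
 */
static void model_try_fmt(struct model_pix_fmt *f)
{
	f->width = clamp_u32(f->width, MODEL_MIN_W, MODEL_MAX_W);
	f->height = clamp_u32(f->height, MODEL_MIN_H, MODEL_MAX_H);
}
```

A client requesting 4096x2160 on either queue would get the clamped size
back and can tell immediately that the original request is unsupported.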
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 8bcff0b3626e..209ccf3d2d67 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -223,19 +223,19 @@ static int vidioc_try_fmt(struct v4l2_format *f,
 
 	pix_fmt_mp->field = V4L2_FIELD_NONE;
 
+	pix_fmt_mp->width = clamp(pix_fmt_mp->width,
+				MTK_VDEC_MIN_W,
+				MTK_VDEC_MAX_W);
+	pix_fmt_mp->height = clamp(pix_fmt_mp->height,
+				MTK_VDEC_MIN_H,
+				MTK_VDEC_MAX_H);
+
 	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 		pix_fmt_mp->num_planes = 1;
 		pix_fmt_mp->plane_fmt[0].bytesperline = 0;
 	} else if (f->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 		int tmp_w, tmp_h;
 
-		pix_fmt_mp->height = clamp(pix_fmt_mp->height,
-					MTK_VDEC_MIN_H,
-					MTK_VDEC_MAX_H);
-		pix_fmt_mp->width = clamp(pix_fmt_mp->width,
-					MTK_VDEC_MIN_W,
-					MTK_VDEC_MAX_W);
-
 		/*
 		 * Find next closer width align 64, heign align 64, size align
 		 * 64 rectangle
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 13/15] media: mtk-vcodec: make flush buffer reusable by encoder
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

The flush buffer is a special buffer that tells the decoder driver to
send an empty CAPTURE frame to the client with V4L2_BUF_FLAG_LAST set.

We need similar functionality for the encoder; however, the flush buffer
currently depends on decoder-specific structures and thus cannot be
reused with the encoder.

Fix this by testing for this buffer by its VB2 address, and not through
a dedicated flag stored in a higher-level decoder structure. This also
allows us to remove said flag and simplify the code a bit.

Since the flush buffer should never be used in the stateless decoder,
also add safeguards to check against it.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
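Note (not part of the commit): a userspace sketch of identifying a
sentinel buffer by its address rather than by a flag stored in a wrapper
structure. The model_* types and count_client_bufs() are hypothetical and
only illustrate the comparison used in the hunks below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic buffer, shared by encoder and decoder. */
struct model_buf {
	int index;
};

/* The context embeds the flush buffer, so its address is unique. */
struct model_ctx {
	struct model_buf empty_flush_buf;
};

/* No wrapper type or flag needed: compare against the embedded member. */
static bool is_flush_buf(const struct model_ctx *ctx,
			 const struct model_buf *buf)
{
	return buf == &ctx->empty_flush_buf;
}

/*
 * Models a stop_streaming-style drain: every queued buffer except the
 * flush buffer belongs to the client and would be handed back.
 */
static int count_client_bufs(const struct model_ctx *ctx,
			     const struct model_buf *const *queue, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!is_flush_buf(ctx, queue[i]))
			count++;
	}
	return count;
}
```

Because the check needs nothing beyond the context and a generic buffer
pointer, the same test can be used from code that knows nothing about
decoder-specific buffer wrappers.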
 .../media/platform/mtk-vcodec/mtk_vcodec_dec.c    |  9 ++-------
 .../media/platform/mtk-vcodec/mtk_vcodec_dec.h    |  2 --
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c      | 12 +-----------
 .../platform/mtk-vcodec/mtk_vcodec_dec_stateful.c | 15 +++++++++------
 .../media/platform/mtk-vcodec/mtk_vcodec_drv.h    |  6 ++++--
 5 files changed, 16 insertions(+), 28 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 209ccf3d2d67..1c86130bb52d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -86,8 +86,7 @@ static int vidioc_decoder_cmd(struct file *file, void *priv,
 			mtk_v4l2_debug(1, "Capture stream is off. No need to flush.");
 			return 0;
 		}
-		v4l2_m2m_buf_queue(ctx->m2m_ctx,
-				   &ctx->empty_flush_buf->m2m_buf.vb);
+		v4l2_m2m_buf_queue(ctx->m2m_ctx, &ctx->empty_flush_buf.vb);
 		v4l2_m2m_try_schedule(ctx->m2m_ctx);
 		break;
 
@@ -779,8 +778,6 @@ int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 	if (vb->vb2_queue->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 		buf->used = false;
 		buf->queued_in_v4l2 = false;
-	} else {
-		buf->lastframe = false;
 	}
 
 	return 0;
@@ -807,9 +804,7 @@ void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 
 	if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 		while ((src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx))) {
-			struct mtk_video_dec_buf *buf_info = container_of(
-				 src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-			if (!buf_info->lastframe) {
+			if (src_buf != &ctx->empty_flush_buf.vb) {
 				struct media_request *req =
 					src_buf->vb2_buf.req_obj.req;
 				v4l2_m2m_buf_done(src_buf,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
index a2949e1bc7fe..a510e74251e6 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
@@ -42,7 +42,6 @@ struct vdec_fb {
  * @queued_in_vb2:	Capture buffer is queue in vb2
  * @queued_in_v4l2:	Capture buffer is in v4l2 driver, but not in vb2
  *			queue yet
- * @lastframe:		Intput buffer is last buffer - EOS
  * @error:		An unrecoverable error occurs on this buffer.
  * @frame_buffer:	Decode status, and buffer information of Capture buffer
  * @bs_buffer:	Output buffer info
@@ -55,7 +54,6 @@ struct mtk_video_dec_buf {
 	bool	used;
 	bool	queued_in_vb2;
 	bool	queued_in_v4l2;
-	bool	lastframe;
 
 	bool	error;
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index e0526c0900c8..4789e669c258 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -82,21 +82,14 @@ static int fops_vcodec_open(struct file *file)
 {
 	struct mtk_vcodec_dev *dev = video_drvdata(file);
 	struct mtk_vcodec_ctx *ctx = NULL;
-	struct mtk_video_dec_buf *mtk_buf = NULL;
 	int ret = 0;
 	struct vb2_queue *src_vq;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
-	mtk_buf = kzalloc(sizeof(*mtk_buf), GFP_KERNEL);
-	if (!mtk_buf) {
-		kfree(ctx);
-		return -ENOMEM;
-	}
 
 	mutex_lock(&dev->dev_mutex);
-	ctx->empty_flush_buf = mtk_buf;
 	ctx->id = dev->id_counter++;
 	v4l2_fh_init(&ctx->fh, video_devdata(file));
 	file->private_data = &ctx->fh;
@@ -122,8 +115,7 @@ static int fops_vcodec_open(struct file *file)
 	}
 	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
 				V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
-	ctx->empty_flush_buf->m2m_buf.vb.vb2_buf.vb2_queue = src_vq;
-	ctx->empty_flush_buf->lastframe = true;
+	ctx->empty_flush_buf.vb.vb2_buf.vb2_queue = src_vq;
 	mtk_vcodec_dec_set_default_params(ctx);
 
 	if (v4l2_fh_is_singular(&ctx->fh)) {
@@ -161,7 +153,6 @@ static int fops_vcodec_open(struct file *file)
 err_ctrls_setup:
 	v4l2_fh_del(&ctx->fh);
 	v4l2_fh_exit(&ctx->fh);
-	kfree(ctx->empty_flush_buf);
 	kfree(ctx);
 	mutex_unlock(&dev->dev_mutex);
 
@@ -192,7 +183,6 @@ static int fops_vcodec_release(struct file *file)
 	v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
 	list_del_init(&ctx->list);
-	kfree(ctx->empty_flush_buf);
 	kfree(ctx);
 	mutex_unlock(&dev->dev_mutex);
 	return 0;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 3666c7e73bff..8d4ec8c7728b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -337,8 +337,6 @@ static void mtk_vdec_worker(struct work_struct *work)
 		return;
 	}
 
-	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
 	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
 				    m2m_buf.vb);
 
@@ -359,7 +357,7 @@ static void mtk_vdec_worker(struct work_struct *work)
 			pfb->base_y.va, &pfb->base_y.dma_addr,
 			&pfb->base_c.dma_addr, pfb->base_y.size);
 
-	if (src_buf_info->lastframe) {
+	if (src_buf == &ctx->empty_flush_buf.vb) {
 		mtk_v4l2_debug(1, "Got empty flush input buffer.");
 		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
 
@@ -380,6 +378,10 @@ static void mtk_vdec_worker(struct work_struct *work)
 		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
 		return;
 	}
+
+	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+
 	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
 	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
 	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
@@ -459,7 +461,6 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 	unsigned int dpbsize = 1, i = 0;
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
-	struct mtk_video_dec_buf *buf = NULL;
 	struct mtk_q_data *dst_q_data;
 
 	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
@@ -469,6 +470,8 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 	 * check if this buffer is ready to be used after decode
 	 */
 	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
+		struct mtk_video_dec_buf *buf;
+
 		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
 		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
 				   m2m_buf.vb);
@@ -498,8 +501,8 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 		mtk_v4l2_err("No src buffer");
 		return;
 	}
-	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-	if (buf->lastframe) {
+
+	if (src_buf == &ctx->empty_flush_buf.vb) {
 		/* This shouldn't happen. Just in case. */
 		mtk_v4l2_err("Invalid flush buffer.");
 		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 79d6a1e6c916..8dab9f520283 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -13,6 +13,7 @@
 #include <media/v4l2-ctrls.h>
 #include <media/v4l2-device.h>
 #include <media/v4l2-ioctl.h>
+#include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-core.h>
 #include "mtk_vcodec_util.h"
 
@@ -250,7 +251,8 @@ struct vdec_pic_info {
  * @decode_work: worker for the decoding
  * @encode_work: worker for the encoding
  * @last_decoded_picinfo: pic information get from latest decode
- * @empty_flush_buf: a fake size-0 capture buffer that indicates flush
+ * @empty_flush_buf: a fake size-0 capture buffer that indicates flush. Only
+ *		     to be used with encoder and stateful decoder.
  * @current_codec: current set input codec, in V4L2 pixel format
  *
  * @colorspace: enum v4l2_colorspace; supplemental to pixelformat
@@ -289,7 +291,7 @@ struct mtk_vcodec_ctx {
 	struct work_struct decode_work;
 	struct work_struct encode_work;
 	struct vdec_pic_info last_decoded_picinfo;
-	struct mtk_video_dec_buf *empty_flush_buf;
+	struct v4l2_m2m_buffer empty_flush_buf;
 
 	u32 current_codec;
 
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 13/15] media: mtk-vcodec: make flush buffer reusable by encoder
@ 2021-02-26 10:01   ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Alexandre Courbot, linux-kernel, linux-mediatek, Hans Verkuil,
	Mauro Carvalho Chehab, linux-media

The flush buffer is a special buffer that tells the decoder driver to
send an empty CAPTURE frame to the client with V4L2_BUF_FLAG_LAST set.

We need similar functionality for the encoder; however, the flush buffer
currently depends on decoder-specific structures and thus cannot be
reused with the encoder.

Fix this by testing for this buffer by its VB2 address, and not through
a dedicated flag stored in a higher-level decoder structure. This also
allows us to remove said flag and simplify the code a bit.

Since the flush buffer should never be used in the stateless decoder,
also add safeguards to check against it.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 .../media/platform/mtk-vcodec/mtk_vcodec_dec.c    |  9 ++-------
 .../media/platform/mtk-vcodec/mtk_vcodec_dec.h    |  2 --
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c      | 12 +-----------
 .../platform/mtk-vcodec/mtk_vcodec_dec_stateful.c | 15 +++++++++------
 .../media/platform/mtk-vcodec/mtk_vcodec_drv.h    |  6 ++++--
 5 files changed, 16 insertions(+), 28 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 209ccf3d2d67..1c86130bb52d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -86,8 +86,7 @@ static int vidioc_decoder_cmd(struct file *file, void *priv,
 			mtk_v4l2_debug(1, "Capture stream is off. No need to flush.");
 			return 0;
 		}
-		v4l2_m2m_buf_queue(ctx->m2m_ctx,
-				   &ctx->empty_flush_buf->m2m_buf.vb);
+		v4l2_m2m_buf_queue(ctx->m2m_ctx, &ctx->empty_flush_buf.vb);
 		v4l2_m2m_try_schedule(ctx->m2m_ctx);
 		break;
 
@@ -779,8 +778,6 @@ int vb2ops_vdec_buf_init(struct vb2_buffer *vb)
 	if (vb->vb2_queue->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 		buf->used = false;
 		buf->queued_in_v4l2 = false;
-	} else {
-		buf->lastframe = false;
 	}
 
 	return 0;
@@ -807,9 +804,7 @@ void vb2ops_vdec_stop_streaming(struct vb2_queue *q)
 
 	if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 		while ((src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx))) {
-			struct mtk_video_dec_buf *buf_info = container_of(
-				 src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-			if (!buf_info->lastframe) {
+			if (src_buf != &ctx->empty_flush_buf.vb) {
 				struct media_request *req =
 					src_buf->vb2_buf.req_obj.req;
 				v4l2_m2m_buf_done(src_buf,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
index a2949e1bc7fe..a510e74251e6 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.h
@@ -42,7 +42,6 @@ struct vdec_fb {
  * @queued_in_vb2:	Capture buffer is queue in vb2
  * @queued_in_v4l2:	Capture buffer is in v4l2 driver, but not in vb2
  *			queue yet
- * @lastframe:		Intput buffer is last buffer - EOS
  * @error:		An unrecoverable error occurs on this buffer.
  * @frame_buffer:	Decode status, and buffer information of Capture buffer
  * @bs_buffer:	Output buffer info
@@ -55,7 +54,6 @@ struct mtk_video_dec_buf {
 	bool	used;
 	bool	queued_in_vb2;
 	bool	queued_in_v4l2;
-	bool	lastframe;
 
 	bool	error;
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index e0526c0900c8..4789e669c258 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -82,21 +82,14 @@ static int fops_vcodec_open(struct file *file)
 {
 	struct mtk_vcodec_dev *dev = video_drvdata(file);
 	struct mtk_vcodec_ctx *ctx = NULL;
-	struct mtk_video_dec_buf *mtk_buf = NULL;
 	int ret = 0;
 	struct vb2_queue *src_vq;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
 		return -ENOMEM;
-	mtk_buf = kzalloc(sizeof(*mtk_buf), GFP_KERNEL);
-	if (!mtk_buf) {
-		kfree(ctx);
-		return -ENOMEM;
-	}
 
 	mutex_lock(&dev->dev_mutex);
-	ctx->empty_flush_buf = mtk_buf;
 	ctx->id = dev->id_counter++;
 	v4l2_fh_init(&ctx->fh, video_devdata(file));
 	file->private_data = &ctx->fh;
@@ -122,8 +115,7 @@ static int fops_vcodec_open(struct file *file)
 	}
 	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
 				V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
-	ctx->empty_flush_buf->m2m_buf.vb.vb2_buf.vb2_queue = src_vq;
-	ctx->empty_flush_buf->lastframe = true;
+	ctx->empty_flush_buf.vb.vb2_buf.vb2_queue = src_vq;
 	mtk_vcodec_dec_set_default_params(ctx);
 
 	if (v4l2_fh_is_singular(&ctx->fh)) {
@@ -161,7 +153,6 @@ static int fops_vcodec_open(struct file *file)
 err_ctrls_setup:
 	v4l2_fh_del(&ctx->fh);
 	v4l2_fh_exit(&ctx->fh);
-	kfree(ctx->empty_flush_buf);
 	kfree(ctx);
 	mutex_unlock(&dev->dev_mutex);
 
@@ -192,7 +183,6 @@ static int fops_vcodec_release(struct file *file)
 	v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
 	list_del_init(&ctx->list);
-	kfree(ctx->empty_flush_buf);
 	kfree(ctx);
 	mutex_unlock(&dev->dev_mutex);
 	return 0;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 3666c7e73bff..8d4ec8c7728b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -337,8 +337,6 @@ static void mtk_vdec_worker(struct work_struct *work)
 		return;
 	}
 
-	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
-				    m2m_buf.vb);
 	dst_buf_info = container_of(dst_buf, struct mtk_video_dec_buf,
 				    m2m_buf.vb);
 
@@ -359,7 +357,7 @@ static void mtk_vdec_worker(struct work_struct *work)
 			pfb->base_y.va, &pfb->base_y.dma_addr,
 			&pfb->base_c.dma_addr, pfb->base_y.size);
 
-	if (src_buf_info->lastframe) {
+	if (src_buf == &ctx->empty_flush_buf.vb) {
 		mtk_v4l2_debug(1, "Got empty flush input buffer.");
 		src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
 
@@ -380,6 +378,10 @@ static void mtk_vdec_worker(struct work_struct *work)
 		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
 		return;
 	}
+
+	src_buf_info = container_of(src_buf, struct mtk_video_dec_buf,
+				    m2m_buf.vb);
+
 	buf.va = vb2_plane_vaddr(&src_buf->vb2_buf, 0);
 	buf.dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
 	buf.size = (size_t)src_buf->vb2_buf.planes[0].bytesused;
@@ -459,7 +461,6 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 	unsigned int dpbsize = 1, i = 0;
 	struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vb2_v4l2 = NULL;
-	struct mtk_video_dec_buf *buf = NULL;
 	struct mtk_q_data *dst_q_data;
 
 	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
@@ -469,6 +470,8 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 	 * check if this buffer is ready to be used after decode
 	 */
 	if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
+		struct mtk_video_dec_buf *buf;
+
 		vb2_v4l2 = to_vb2_v4l2_buffer(vb);
 		buf = container_of(vb2_v4l2, struct mtk_video_dec_buf,
 				   m2m_buf.vb);
@@ -498,8 +501,8 @@ static void vb2ops_vdec_stateful_buf_queue(struct vb2_buffer *vb)
 		mtk_v4l2_err("No src buffer");
 		return;
 	}
-	buf = container_of(src_buf, struct mtk_video_dec_buf, m2m_buf.vb);
-	if (buf->lastframe) {
+
+	if (src_buf == &ctx->empty_flush_buf.vb) {
 		/* This shouldn't happen. Just in case. */
 		mtk_v4l2_err("Invalid flush buffer.");
 		v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 79d6a1e6c916..8dab9f520283 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -13,6 +13,7 @@
 #include <media/v4l2-ctrls.h>
 #include <media/v4l2-device.h>
 #include <media/v4l2-ioctl.h>
+#include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-core.h>
 #include "mtk_vcodec_util.h"
 
@@ -250,7 +251,8 @@ struct vdec_pic_info {
  * @decode_work: worker for the decoding
  * @encode_work: worker for the encoding
  * @last_decoded_picinfo: pic information get from latest decode
- * @empty_flush_buf: a fake size-0 capture buffer that indicates flush
+ * @empty_flush_buf: a fake size-0 capture buffer that indicates flush. Only
+ *		     to be used with encoder and stateful decoder.
  * @current_codec: current set input codec, in V4L2 pixel format
  *
  * @colorspace: enum v4l2_colorspace; supplemental to pixelformat
@@ -289,7 +291,7 @@ struct mtk_vcodec_ctx {
 	struct work_struct decode_work;
 	struct work_struct encode_work;
 	struct vdec_pic_info last_decoded_picinfo;
-	struct mtk_video_dec_buf *empty_flush_buf;
+	struct v4l2_m2m_buffer empty_flush_buf;
 
 	u32 current_codec;
 
-- 
2.30.1.766.gb4fecdf3b7-goog


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 14/15] media: mtk-vcodec: venc: support START and STOP commands
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Alexandre Courbot

The V4L2 encoder specification requires encoders to support the
V4L2_ENC_CMD_START and V4L2_ENC_CMD_STOP commands. Add support for these
to the mtk-vcodec encoder by reusing the same flush buffer as used by
the decoder driver.

Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   2 +
 .../platform/mtk-vcodec/mtk_vcodec_enc.c      | 123 +++++++++++++++++-
 .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |   4 +
 3 files changed, 122 insertions(+), 7 deletions(-)
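
Also not part of the commit: a simplified userspace model of the flush
handshake, as a reading aid for the diff below. The model_* types and
functions are stand-ins invented for illustration, not the driver's
structures. It only shows the two ideas the patch relies on: the flush
buffer is recognised by its address, and is_flushing is cleared when the
empty LAST buffer is dequeued.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a buffer; not a real kernel type. */
struct model_buf {
	unsigned int bytesused;
	bool last;			/* models V4L2_BUF_FLAG_LAST */
};

#define MODEL_QUEUE_DEPTH 8

struct model_ctx {
	struct model_buf *src_queue[MODEL_QUEUE_DEPTH];	/* pending OUTPUT buffers */
	size_t src_count;
	struct model_buf empty_flush_buf;	/* sentinel, matched by address */
	bool is_flushing;
};

static int model_queue_src(struct model_ctx *ctx, struct model_buf *buf)
{
	if (ctx->src_count == MODEL_QUEUE_DEPTH)
		return -ENOBUFS;
	ctx->src_queue[ctx->src_count++] = buf;
	return 0;
}

/* Models V4L2_ENC_CMD_STOP: refuse a second flush while one is pending. */
static int model_cmd_stop(struct model_ctx *ctx)
{
	int ret;

	if (ctx->is_flushing)
		return -EBUSY;
	ret = model_queue_src(ctx, &ctx->empty_flush_buf);
	if (ret)
		return ret;
	ctx->is_flushing = true;
	return 0;
}

/* Models one worker run: pop the oldest source buffer and fill dst. */
static int model_worker(struct model_ctx *ctx, struct model_buf *dst)
{
	struct model_buf *src;
	size_t i;

	if (ctx->src_count == 0)
		return -EAGAIN;
	src = ctx->src_queue[0];
	for (i = 1; i < ctx->src_count; i++)
		ctx->src_queue[i - 1] = ctx->src_queue[i];
	ctx->src_count--;

	if (src == &ctx->empty_flush_buf) {
		/* Flush sentinel: return an empty buffer flagged LAST. */
		dst->bytesused = 0;
		dst->last = true;
		return 0;
	}
	dst->bytesused = src->bytesused;	/* pretend this is the encode */
	dst->last = false;
	return 0;
}

/* Models DQBUF on CAPTURE: only the empty LAST buffer ends the flush. */
static void model_dqbuf_capture(struct model_ctx *ctx,
				const struct model_buf *buf)
{
	if (buf->last && buf->bytesused == 0 && ctx->is_flushing)
		ctx->is_flushing = false;
}
```

In this model, a frame queued before the stop command is still encoded
normally, and only the following worker run produces the empty LAST
buffer.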

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 8dab9f520283..73da6c7b69a8 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -253,6 +253,7 @@ struct vdec_pic_info {
  * @last_decoded_picinfo: pic information get from latest decode
  * @empty_flush_buf: a fake size-0 capture buffer that indicates flush. Only
  *		     to be used with encoder and stateful decoder.
+ * @is_flushing: set to true if flushing is in progress.
  * @current_codec: current set input codec, in V4L2 pixel format
  *
  * @colorspace: enum v4l2_colorspace; supplemental to pixelformat
@@ -292,6 +293,7 @@ struct mtk_vcodec_ctx {
 	struct work_struct encode_work;
 	struct vdec_pic_info last_decoded_picinfo;
 	struct v4l2_m2m_buffer empty_flush_buf;
+	bool is_flushing;
 
 	u32 current_codec;
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
index 8c917969c2f1..4de381b522ae 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
@@ -658,6 +658,7 @@ static int vidioc_venc_dqbuf(struct file *file, void *priv,
 			     struct v4l2_buffer *buf)
 {
 	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	int ret;
 
 	if (ctx->state == MTK_STATE_ABORT) {
 		mtk_v4l2_err("[%d] Call on QBUF after unrecoverable error",
@@ -665,7 +666,77 @@ static int vidioc_venc_dqbuf(struct file *file, void *priv,
 		return -EIO;
 	}
 
-	return v4l2_m2m_dqbuf(file, ctx->m2m_ctx, buf);
+	ret = v4l2_m2m_dqbuf(file, ctx->m2m_ctx, buf);
+	if (ret)
+		return ret;
+
+	/*
+	 * Complete flush if the user dequeued the 0-payload LAST buffer.
+	 * We check the payload because a buffer with the LAST flag can also
+	 * be seen during resolution changes. If we happen to be flushing at
+	 * that time, the last buffer before the resolution change could be
+	 * mistaken for the buffer generated by the flush, which would end
+	 * the flush earlier than we want.
+	 */
+	if (!V4L2_TYPE_IS_OUTPUT(buf->type) &&
+	    buf->flags & V4L2_BUF_FLAG_LAST &&
+	    buf->m.planes[0].bytesused == 0 &&
+	    ctx->is_flushing) {
+		/*
+		 * Last CAPTURE buffer is dequeued, we can allow another flush
+		 * to take place.
+		 */
+		ctx->is_flushing = false;
+	}
+
+	return 0;
+}
+
+static int vidioc_encoder_cmd(struct file *file, void *priv,
+			      struct v4l2_encoder_cmd *cmd)
+{
+	struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+	struct vb2_queue *src_vq, *dst_vq;
+	int ret;
+
+	ret = v4l2_m2m_ioctl_try_encoder_cmd(file, priv, cmd);
+	if (ret)
+		return ret;
+
+	/* Calling START or STOP is invalid if a flush is in progress */
+	if (ctx->is_flushing)
+		return -EBUSY;
+
+	mtk_v4l2_debug(1, "encoder cmd=%u", cmd->cmd);
+
+	dst_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
+				V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+	switch (cmd->cmd) {
+	case V4L2_ENC_CMD_STOP:
+		src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
+				V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+		if (!vb2_is_streaming(src_vq)) {
+			mtk_v4l2_debug(1, "Output stream is off. No need to flush.");
+			return 0;
+		}
+		if (!vb2_is_streaming(dst_vq)) {
+			mtk_v4l2_debug(1, "Capture stream is off. No need to flush.");
+			return 0;
+		}
+		ctx->is_flushing = true;
+		v4l2_m2m_buf_queue(ctx->m2m_ctx, &ctx->empty_flush_buf.vb);
+		v4l2_m2m_try_schedule(ctx->m2m_ctx);
+		break;
+
+	case V4L2_ENC_CMD_START:
+		vb2_clear_last_buffer_dequeued(dst_vq);
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 const struct v4l2_ioctl_ops mtk_venc_ioctl_ops = {
@@ -701,6 +772,9 @@ const struct v4l2_ioctl_ops mtk_venc_ioctl_ops = {
 
 	.vidioc_g_selection		= vidioc_venc_g_selection,
 	.vidioc_s_selection		= vidioc_venc_s_selection,
+
+	.vidioc_encoder_cmd		= vidioc_encoder_cmd,
+	.vidioc_try_encoder_cmd		= v4l2_m2m_ioctl_try_encoder_cmd,
 };
 
 static int vb2ops_venc_queue_setup(struct vb2_queue *vq,
@@ -857,9 +931,27 @@ static void vb2ops_venc_stop_streaming(struct vb2_queue *q)
 			dst_buf->vb2_buf.planes[0].bytesused = 0;
 			v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_ERROR);
 		}
+		/* STREAMOFF on the CAPTURE queue completes any ongoing flush */
+		if (ctx->is_flushing) {
+			mtk_v4l2_debug(1, "STREAMOFF called while flushing");
+			v4l2_m2m_buf_remove_by_buf(&ctx->m2m_ctx->out_q_ctx,
+						   &ctx->empty_flush_buf.vb);
+			ctx->is_flushing = false;
+		}
 	} else {
-		while ((src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx)))
-			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+		while ((src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx))) {
+			if (src_buf != &ctx->empty_flush_buf.vb)
+				v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+		}
+		if (ctx->is_flushing) {
+			/*
+			 * If we are in the middle of a flush, put the flush
+			 * buffer back into the queue so the next CAPTURE
+			 * buffer gets returned with the LAST flag set.
+			 */
+			v4l2_m2m_buf_queue(ctx->m2m_ctx,
+					   &ctx->empty_flush_buf.vb);
+		}
 	}
 
 	if ((q->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE &&
@@ -955,12 +1047,15 @@ static int mtk_venc_param_change(struct mtk_vcodec_ctx *ctx)
 {
 	struct venc_enc_param enc_prm;
 	struct vb2_v4l2_buffer *vb2_v4l2 = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
-	struct mtk_video_enc_buf *mtk_buf =
-			container_of(vb2_v4l2, struct mtk_video_enc_buf,
-				     m2m_buf.vb);
-
+	struct mtk_video_enc_buf *mtk_buf;
 	int ret = 0;
 
+	/* Don't upcast the empty flush buffer */
+	if (vb2_v4l2 == &ctx->empty_flush_buf.vb)
+		return 0;
+
+	mtk_buf = container_of(vb2_v4l2, struct mtk_video_enc_buf, m2m_buf.vb);
+
 	memset(&enc_prm, 0, sizeof(enc_prm));
 	if (mtk_buf->param_change == MTK_ENCODE_PARAM_NONE)
 		return 0;
@@ -1046,6 +1141,20 @@ static void mtk_venc_worker(struct work_struct *work)
 	}
 
 	src_buf = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+
+	/*
+	 * If we see the flush buffer, send an empty buffer with the LAST flag
+	 * to the client. is_flushing will be reset at the time the buffer
+	 * is dequeued.
+	 */
+	if (src_buf == &ctx->empty_flush_buf.vb) {
+		vb2_set_plane_payload(&dst_buf->vb2_buf, 0, 0);
+		dst_buf->flags |= V4L2_BUF_FLAG_LAST;
+		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_DONE);
+		v4l2_m2m_job_finish(ctx->dev->m2m_dev_enc, ctx->m2m_ctx);
+		return;
+	}
+
 	memset(&frm_buf, 0, sizeof(frm_buf));
 	for (i = 0; i < src_buf->vb2_buf.num_planes ; i++) {
 		frm_buf.fb_addr[i].dma_addr =
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
index be3842e6ca47..b2ba8db32fea 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
@@ -150,6 +150,7 @@ static int fops_vcodec_open(struct file *file)
 	struct mtk_vcodec_dev *dev = video_drvdata(file);
 	struct mtk_vcodec_ctx *ctx = NULL;
 	int ret = 0;
+	struct vb2_queue *src_vq;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
@@ -183,6 +184,9 @@ static int fops_vcodec_open(struct file *file)
 				ret);
 		goto err_m2m_ctx_init;
 	}
+	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
+				V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+	ctx->empty_flush_buf.vb.vb2_buf.vb2_queue = src_vq;
 	mtk_vcodec_enc_set_default_params(ctx);
 
 	if (v4l2_fh_is_singular(&ctx->fh)) {
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 15/15] media: mtk-vcodec: venc: make sure buffer exists in list before removing
  2021-02-26 10:01 ` Alexandre Courbot
@ 2021-02-26 10:01   ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-02-26 10:01 UTC (permalink / raw)
  To: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong
  Cc: Mauro Carvalho Chehab, Hans Verkuil, linux-media, linux-kernel,
	linux-mediatek, Hsin-Yi Wang, Alexandre Courbot

From: Hsin-Yi Wang <hsinyi@chromium.org>

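empty_flush_buf can be removed from the source queue in mtk_venc_worker()
and then removed again in vb2ops_venc_stop_streaming().
v4l2_m2m_buf_remove_by_buf() does not check whether the buffer is still
on the list, so the second removal crashes the kernel. Walk the source
queue and only remove the flush buffer if it is still queued.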

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
[acourbot: fix commit log a bit]
Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
---
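Not part of the commit: a minimal userspace sketch of the "check
membership before removing" pattern used in the diff below. The sk_*
list is a hypothetical stand-in for the kernel's intrusive list and only
mimics its shape. Deleting a node leaves its links stale, much as the
kernel's list_del() poisons them, so a second unconditional delete would
dereference invalid pointers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly-linked list node. */
struct sk_node {
	struct sk_node *prev, *next;
};

static void sk_list_init(struct sk_node *head)
{
	head->prev = head->next = head;
}

static void sk_list_add_tail(struct sk_node *head, struct sk_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n. Its links are invalid afterwards; deleting twice is a bug. */
static void sk_list_del(struct sk_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
}

/*
 * Remove target only if it is currently on the list.
 * Returns true if it was found and removed, false otherwise.
 */
static bool sk_remove_if_queued(struct sk_node *head, struct sk_node *target)
{
	struct sk_node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (pos == target) {
			sk_list_del(pos);
			return true;
		}
	}
	return false;
}
```

Calling sk_remove_if_queued() a second time for the same node is a
harmless no-op, which is the property the STREAMOFF path needs when the
worker may already have consumed the flush buffer.
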
 .../media/platform/mtk-vcodec/mtk_vcodec_enc.c   | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
index 4de381b522ae..8af7e840b958 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
@@ -933,9 +933,21 @@ static void vb2ops_venc_stop_streaming(struct vb2_queue *q)
 		}
 		/* STREAMOFF on the CAPTURE queue completes any ongoing flush */
 		if (ctx->is_flushing) {
+			struct v4l2_m2m_buffer *b, *n;
+
 			mtk_v4l2_debug(1, "STREAMOFF called while flushing");
-			v4l2_m2m_buf_remove_by_buf(&ctx->m2m_ctx->out_q_ctx,
-						   &ctx->empty_flush_buf.vb);
+			/*
+			 * STREAMOFF could be called before the flush buffer is
+			 * dequeued. Check whether empty flush buf is still in
+			 * queue before removing it.
+			 */
+			v4l2_m2m_for_each_src_buf_safe(ctx->m2m_ctx, b, n) {
+				if (b == &ctx->empty_flush_buf) {
+					v4l2_m2m_src_buf_remove_by_buf(
+							ctx->m2m_ctx, &b->vb);
+					break;
+				}
+			}
 			ctx->is_flushing = false;
 		}
 	} else {
-- 
2.30.1.766.gb4fecdf3b7-goog


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-02-26 10:01   ` Alexandre Courbot
@ 2021-03-03 21:30     ` Ezequiel Garcia
  -1 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 21:30 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hello Alex,

Thanks for the patch.

On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> From: Yunfei Dong <yunfei.dong@mediatek.com>
>
> Support the stateless codec API that will be used by MT8183.
>
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> [acourbot: refactor, cleanup and split]
> Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
>  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
>  5 files changed, 503 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
>
[..]

> +
> +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_STATELESS_H264_SPS,
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +               .needed_in_request = true,

This "needed_in_request" is not really required, as controls
are not volatile, and their value is stored per-context (per-fd).

It's perfectly valid for an application to pass the SPS control
at the beginning of the sequence, and then omit it
in further requests.
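
To illustrate what I mean, here is a rough userspace model (the model_*
names are made up for this sketch and are not V4L2 types): a control
carried by a request updates the value stored in the context, and a
request that omits the control simply uses the stored value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Made-up stand-in for an SPS payload. */
struct model_sps {
	unsigned int width, height;
};

/* A request may or may not carry the SPS control. */
struct model_request {
	bool has_sps;
	struct model_sps sps;
};

/* Per-context (per-fd) state: remembers the last SPS it was given. */
struct model_dec_ctx {
	struct model_sps sps;
	bool sps_valid;
};

/*
 * Apply a request and return the SPS that decoding would use,
 * or NULL if no SPS was ever provided on this context.
 */
static const struct model_sps *
model_apply_request(struct model_dec_ctx *ctx, const struct model_request *req)
{
	if (req->has_sps) {
		ctx->sps = req->sps;
		ctx->sps_valid = true;
	}
	return ctx->sps_valid ? &ctx->sps : NULL;
}
```

So a check that rejects every request lacking the SPS would refuse a
sequence that is otherwise valid.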

> +       },
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_STATELESS_H264_PPS,
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +               .needed_in_request = true,
> +       },
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +               .needed_in_request = true,
> +       },
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +               .needed_in_request = true,
> +       },
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> +                       .menu_skip_mask =
> +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +       },
> +       {
> +               .cfg = {
> +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> +               },
> +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> +       },
> +};

Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
value the driver supports. Judging from a later patch, that seems to be
V4L2_STATELESS_H264_START_CODE_ANNEX_B.
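
A rough sketch of how such an entry could advertise a single start code
mode, following the shape of the DECODE_MODE entry in the quoted patch.
This is a userspace model with stand-in types: menu_cfg and
menu_item_selectable are invented for the example, and the range plus
skip-mask check is only my understanding of how a menu control is
validated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins mirroring the uAPI enum order: NONE = 0, ANNEX_B = 1. */
enum start_code {
	START_CODE_NONE = 0,
	START_CODE_ANNEX_B = 1,
};

/* Cut-down stand-in for the menu-related fields of a control config. */
struct menu_cfg {
	int64_t min;
	int64_t max;
	int64_t def;
	uint64_t menu_skip_mask;
};

/* min == def == max pins the menu to exactly one item. */
static const struct menu_cfg start_code_cfg = {
	.min = START_CODE_ANNEX_B,
	.def = START_CODE_ANNEX_B,
	.max = START_CODE_ANNEX_B,
};

/* True if a menu item lies inside [min, max] and is not masked out. */
static bool menu_item_selectable(const struct menu_cfg *cfg, int64_t item)
{
	assert(cfg);
	if (item < 0 || item > 63)
		return false;
	if (item < cfg->min || item > cfg->max)
		return false;
	return !(cfg->menu_skip_mask & (UINT64_C(1) << item));
}
```

With the range pinned like this, an application enumerating the menu
would find Annex B as the only selectable item and could tell up front
that start codes are expected in the bitstream.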

> +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> +
> +static const struct mtk_video_fmt mtk_video_formats[] = {
> +       {
> +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> +               .type = MTK_FMT_DEC,
> +               .num_planes = 1,
> +       },
> +       {
> +               .fourcc = V4L2_PIX_FMT_MM21,
> +               .type = MTK_FMT_FRAME,
> +               .num_planes = 2,
> +       },
> +};
> +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> +#define DEFAULT_OUT_FMT_IDX    0
> +#define DEFAULT_CAP_FMT_IDX    1
> +
> +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> +       {
> +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> +               .stepwise = {
> +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> +               },
> +       },
> +};
> +
> +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> +
> +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> +                                              struct vdec_fb *fb)
> +{
> +       struct mtk_video_dec_buf *vdec_frame_buf =
> +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> +
> +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> +               unsigned int cap_c_size =
> +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> +
> +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> +       }
> +}
> +
> +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> +{
> +       struct mtk_video_dec_buf *framebuf =
> +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> +
> +       pfb = &framebuf->frame_buffer;
> +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);

Are you sure you need a CPU mapping? It seems strange.
I'll comment some more on the next patch(es).

> +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> +
> +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> +               pfb->base_c.dma_addr =
> +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> +       }
> +       mtk_v4l2_debug(1,
> +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> +               dst_buf->index, pfb,
> +               pfb->base_y.va, &pfb->base_y.dma_addr,
> +               &pfb->base_c.dma_addr, pfb->base_y.size,
> +               ctx->decoded_frame_cnt);
> +
> +       return pfb;
> +}
> +
> +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> +{
> +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> +
> +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> +}
> +
> +static int fops_media_request_validate(struct media_request *mreq)
> +{
> +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> +       struct mtk_vcodec_ctx *ctx = NULL;
> +       struct media_request_object *req_obj;
> +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> +       struct v4l2_ctrl *ctrl;
> +       unsigned int i;
> +
> +       switch (buffer_cnt) {
> +       case 1:
> +               /* We expect exactly one buffer with the request */
> +               break;
> +       case 0:
> +               mtk_v4l2_err("No buffer provided with the request");
> +               return -ENOENT;
> +       default:
> +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> +                            buffer_cnt);
> +               return -EINVAL;
> +       }
> +
> +       list_for_each_entry(req_obj, &mreq->objects, list) {
> +               struct vb2_buffer *vb;
> +
> +               if (vb2_request_object_is_buffer(req_obj)) {
> +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> +                       break;
> +               }
> +       }
> +
> +       if (!ctx) {
> +               mtk_v4l2_err("Cannot find buffer for request");
> +               return -ENOENT;
> +       }
> +
> +       parent_hdl = &ctx->ctrl_hdl;
> +
> +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> +       if (!hdl) {
> +               mtk_v4l2_err("Cannot find control handler for request\n");
> +               return -ENOENT;
> +       }
> +
> +       for (i = 0; i < NUM_CTRLS; i++) {
> +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> +                       continue;
> +               if (!mtk_stateless_controls[i].needed_in_request)
> +                       continue;
> +
> +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> +                                         mtk_stateless_controls[i].cfg.id);
> +               if (!ctrl) {
> +                       mtk_v4l2_err("Missing required codec control\n");
> +                       return -ENOENT;
> +               }
> +       }
> +
> +       v4l2_ctrl_request_hdl_put(hdl);
> +
> +       return vb2_request_validate(mreq);
> +}
> +
> +static void mtk_vdec_worker(struct work_struct *work)
> +{
> +       struct mtk_vcodec_ctx *ctx =
> +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> +       struct mtk_vcodec_dev *dev = ctx->dev;
> +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> +       struct vb2_buffer *vb2_src;
> +       struct mtk_vcodec_mem *bs_src;
> +       struct mtk_video_dec_buf *dec_buf_src;
> +       struct media_request *src_buf_req;
> +       struct vdec_fb *dst_buf;
> +       bool res_chg = false;
> +       int ret;
> +
> +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> +       if (vb2_v4l2_src == NULL) {
> +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> +               return;
> +       }
> +
> +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> +       if (vb2_v4l2_dst == NULL) {
> +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> +               return;
> +       }
> +
> +       vb2_src = &vb2_v4l2_src->vb2_buf;
> +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> +                                  m2m_buf.vb);
> +       bs_src = &dec_buf_src->bs_buffer;
> +
> +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> +                       ctx->id, src_buf->vb2_queue->type,
> +                       src_buf->index, src_buf, src_buf_info);
> +
> +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> +       if (!bs_src->va) {
> +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> +                            vb2_src->index);
> +               return;
> +       }
> +
> +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> +       /* Apply request controls. */
> +       src_buf_req = vb2_src->req_obj.req;
> +       if (src_buf_req)
> +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> +       else
> +               mtk_v4l2_err("vb2 buffer media request is NULL");
> +
> +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> +       if (ret) {
> +               mtk_v4l2_err(
> +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> +                       ctx->id, vb2_src->index, bs_src->size,
> +                       vb2_src->timestamp, ret, res_chg);
> +               if (ret == -EIO) {
> +                       mutex_lock(&ctx->lock);
> +                       dec_buf_src->error = true;
> +                       mutex_unlock(&ctx->lock);
> +               }
> +       }
> +
> +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> +
> +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> +
> +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> +}
> +
> +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> +{
> +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> +
> +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> +                       ctx->id, vb->vb2_queue->type,
> +                       vb->index, vb);
> +
> +       mutex_lock(&ctx->lock);
> +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> +       mutex_unlock(&ctx->lock);
> +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> +               return;
> +
> +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> +               vb->vb2_queue->type, vb->index, src_buf);
> +
> +       /* If an OUTPUT buffer, we may need to update the state */
> +       if (ctx->state == MTK_STATE_INIT) {
> +               ctx->state = MTK_STATE_HEADER;
> +               mtk_v4l2_debug(1, "Init driver from init to header.");

This state handling seems to exist only to keep the rest
of the stateful driver happy, right?

It makes me wonder whether splitting the stateless part into its
own driver wouldn't make maintenance easier.

What's the motivation for sharing the driver?

> +       } else {
> +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> +                               ctx->id, ctx->state);
> +       }
> +}
> +
> +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> +{
> +       bool res_chg;
> +
> +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> +}
> +
> +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> +};
> +
> +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> +{
> +       struct v4l2_ctrl *ctrl;
> +       unsigned int i;
> +
> +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> +       if (ctx->ctrl_hdl.error) {
> +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> +               return ctx->ctrl_hdl.error;
> +       }
> +
> +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> +                               &mtk_vcodec_dec_ctrl_ops,
> +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> +                               0, 32, 1, 1);
> +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;

Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
to return the DPB size. However, isn't this something userspace already knows?
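
For reference, here is one way userspace might bound the DPB size on its
own from the SPS it has already parsed. This is a sketch under
assumptions: the MaxDpbMbs values are quoted from memory of Table A-1 in
the H.264 spec and should be checked against it, level 1b and levels
above 5.2 are left out, and the function name is invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

struct level_limit {
	unsigned int level_idc;		/* level_idc as found in the SPS */
	unsigned int max_dpb_mbs;	/* MaxDpbMbs for that level */
};

/* Subset of the per-level limits (assumed values, verify before use). */
static const struct level_limit level_limits[] = {
	{ 10, 396 },    { 11, 900 },    { 12, 2376 },   { 13, 2376 },
	{ 20, 2376 },   { 21, 4752 },   { 22, 8100 },   { 30, 8100 },
	{ 31, 18000 },  { 32, 20480 },  { 40, 32768 },  { 41, 32768 },
	{ 42, 34816 },  { 50, 110400 }, { 51, 184320 }, { 52, 184320 },
};

/*
 * Upper bound on DPB frames: MaxDpbMbs divided by the frame size in
 * macroblocks, capped at 16. Unknown levels fall back to 16.
 */
static unsigned int h264_max_dpb_frames(unsigned int level_idc,
					unsigned int width_mbs,
					unsigned int height_mbs)
{
	unsigned int frame_mbs = width_mbs * height_mbs;
	size_t i;

	assert(frame_mbs > 0);
	for (i = 0; i < sizeof(level_limits) / sizeof(level_limits[0]); i++) {
		if (level_limits[i].level_idc == level_idc) {
			unsigned int frames =
				level_limits[i].max_dpb_mbs / frame_mbs;

			return frames > 16 ? 16 : frames;
		}
	}
	return 16;
}
```

For a 1920x1088 stream (120x68 macroblocks) at level 4.0 this gives 4
frames. A real client could tighten the bound further with the VUI
max_dec_frame_buffering field when it is present.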

> +
> +       for (i = 0; i < NUM_CTRLS; i++) {
> +               struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
> +
> +               v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
> +               if (ctx->ctrl_hdl.error) {
> +                       mtk_v4l2_err("Adding control %d failed %d",
> +                                       i, ctx->ctrl_hdl.error);
> +                       return ctx->ctrl_hdl.error;
> +               }
> +       }
> +
> +       v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
> +
> +       return 0;
> +}
> +
> +const struct media_device_ops mtk_vcodec_media_ops = {
> +       .req_validate   = fops_media_request_validate,
> +       .req_queue      = v4l2_m2m_request_queue,
> +};
> +
> +static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> +{
> +       struct vb2_queue *src_vq;
> +
> +       src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
> +                                V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> +
> +       /* Support request api for output plane */
> +       src_vq->supports_requests = true;
> +       src_vq->requires_requests = true;
> +}
> +
> +static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
> +{

I have to admit I do not remember exactly the reason,
but this should set the buffer field to V4L2_FIELD_NONE.
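
Something along these lines is what I have in mind. It is written as a
standalone sketch: the struct and enum definitions below are minimal
stand-ins so that it compiles outside the kernel, and
field_after_validate is only a helper for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel and uAPI definitions. */
enum v4l2_field {
	V4L2_FIELD_ANY = 0,
	V4L2_FIELD_NONE = 1,
};

struct vb2_buffer {
	unsigned int index;
};

struct vb2_v4l2_buffer {
	struct vb2_buffer vb2_buf;
	enum v4l2_field field;
};

/* Stand-in for the kernel's container_of()-based helper. */
#define to_vb2_v4l2_buffer(vb) \
	((struct vb2_v4l2_buffer *)((char *)(vb) - \
		offsetof(struct vb2_v4l2_buffer, vb2_buf)))

/* Force progressive content on OUTPUT buffers, whatever userspace sent. */
static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
{
	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);

	vbuf->field = V4L2_FIELD_NONE;
	return 0;
}

/* Example: a buffer queued with FIELD_ANY comes out as FIELD_NONE. */
static int field_after_validate(void)
{
	struct vb2_v4l2_buffer vbuf = { .field = V4L2_FIELD_ANY };

	if (vb2ops_vdec_out_buf_validate(&vbuf.vb2_buf))
		return -1;
	return vbuf.field;
}
```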

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-02-26 10:01   ` Alexandre Courbot
@ 2021-03-03 21:47     ` Ezequiel Garcia
  -1 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 21:47 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alex,

Thanks for the patch.

On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> From: Yunfei Dong <yunfei.dong@mediatek.com>
>
> Add support for H.264 decoding using the stateless API, as supported by
> MT8183. This support takes advantage of the V4L2 H.264 reference list
> builders.
>
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> [acourbot: refactor, cleanup and split]
> Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
>  drivers/media/platform/Kconfig                |   1 +
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
>  5 files changed, 813 insertions(+)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
>
> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index fd1831e97b22..c27db5643712 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
>         select V4L2_MEM2MEM_DEV
>         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
>         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> +       select V4L2_H264
>         help
>           Mediatek video codec driver provides HW capability to
>           encode and decode in a range of video formats on MT8173
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index 4ba93d838ab6..ca8e9e7a9c4e 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
>  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>                 vdec/vdec_vp8_if.o \
>                 vdec/vdec_vp9_if.o \
> +               vdec/vdec_h264_req_if.o \
>                 mtk_vcodec_dec_drv.o \
>                 vdec_drv_if.o \
>                 vdec_vpu_if.o \
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> new file mode 100644
> index 000000000000..2fbbfbbcfbec
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> @@ -0,0 +1,807 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/v4l2-h264.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include "../vdec_drv_if.h"
> +#include "../mtk_vcodec_util.h"
> +#include "../mtk_vcodec_dec.h"
> +#include "../mtk_vcodec_intr.h"
> +#include "../vdec_vpu_if.h"
> +#include "../vdec_drv_base.h"
> +
> +#define NAL_NON_IDR_SLICE                      0x01
> +#define NAL_IDR_SLICE                          0x05
> +#define NAL_H264_PPS                           0x08

Not used?

> +#define NAL_TYPE(value)                                ((value) & 0x1F)
> +

I believe you may not need the NAL type.

> +#define BUF_PREDICTION_SZ                      (64 * 4096)
> +#define MB_UNIT_LEN                            16
> +
> +/* get used parameters for sps/pps */
> +#define GET_MTK_VDEC_FLAG(cond, flag) \
> +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> +#define GET_MTK_VDEC_PARAM(param) \
> +       { dst_param->param = src_param->param; }
> +/* motion vector size (bytes) for every macro block */
> +#define HW_MB_STORE_SZ                         64
> +
> +#define H264_MAX_FB_NUM                                17
> +#define H264_MAX_MV_NUM                                32
> +#define HDR_PARSING_BUF_SZ                     1024
> +
> +/**
> + * struct mtk_h264_dpb_info  - h264 dpb information
> + * @y_dma_addr: Y bitstream physical address
> + * @c_dma_addr: CbCr bitstream physical address
> + * @reference_flag: reference picture flag (short/long term reference picture)
> + * @field: field picture flag
> + */
> +struct mtk_h264_dpb_info {
> +       dma_addr_t y_dma_addr;
> +       dma_addr_t c_dma_addr;
> +       int reference_flag;
> +       int field;
> +};
> +
> +/**
> + * struct mtk_h264_sps_param  - parameters for sps
> + */
> +struct mtk_h264_sps_param {
> +       unsigned char chroma_format_idc;
> +       unsigned char bit_depth_luma_minus8;
> +       unsigned char bit_depth_chroma_minus8;
> +       unsigned char log2_max_frame_num_minus4;
> +       unsigned char pic_order_cnt_type;
> +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> +       unsigned char max_num_ref_frames;
> +       unsigned char separate_colour_plane_flag;
> +       unsigned short pic_width_in_mbs_minus1;
> +       unsigned short pic_height_in_map_units_minus1;
> +       unsigned int max_frame_nums;
> +       unsigned char qpprime_y_zero_transform_bypass_flag;
> +       unsigned char delta_pic_order_always_zero_flag;
> +       unsigned char frame_mbs_only_flag;
> +       unsigned char mb_adaptive_frame_field_flag;
> +       unsigned char direct_8x8_inference_flag;
> +       unsigned char reserved[3];
> +};
> +
> +/**
> + * struct mtk_h264_pps_param  - parameters for pps
> + */
> +struct mtk_h264_pps_param {
> +       unsigned char num_ref_idx_l0_default_active_minus1;
> +       unsigned char num_ref_idx_l1_default_active_minus1;
> +       unsigned char weighted_bipred_idc;
> +       char pic_init_qp_minus26;
> +       char chroma_qp_index_offset;
> +       char second_chroma_qp_index_offset;
> +       unsigned char entropy_coding_mode_flag;
> +       unsigned char pic_order_present_flag;
> +       unsigned char deblocking_filter_control_present_flag;
> +       unsigned char constrained_intra_pred_flag;
> +       unsigned char weighted_pred_flag;
> +       unsigned char redundant_pic_cnt_present_flag;
> +       unsigned char transform_8x8_mode_flag;
> +       unsigned char scaling_matrix_present_flag;
> +       unsigned char reserved[2];
> +};
> +
> +struct slice_api_h264_scaling_matrix {

Isn't this identical to struct v4l2_ctrl_h264_scaling_matrix?
Although I guess you would rather not mix a hardware-specific
layout with the V4L2 API.
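
If the two structs stay separate, a compile-time check could at least catch
them drifting apart. This is only a sketch: the v4l2_ctrl_h264_scaling_matrix
definition below is a local stand-in assumed to have the same members as the
uAPI one, and kernel code would more likely use static_assert() or
BUILD_BUG_ON() than the plain C11 keyword.

```c
#include <stddef.h>

/* Stand-in assumed to match the uAPI struct; defined here for standalone builds. */
struct v4l2_ctrl_h264_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Firmware-facing layout, as declared in the patch. */
struct slice_api_h264_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Fail the build if the firmware layout stops matching the V4L2 one. */
_Static_assert(sizeof(struct slice_api_h264_scaling_matrix) ==
	       sizeof(struct v4l2_ctrl_h264_scaling_matrix),
	       "firmware and V4L2 scaling matrix sizes differ");
_Static_assert(offsetof(struct slice_api_h264_scaling_matrix, scaling_list_8x8) ==
	       offsetof(struct v4l2_ctrl_h264_scaling_matrix, scaling_list_8x8),
	       "scaling_list_8x8 offset differs");
```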

> +       unsigned char scaling_list_4x4[6][16];
> +       unsigned char scaling_list_8x8[6][64];
> +};
> +
> +struct slice_h264_dpb_entry {
> +       unsigned long long reference_ts;
> +       unsigned short frame_num;
> +       unsigned short pic_num;
> +       /* Note that field is indicated by v4l2_buffer.field */
> +       int top_field_order_cnt;
> +       int bottom_field_order_cnt;
> +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> +};
> +
> +/**
> + * struct slice_api_h264_decode_param - parameters for decode.
> + */
> +struct slice_api_h264_decode_param {
> +       struct slice_h264_dpb_entry dpb[16];

V4L2_H264_NUM_DPB_ENTRIES?

> +       unsigned short num_slices;
> +       unsigned short nal_ref_idc;
> +       unsigned char ref_pic_list_p0[32];
> +       unsigned char ref_pic_list_b0[32];
> +       unsigned char ref_pic_list_b1[32];

V4L2_H264_REF_LIST_LEN?

> +       int top_field_order_cnt;
> +       int bottom_field_order_cnt;
> +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> +};
> +
> +/**
> + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> + */
> +struct mtk_h264_dec_slice_param {
> +       struct mtk_h264_sps_param                       sps;
> +       struct mtk_h264_pps_param                       pps;
> +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> +       struct slice_api_h264_decode_param              decode_params;
> +       struct mtk_h264_dpb_info h264_dpb_info[16];

V4L2_H264_NUM_DPB_ENTRIES?

> +};
> +
> +/**
> + * struct h264_fb - h264 decode frame buffer information
> + * @vdec_fb_va  : virtual address of struct vdec_fb
> + * @y_fb_dma    : dma address of Y frame buffer (luma)
> + * @c_fb_dma    : dma address of C frame buffer (chroma)
> + * @poc         : picture order count of frame buffer
> + * @reserved    : for 8 bytes alignment
> + */
> +struct h264_fb {
> +       uint64_t vdec_fb_va;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +       int32_t poc;
> +       uint32_t reserved;
> +};
> +
> +/**
> + * struct vdec_h264_dec_info - decode information
> + * @dpb_sz             : decoding picture buffer size
> + * @resolution_changed  : resoltion change happen
> + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> + * @cap_num_planes     : number planes of capture buffer
> + * @bs_dma             : Input bit-stream buffer dma address
> + * @y_fb_dma           : Y frame buffer dma address
> + * @c_fb_dma           : C frame buffer dma address
> + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> + */
> +struct vdec_h264_dec_info {
> +       uint32_t dpb_sz;
> +       uint32_t resolution_changed;
> +       uint32_t realloc_mv_buf;
> +       uint32_t cap_num_planes;
> +       uint64_t bs_dma;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +       uint64_t vdec_fb_va;
> +};
> +
> +/**
> + * struct vdec_h264_vsi - shared memory for decode information exchange
> + *                        between VPU and Host.
> + *                        The memory is allocated by VPU then mapping to Host
> + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> + *                        by VPU.
> + *                        AP-W/R : AP is writer/reader on this item
> + *                        VPU-W/R: VPU is write/reader on this item
> + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> + * @dec          : decode information (AP-R, VPU-W)
> + * @pic          : picture information (AP-R, VPU-W)
> + * @crop         : crop information (AP-R, VPU-W)
> + */
> +struct vdec_h264_vsi {
> +       uint64_t pred_buf_dma;
> +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> +       struct vdec_h264_dec_info dec;
> +       struct vdec_pic_info pic;
> +       struct v4l2_rect crop;
> +       struct mtk_h264_dec_slice_param h264_slice_params;
> +};
> +
> +/**
> + * struct vdec_h264_slice_inst - h264 decoder instance
> + * @num_nalu : how many nalus be decoded
> + * @ctx      : point to mtk_vcodec_ctx
> + * @pred_buf : HW working predication buffer
> + * @mv_buf   : HW working motion vector buffer
> + * @vpu      : VPU instance
> + * @vsi_ctx  : Local VSI data for this decoding context
> + */
> +struct vdec_h264_slice_inst {
> +       unsigned int num_nalu;
> +       struct mtk_vcodec_ctx *ctx;
> +       struct mtk_vcodec_mem pred_buf;
> +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> +       struct vdec_vpu_inst vpu;
> +       struct vdec_h264_vsi vsi_ctx;
> +       struct mtk_h264_dec_slice_param h264_slice_param;
> +
> +       struct v4l2_h264_dpb_entry dpb[16];
> +};
> +
> +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> +                                int id)
> +{
> +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> +
> +       return ctrl->p_cur.p;
> +}
> +
> +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> +                             struct mtk_h264_dec_slice_param *slice_param)
> +{
> +       struct vb2_queue *vq;
> +       struct vb2_buffer *vb;
> +       struct vb2_v4l2_buffer *vb2_v4l2;
> +       u64 index;
> +
> +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> +
> +       for (index = 0; index < 16; index++) {

Ditto, some macro instead of 16.
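
To illustrate the macro suggestions, here is a rough standalone sketch. The
two macros are redefined locally with the values the V4L2 uAPI headers use
(16 and 2 * 16), the struct is a trimmed-down, hypothetical version of the
one in the patch, and the bounds check in fixup_ref_list() is an addition
that the patch does not have.

```c
#include <stddef.h>
#include <string.h>

/* Values mirror the V4L2 uAPI; redefined only so this builds standalone. */
#define V4L2_H264_NUM_DPB_ENTRIES	16
#define V4L2_H264_REF_LIST_LEN		(2 * V4L2_H264_NUM_DPB_ENTRIES)

/* Trimmed-down, hypothetical version of the patch's decode-param struct. */
struct slice_api_h264_decode_param {
	unsigned int dpb_flags[V4L2_H264_NUM_DPB_ENTRIES];
	unsigned char ref_pic_list_p0[V4L2_H264_REF_LIST_LEN];
	unsigned char ref_pic_list_b0[V4L2_H264_REF_LIST_LEN];
	unsigned char ref_pic_list_b1[V4L2_H264_REF_LIST_LEN];
};

/* Per the patch, the firmware expects unused reflist entries to be 0x20. */
static void fixup_ref_list(unsigned char *ref_list, size_t num_valid)
{
	/* Extra guard (not in the patch) against an out-of-range count. */
	if (num_valid > V4L2_H264_REF_LIST_LEN)
		return;

	memset(&ref_list[num_valid], 0x20, V4L2_H264_REF_LIST_LEN - num_valid);
}
```

The loop in get_h264_dpb_list() could use V4L2_H264_NUM_DPB_ENTRIES as its
bound in the same way.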

> +               const struct slice_h264_dpb_entry *dpb;
> +               int vb2_index;
> +
> +               dpb = &slice_param->decode_params.dpb[index];
> +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> +                       continue;
> +               }
> +
> +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> +               if (vb2_index < 0) {
> +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> +                               index, dpb->reference_ts);
> +                       continue;
> +               }
> +               /* 1 for short term reference, 2 for long term reference */
> +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> +               else
> +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> +
> +               vb = vq->bufs[vb2_index];
> +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> +
> +               slice_param->h264_dpb_info[index].y_dma_addr =
> +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> +                       slice_param->h264_dpb_info[index].c_dma_addr =
> +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> +               }
> +       }
> +}
> +
> +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> +       const struct v4l2_ctrl_h264_sps *src_param)
> +{
> +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> +
> +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> +}
> +
> +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> +       const struct v4l2_ctrl_h264_pps *src_param)
> +{
> +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> +
> +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> +}
> +
> +static void
> +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> +{
> +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> +              sizeof(dst_matrix->scaling_list_4x4));
> +
> +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> +              sizeof(dst_matrix->scaling_list_8x8));
> +}
> +
> +static void get_h264_decode_parameters(
> +       struct slice_api_h264_decode_param *dst_params,
> +       const struct v4l2_ctrl_h264_decode_params *src_params,
> +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> +
> +               dst_entry->reference_ts = src_entry->reference_ts;
> +               dst_entry->frame_num = src_entry->frame_num;
> +               dst_entry->pic_num = src_entry->pic_num;
> +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> +               dst_entry->bottom_field_order_cnt =
> +                       src_entry->bottom_field_order_cnt;
> +               dst_entry->flags = src_entry->flags;
> +       }
> +
> +       // num_slices is a leftover from the old H.264 support and is ignored
> +       // by the firmware.
> +       dst_params->num_slices = 0;
> +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> +       dst_params->flags = src_params->flags;
> +}
> +
> +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> +                           const struct v4l2_h264_dpb_entry *b)
> +{
> +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> +}
> +
> +/*
> + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> + * into the already existing slot in dpb, and move other entries into new slots.
> + *
> + * This function is an adaptation of the similarly-named function in
> + * hantro_h264.c.
> + */
> +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> +                      struct v4l2_h264_dpb_entry *dpb)
> +{
> +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       unsigned int i, j;
> +
> +       /* Disable all entries by default, and mark the ones in use. */
> +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> +                       set_bit(i, in_use);
> +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> +       }
> +
> +       /* Try to match new DPB entries with existing ones by their POCs. */
> +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +
> +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> +                       continue;
> +
> +               /*
> +                * To cut off some comparisons, iterate only on target DPB
> +                * entries were already used.
> +                */
> +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> +                       struct v4l2_h264_dpb_entry *cdpb;
> +
> +                       cdpb = &dpb[j];
> +                       if (!dpb_entry_match(cdpb, ndpb))
> +                               continue;
> +
> +                       *cdpb = *ndpb;
> +                       set_bit(j, used);
> +                       /* Don't reiterate on this one. */
> +                       clear_bit(j, in_use);
> +                       break;
> +               }
> +
> +               if (j == ARRAY_SIZE(dec_param->dpb))
> +                       set_bit(i, new);
> +       }
> +
> +       /* For entries that could not be matched, use remaining free slots. */
> +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +               struct v4l2_h264_dpb_entry *cdpb;
> +
> +               /*
> +                * Both arrays are of the same sizes, so there is no way
> +                * we can end up with no space in target array, unless
> +                * something is buggy.
> +                */
> +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> +                       return;
> +
> +               cdpb = &dpb[j];
> +               *cdpb = *ndpb;
> +               set_bit(j, used);
> +       }
> +}
> +
> +/*
> + * The firmware expects unused reflist entries to have the value 0x20.
> + */
> +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> +{
> +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> +}
> +
> +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> +{
> +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> +       const struct v4l2_ctrl_h264_sps *sps =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> +       const struct v4l2_ctrl_h264_pps *pps =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> +       struct v4l2_h264_reflist_builder reflist_builder;
> +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> +       int i;
> +
> +       update_dpb(dec_params, inst->dpb);
> +
> +       get_h264_sps_parameters(&slice_param->sps, sps);
> +       get_h264_pps_parameters(&slice_param->pps, pps);
> +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> +                                  inst->dpb);
> +       get_h264_dpb_list(inst, slice_param);
> +
> +       /* Prepare the fields for our reference lists */
> +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> +       /* Build the reference lists */
> +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> +                                      inst->dpb);
> +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> +       /* Adapt the built lists to the firmware's expectations */
> +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> +
> +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> +              sizeof(inst->vsi_ctx.h264_slice_params));
> +}
> +
> +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> +{
> +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> +
> +       return HW_MB_STORE_SZ * unit_size;
> +}
> +
> +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       int err = 0;
> +
> +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> +       if (err) {
> +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> +               return err;
> +       }
> +
> +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> +       return 0;
> +}
> +
> +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       struct mtk_vcodec_mem *mem = NULL;
> +
> +       mtk_vcodec_debug_enter(inst);
> +
> +       inst->vsi_ctx.pred_buf_dma = 0;
> +       mem = &inst->pred_buf;
> +       if (mem->va)
> +               mtk_vcodec_mem_free(inst->ctx, mem);
> +}
> +
> +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> +       struct vdec_pic_info *pic)
> +{
> +       int i;
> +       int err;
> +       struct mtk_vcodec_mem *mem = NULL;
> +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> +
> +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +               mem = &inst->mv_buf[i];
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(inst->ctx, mem);
> +               mem->size = buf_sz;
> +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +               if (err) {
> +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> +                       return err;
> +               }
> +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> +       }
> +
> +       return 0;
> +}
> +
> +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       int i;
> +       struct mtk_vcodec_mem *mem = NULL;
> +
> +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> +               mem = &inst->mv_buf[i];
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(inst->ctx, mem);
> +       }
> +}
> +
> +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> +                        struct vdec_pic_info *pic)
> +{
> +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> +
> +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> +       inst->vsi_ctx.dec.cap_num_planes =
> +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> +
> +       pic = &ctx->picinfo;
> +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> +               ctx->picinfo.fb_sz[1]);
> +
> +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> +               inst->vsi_ctx.dec.resolution_changed = true;
> +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> +
> +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> +                       inst->vsi_ctx.dec.resolution_changed,
> +                       inst->vsi_ctx.dec.realloc_mv_buf,
> +                       ctx->last_decoded_picinfo.pic_w,
> +                       ctx->last_decoded_picinfo.pic_h,
> +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> +       }
> +}
> +
> +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> +       struct v4l2_rect *cr)
> +{
> +       cr->left = inst->vsi_ctx.crop.left;
> +       cr->top = inst->vsi_ctx.crop.top;
> +       cr->width = inst->vsi_ctx.crop.width;
> +       cr->height = inst->vsi_ctx.crop.height;
> +
> +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> +                        cr->left, cr->top, cr->width, cr->height);
> +}
> +
> +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> +       unsigned int *dpb_sz)
> +{
> +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> +}
> +
> +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> +{
> +       struct vdec_h264_slice_inst *inst = NULL;
> +       int err;
> +
> +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> +       if (!inst)
> +               return -ENOMEM;
> +
> +       inst->ctx = ctx;
> +
> +       inst->vpu.id = SCP_IPI_VDEC_H264;
> +       inst->vpu.ctx = ctx;
> +
> +       err = vpu_dec_init(&inst->vpu);
> +       if (err) {
> +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> +               goto error_free_inst;
> +       }
> +
> +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> +       inst->vsi_ctx.dec.resolution_changed = true;
> +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> +
> +       err = allocate_predication_buf(inst);
> +       if (err)
> +               goto error_deinit;
> +
> +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> +               sizeof(struct mtk_h264_sps_param),
> +               sizeof(struct mtk_h264_pps_param),
> +               sizeof(struct mtk_h264_dec_slice_param),
> +               sizeof(struct mtk_h264_dpb_info));
> +
> +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> +
> +       ctx->drv_handle = inst;
> +       return 0;
> +
> +error_deinit:
> +       vpu_dec_deinit(&inst->vpu);
> +
> +error_free_inst:
> +       kfree(inst);
> +       return err;
> +}
> +
> +static void vdec_h264_slice_deinit(void *h_vdec)
> +{
> +       struct vdec_h264_slice_inst *inst =
> +               (struct vdec_h264_slice_inst *)h_vdec;
> +
> +       mtk_vcodec_debug_enter(inst);
> +
> +       vpu_dec_deinit(&inst->vpu);
> +       free_predication_buf(inst);
> +       free_mv_buf(inst);
> +
> +       kfree(inst);
> +}
> +
> +static int find_start_code(unsigned char *data, unsigned int data_sz)
> +{
> +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> +               return 3;
> +
> +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> +           data[3] == 1)
> +               return 4;
> +
> +       return -1;
> +}
> +
> +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> +                                 struct vdec_fb *fb, bool *res_chg)
> +{
> +       struct vdec_h264_slice_inst *inst =
> +               (struct vdec_h264_slice_inst *)h_vdec;
> +       struct vdec_vpu_inst *vpu = &inst->vpu;
> +       struct mtk_video_dec_buf *src_buf_info;
> +       int nal_start_idx = 0, err = 0;
> +       uint32_t nal_type, data[2];
> +       unsigned char *buf;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +
> +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> +
> +       /* bs NULL means flush decoder */
> +       if (bs == NULL)
> +               return vpu_dec_reset(vpu);
> +
> +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +
> +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> +
> +       buf = (unsigned char *)bs->va;

I could be completely wrong, but this seems to be
where the CPU mapping is used.

> +       nal_start_idx = find_start_code(buf, bs->size);
> +       if (nal_start_idx < 0)
> +               goto err_free_fb_out;
> +
> +       data[0] = bs->size;
> +       data[1] = buf[nal_start_idx];
> +       nal_type = NAL_TYPE(buf[nal_start_idx]);

This seems to be used to parse the NAL type. But shouldn't
you expect only VCL NALUs here?

I.e. you only get IDR or non-IDR frames, marked with
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.

> +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> +                        nal_type);
> +
> +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> +
> +       get_vdec_decode_parameters(inst);
> +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> +       if (*res_chg) {
> +               mtk_vcodec_debug(inst, "- resolution changed -");
> +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> +                       if (err)
> +                               goto err_free_fb_out;
> +               }
> +               *res_chg = false;
> +       }
> +
> +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> +       err = vpu_dec_start(vpu, data, 2);

Then it seems these two values (the bitstream size and the NAL
header byte) are passed to the firmware. Maybe you could test
whether they can be derived without the CPU mapping.
That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
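
To make the idea concrete, here is a minimal sketch, assuming the
firmware only needs the first NAL header byte and that every OUTPUT
buffer carries a VCL NALU. The helper name mtk_h264_nal_header() is
hypothetical, and the local flag define merely stands in for
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC. The header byte is laid out as
forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC (assumption). */
#define DECODE_PARAM_FLAG_IDR_PIC	0x01

#define NAL_NON_IDR_SLICE		0x01
#define NAL_IDR_SLICE			0x05

/*
 * Hypothetical helper: rebuild the one-byte NAL header from the decode
 * parameters instead of reading it back from the bitstream buffer.
 * Layout: forbidden_zero_bit(1) | nal_ref_idc(2) | nal_unit_type(5).
 */
static uint8_t mtk_h264_nal_header(uint16_t nal_ref_idc, uint32_t flags)
{
	uint8_t nal_type = (flags & DECODE_PARAM_FLAG_IDR_PIC) ?
			   NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	/* nal_ref_idc only has two bits in the header. */
	assert(nal_ref_idc <= 3);

	return (uint8_t)((nal_ref_idc << 5) | nal_type);
}
```

With something like this, data[1] could be filled from
dec_params->nal_ref_idc and dec_params->flags rather than from
buf[nal_start_idx]. Whether the firmware accepts the rebuilt byte in
place of the one read from the buffer would need testing.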

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-03 21:47     ` Ezequiel Garcia
  0 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 21:47 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alex,

Thanks for the patch.

On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> From: Yunfei Dong <yunfei.dong@mediatek.com>
>
> Add support for H.264 decoding using the stateless API, as supported by
> MT8183. This support takes advantage of the V4L2 H.264 reference list
> builders.
>
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> [acourbot: refactor, cleanup and split]
> Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> ---
>  drivers/media/platform/Kconfig                |   1 +
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
>  5 files changed, 813 insertions(+)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
>
> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index fd1831e97b22..c27db5643712 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
>         select V4L2_MEM2MEM_DEV
>         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
>         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> +       select V4L2_H264
>         help
>           Mediatek video codec driver provides HW capability to
>           encode and decode in a range of video formats on MT8173
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index 4ba93d838ab6..ca8e9e7a9c4e 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
>  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>                 vdec/vdec_vp8_if.o \
>                 vdec/vdec_vp9_if.o \
> +               vdec/vdec_h264_req_if.o \
>                 mtk_vcodec_dec_drv.o \
>                 vdec_drv_if.o \
>                 vdec_vpu_if.o \
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> new file mode 100644
> index 000000000000..2fbbfbbcfbec
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> @@ -0,0 +1,807 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/v4l2-h264.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include "../vdec_drv_if.h"
> +#include "../mtk_vcodec_util.h"
> +#include "../mtk_vcodec_dec.h"
> +#include "../mtk_vcodec_intr.h"
> +#include "../vdec_vpu_if.h"
> +#include "../vdec_drv_base.h"
> +
> +#define NAL_NON_IDR_SLICE                      0x01
> +#define NAL_IDR_SLICE                          0x05
> +#define NAL_H264_PPS                           0x08

Not used?

> +#define NAL_TYPE(value)                                ((value) & 0x1F)
> +

I believe you may not need the NAL type.

> +#define BUF_PREDICTION_SZ                      (64 * 4096)
> +#define MB_UNIT_LEN                            16
> +
> +/* get used parameters for sps/pps */
> +#define GET_MTK_VDEC_FLAG(cond, flag) \
> +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> +#define GET_MTK_VDEC_PARAM(param) \
> +       { dst_param->param = src_param->param; }
> +/* motion vector size (bytes) for every macro block */
> +#define HW_MB_STORE_SZ                         64
> +
> +#define H264_MAX_FB_NUM                                17
> +#define H264_MAX_MV_NUM                                32
> +#define HDR_PARSING_BUF_SZ                     1024
> +
> +/**
> + * struct mtk_h264_dpb_info  - h264 dpb information
> + * @y_dma_addr: Y bitstream physical address
> + * @c_dma_addr: CbCr bitstream physical address
> + * @reference_flag: reference picture flag (short/long term reference picture)
> + * @field: field picture flag
> + */
> +struct mtk_h264_dpb_info {
> +       dma_addr_t y_dma_addr;
> +       dma_addr_t c_dma_addr;
> +       int reference_flag;
> +       int field;
> +};
> +
> +/**
> + * struct mtk_h264_sps_param  - parameters for sps
> + */
> +struct mtk_h264_sps_param {
> +       unsigned char chroma_format_idc;
> +       unsigned char bit_depth_luma_minus8;
> +       unsigned char bit_depth_chroma_minus8;
> +       unsigned char log2_max_frame_num_minus4;
> +       unsigned char pic_order_cnt_type;
> +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> +       unsigned char max_num_ref_frames;
> +       unsigned char separate_colour_plane_flag;
> +       unsigned short pic_width_in_mbs_minus1;
> +       unsigned short pic_height_in_map_units_minus1;
> +       unsigned int max_frame_nums;
> +       unsigned char qpprime_y_zero_transform_bypass_flag;
> +       unsigned char delta_pic_order_always_zero_flag;
> +       unsigned char frame_mbs_only_flag;
> +       unsigned char mb_adaptive_frame_field_flag;
> +       unsigned char direct_8x8_inference_flag;
> +       unsigned char reserved[3];
> +};
> +
> +/**
> + * struct mtk_h264_pps_param  - parameters for pps
> + */
> +struct mtk_h264_pps_param {
> +       unsigned char num_ref_idx_l0_default_active_minus1;
> +       unsigned char num_ref_idx_l1_default_active_minus1;
> +       unsigned char weighted_bipred_idc;
> +       char pic_init_qp_minus26;
> +       char chroma_qp_index_offset;
> +       char second_chroma_qp_index_offset;
> +       unsigned char entropy_coding_mode_flag;
> +       unsigned char pic_order_present_flag;
> +       unsigned char deblocking_filter_control_present_flag;
> +       unsigned char constrained_intra_pred_flag;
> +       unsigned char weighted_pred_flag;
> +       unsigned char redundant_pic_cnt_present_flag;
> +       unsigned char transform_8x8_mode_flag;
> +       unsigned char scaling_matrix_present_flag;
> +       unsigned char reserved[2];
> +};
> +
> +struct slice_api_h264_scaling_matrix {

Isn't this identical to v4l2_ctrl_h264_scaling_matrix?
Though I guess you may not want to mix a hardware-specific
structure with the V4L2 API.

> +       unsigned char scaling_list_4x4[6][16];
> +       unsigned char scaling_list_8x8[6][64];
> +};
> +
> +struct slice_h264_dpb_entry {
> +       unsigned long long reference_ts;
> +       unsigned short frame_num;
> +       unsigned short pic_num;
> +       /* Note that field is indicated by v4l2_buffer.field */
> +       int top_field_order_cnt;
> +       int bottom_field_order_cnt;
> +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> +};
> +
> +/**
> + * struct slice_api_h264_decode_param - parameters for decode.
> + */
> +struct slice_api_h264_decode_param {
> +       struct slice_h264_dpb_entry dpb[16];

V4L2_H264_NUM_DPB_ENTRIES?

> +       unsigned short num_slices;
> +       unsigned short nal_ref_idc;
> +       unsigned char ref_pic_list_p0[32];
> +       unsigned char ref_pic_list_b0[32];
> +       unsigned char ref_pic_list_b1[32];

V4L2_H264_REF_LIST_LEN?
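
The same constant would also help the helper that pads the lists for
the firmware. A rough sketch, assuming the 0x20 filler value used by
the patch; the local defines only stand in for
V4L2_H264_NUM_DPB_ENTRIES and V4L2_H264_REF_LIST_LEN, and
REF_LIST_UNUSED is a made-up name:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the V4L2 uAPI constants (assumption for this sketch). */
#define NUM_DPB_ENTRIES		16
#define REF_LIST_LEN		(2 * NUM_DPB_ENTRIES)

/* Hypothetical name for the filler the patch writes into unused slots. */
#define REF_LIST_UNUSED		0x20

/* Pad the tail of a reference list, refusing out-of-range counts. */
static void fixup_ref_list(uint8_t *ref_list, size_t num_valid)
{
	if (num_valid >= REF_LIST_LEN)
		return;

	memset(&ref_list[num_valid], REF_LIST_UNUSED,
	       REF_LIST_LEN - num_valid);
}
```

The bounds check is an addition of this sketch; it keeps the
subtraction from wrapping around if num_valid ever exceeds the list
length.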

> +       int top_field_order_cnt;
> +       int bottom_field_order_cnt;
> +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> +};
> +
> +/**
> + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> + */
> +struct mtk_h264_dec_slice_param {
> +       struct mtk_h264_sps_param                       sps;
> +       struct mtk_h264_pps_param                       pps;
> +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> +       struct slice_api_h264_decode_param              decode_params;
> +       struct mtk_h264_dpb_info h264_dpb_info[16];

V4L2_H264_NUM_DPB_ENTRIES?

> +};
> +
> +/**
> + * struct h264_fb - h264 decode frame buffer information
> + * @vdec_fb_va  : virtual address of struct vdec_fb
> + * @y_fb_dma    : dma address of Y frame buffer (luma)
> + * @c_fb_dma    : dma address of C frame buffer (chroma)
> + * @poc         : picture order count of frame buffer
> + * @reserved    : for 8 bytes alignment
> + */
> +struct h264_fb {
> +       uint64_t vdec_fb_va;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +       int32_t poc;
> +       uint32_t reserved;
> +};
> +
> +/**
> + * struct vdec_h264_dec_info - decode information
> + * @dpb_sz             : decoding picture buffer size
> + * @resolution_changed  : resoltion change happen
> + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> + * @cap_num_planes     : number planes of capture buffer
> + * @bs_dma             : Input bit-stream buffer dma address
> + * @y_fb_dma           : Y frame buffer dma address
> + * @c_fb_dma           : C frame buffer dma address
> + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> + */
> +struct vdec_h264_dec_info {
> +       uint32_t dpb_sz;
> +       uint32_t resolution_changed;
> +       uint32_t realloc_mv_buf;
> +       uint32_t cap_num_planes;
> +       uint64_t bs_dma;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +       uint64_t vdec_fb_va;
> +};
> +
> +/**
> + * struct vdec_h264_vsi - shared memory for decode information exchange
> + *                        between VPU and Host.
> + *                        The memory is allocated by VPU then mapping to Host
> + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> + *                        by VPU.
> + *                        AP-W/R : AP is writer/reader on this item
> + *                        VPU-W/R: VPU is write/reader on this item
> + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> + * @dec          : decode information (AP-R, VPU-W)
> + * @pic          : picture information (AP-R, VPU-W)
> + * @crop         : crop information (AP-R, VPU-W)
> + */
> +struct vdec_h264_vsi {
> +       uint64_t pred_buf_dma;
> +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> +       struct vdec_h264_dec_info dec;
> +       struct vdec_pic_info pic;
> +       struct v4l2_rect crop;
> +       struct mtk_h264_dec_slice_param h264_slice_params;
> +};
> +
> +/**
> + * struct vdec_h264_slice_inst - h264 decoder instance
> + * @num_nalu : how many nalus be decoded
> + * @ctx      : point to mtk_vcodec_ctx
> + * @pred_buf : HW working predication buffer
> + * @mv_buf   : HW working motion vector buffer
> + * @vpu      : VPU instance
> + * @vsi_ctx  : Local VSI data for this decoding context
> + */
> +struct vdec_h264_slice_inst {
> +       unsigned int num_nalu;
> +       struct mtk_vcodec_ctx *ctx;
> +       struct mtk_vcodec_mem pred_buf;
> +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> +       struct vdec_vpu_inst vpu;
> +       struct vdec_h264_vsi vsi_ctx;
> +       struct mtk_h264_dec_slice_param h264_slice_param;
> +
> +       struct v4l2_h264_dpb_entry dpb[16];
> +};
> +
> +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> +                                int id)
> +{
> +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> +
> +       return ctrl->p_cur.p;
> +}
> +
> +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> +                             struct mtk_h264_dec_slice_param *slice_param)
> +{
> +       struct vb2_queue *vq;
> +       struct vb2_buffer *vb;
> +       struct vb2_v4l2_buffer *vb2_v4l2;
> +       u64 index;
> +
> +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> +
> +       for (index = 0; index < 16; index++) {

Ditto, some macro instead of 16.
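
Something along these lines, as a sketch only. The struct and function
names below are simplified, hypothetical stand-ins for the driver's
types, and the defines stand in for V4L2_H264_NUM_DPB_ENTRIES and
V4L2_H264_DPB_ENTRY_FLAG_ACTIVE. The point is that the array size and
the loop bound come from one place:

```c
#include <stddef.h>

/* Stand-ins for the V4L2 uAPI definitions (assumption for this sketch). */
#define NUM_DPB_ENTRIES		16
#define DPB_ENTRY_FLAG_ACTIVE	0x02

#define ARRAY_SIZE(a)		(sizeof(a) / sizeof((a)[0]))

/* Simplified, hypothetical versions of the driver structures. */
struct dpb_entry {
	unsigned int flags;
};

struct dpb_info {
	int reference_flag;
};

struct slice_param {
	struct dpb_entry dpb[NUM_DPB_ENTRIES];
	struct dpb_info info[NUM_DPB_ENTRIES];
};

/*
 * Clear reference_flag for inactive entries and return the number of
 * active ones. The loop bound follows the array declaration.
 */
static size_t mark_inactive(struct slice_param *p)
{
	size_t i, active = 0;

	for (i = 0; i < ARRAY_SIZE(p->dpb); i++) {
		if (!(p->dpb[i].flags & DPB_ENTRY_FLAG_ACTIVE)) {
			p->info[i].reference_flag = 0;
			continue;
		}
		active++;
	}

	return active;
}
```

get_h264_decode_parameters() in this patch already iterates with
ARRAY_SIZE(dst_params->dpb), so this would also make the two loops
consistent.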

> +               const struct slice_h264_dpb_entry *dpb;
> +               int vb2_index;
> +
> +               dpb = &slice_param->decode_params.dpb[index];
> +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> +                       continue;
> +               }
> +
> +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> +               if (vb2_index < 0) {
> +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> +                               index, dpb->reference_ts);
> +                       continue;
> +               }
> +               /* 1 for short term reference, 2 for long term reference */
> +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> +               else
> +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> +
> +               vb = vq->bufs[vb2_index];
> +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> +
> +               slice_param->h264_dpb_info[index].y_dma_addr =
> +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> +                       slice_param->h264_dpb_info[index].c_dma_addr =
> +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> +               }
> +       }
> +}
> +
> +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> +       const struct v4l2_ctrl_h264_sps *src_param)
> +{
> +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> +
> +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> +}
> +
> +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> +       const struct v4l2_ctrl_h264_pps *src_param)
> +{
> +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> +
> +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> +}
> +
> +static void
> +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> +{
> +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> +              sizeof(dst_matrix->scaling_list_4x4));
> +
> +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> +              sizeof(dst_matrix->scaling_list_8x8));
> +}
> +
> +static void get_h264_decode_parameters(
> +       struct slice_api_h264_decode_param *dst_params,
> +       const struct v4l2_ctrl_h264_decode_params *src_params,
> +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> +
> +               dst_entry->reference_ts = src_entry->reference_ts;
> +               dst_entry->frame_num = src_entry->frame_num;
> +               dst_entry->pic_num = src_entry->pic_num;
> +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> +               dst_entry->bottom_field_order_cnt =
> +                       src_entry->bottom_field_order_cnt;
> +               dst_entry->flags = src_entry->flags;
> +       }
> +
> +       // num_slices is a leftover from the old H.264 support and is ignored
> +       // by the firmware.
> +       dst_params->num_slices = 0;
> +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> +       dst_params->flags = src_params->flags;
> +}
> +
> +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> +                           const struct v4l2_h264_dpb_entry *b)
> +{
> +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> +}
> +
> +/*
> + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> + * into the already existing slot in dpb, and move other entries into new slots.
> + *
> + * This function is an adaptation of the similarly-named function in
> + * hantro_h264.c.
> + */
> +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> +                      struct v4l2_h264_dpb_entry *dpb)
> +{
> +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +       unsigned int i, j;
> +
> +       /* Disable all entries by default, and mark the ones in use. */
> +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> +                       set_bit(i, in_use);
> +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> +       }
> +
> +       /* Try to match new DPB entries with existing ones by their POCs. */
> +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +
> +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> +                       continue;
> +
> +               /*
> +                * To cut off some comparisons, iterate only on target DPB
> +                * entries were already used.
> +                */
> +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> +                       struct v4l2_h264_dpb_entry *cdpb;
> +
> +                       cdpb = &dpb[j];
> +                       if (!dpb_entry_match(cdpb, ndpb))
> +                               continue;
> +
> +                       *cdpb = *ndpb;
> +                       set_bit(j, used);
> +                       /* Don't reiterate on this one. */
> +                       clear_bit(j, in_use);
> +                       break;
> +               }
> +
> +               if (j == ARRAY_SIZE(dec_param->dpb))
> +                       set_bit(i, new);
> +       }
> +
> +       /* For entries that could not be matched, use remaining free slots. */
> +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +               struct v4l2_h264_dpb_entry *cdpb;
> +
> +               /*
> +                * Both arrays are of the same sizes, so there is no way
> +                * we can end up with no space in target array, unless
> +                * something is buggy.
> +                */
> +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> +                       return;
> +
> +               cdpb = &dpb[j];
> +               *cdpb = *ndpb;
> +               set_bit(j, used);
> +       }
> +}
> +
> +/*
> + * The firmware expects unused reflist entries to have the value 0x20.
> + */
> +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> +{
> +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> +}
> +
> +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> +{
> +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> +       const struct v4l2_ctrl_h264_sps *sps =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> +       const struct v4l2_ctrl_h264_pps *pps =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> +       struct v4l2_h264_reflist_builder reflist_builder;
> +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> +       int i;
> +
> +       update_dpb(dec_params, inst->dpb);
> +
> +       get_h264_sps_parameters(&slice_param->sps, sps);
> +       get_h264_pps_parameters(&slice_param->pps, pps);
> +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> +                                  inst->dpb);
> +       get_h264_dpb_list(inst, slice_param);
> +
> +       /* Prepare the fields for our reference lists */
> +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> +       /* Build the reference lists */
> +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> +                                      inst->dpb);
> +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> +       /* Adapt the built lists to the firmware's expectations */
> +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> +
> +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> +              sizeof(inst->vsi_ctx.h264_slice_params));
> +}
> +
> +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> +{
> +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> +
> +       return HW_MB_STORE_SZ * unit_size;
> +}
> +
> +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       int err = 0;
> +
> +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> +       if (err) {
> +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> +               return err;
> +       }
> +
> +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> +       return 0;
> +}
> +
> +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       struct mtk_vcodec_mem *mem = NULL;
> +
> +       mtk_vcodec_debug_enter(inst);
> +
> +       inst->vsi_ctx.pred_buf_dma = 0;
> +       mem = &inst->pred_buf;
> +       if (mem->va)
> +               mtk_vcodec_mem_free(inst->ctx, mem);
> +}
> +
> +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> +       struct vdec_pic_info *pic)
> +{
> +       int i;
> +       int err;
> +       struct mtk_vcodec_mem *mem = NULL;
> +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> +
> +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +               mem = &inst->mv_buf[i];
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(inst->ctx, mem);
> +               mem->size = buf_sz;
> +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +               if (err) {
> +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> +                       return err;
> +               }
> +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> +       }
> +
> +       return 0;
> +}
> +
> +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> +{
> +       int i;
> +       struct mtk_vcodec_mem *mem = NULL;
> +
> +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> +               mem = &inst->mv_buf[i];
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(inst->ctx, mem);
> +       }
> +}
> +
> +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> +                        struct vdec_pic_info *pic)
> +{
> +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> +
> +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> +       inst->vsi_ctx.dec.cap_num_planes =
> +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> +
> +       pic = &ctx->picinfo;
> +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> +               ctx->picinfo.fb_sz[1]);
> +
> +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> +               inst->vsi_ctx.dec.resolution_changed = true;
> +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> +
> +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> +                       inst->vsi_ctx.dec.resolution_changed,
> +                       inst->vsi_ctx.dec.realloc_mv_buf,
> +                       ctx->last_decoded_picinfo.pic_w,
> +                       ctx->last_decoded_picinfo.pic_h,
> +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> +       }
> +}
> +
> +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> +       struct v4l2_rect *cr)
> +{
> +       cr->left = inst->vsi_ctx.crop.left;
> +       cr->top = inst->vsi_ctx.crop.top;
> +       cr->width = inst->vsi_ctx.crop.width;
> +       cr->height = inst->vsi_ctx.crop.height;
> +
> +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> +                        cr->left, cr->top, cr->width, cr->height);
> +}
> +
> +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> +       unsigned int *dpb_sz)
> +{
> +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> +}
> +
> +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> +{
> +       struct vdec_h264_slice_inst *inst = NULL;
> +       int err;
> +
> +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> +       if (!inst)
> +               return -ENOMEM;
> +
> +       inst->ctx = ctx;
> +
> +       inst->vpu.id = SCP_IPI_VDEC_H264;
> +       inst->vpu.ctx = ctx;
> +
> +       err = vpu_dec_init(&inst->vpu);
> +       if (err) {
> +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> +               goto error_free_inst;
> +       }
> +
> +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> +       inst->vsi_ctx.dec.resolution_changed = true;
> +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> +
> +       err = allocate_predication_buf(inst);
> +       if (err)
> +               goto error_deinit;
> +
> +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> +               sizeof(struct mtk_h264_sps_param),
> +               sizeof(struct mtk_h264_pps_param),
> +               sizeof(struct mtk_h264_dec_slice_param),
> +               sizeof(struct mtk_h264_dpb_info));
> +
> +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> +
> +       ctx->drv_handle = inst;
> +       return 0;
> +
> +error_deinit:
> +       vpu_dec_deinit(&inst->vpu);
> +
> +error_free_inst:
> +       kfree(inst);
> +       return err;
> +}
> +
> +static void vdec_h264_slice_deinit(void *h_vdec)
> +{
> +       struct vdec_h264_slice_inst *inst =
> +               (struct vdec_h264_slice_inst *)h_vdec;
> +
> +       mtk_vcodec_debug_enter(inst);
> +
> +       vpu_dec_deinit(&inst->vpu);
> +       free_predication_buf(inst);
> +       free_mv_buf(inst);
> +
> +       kfree(inst);
> +}
> +
> +static int find_start_code(unsigned char *data, unsigned int data_sz)
> +{
> +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> +               return 3;
> +
> +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> +           data[3] == 1)
> +               return 4;
> +
> +       return -1;
> +}
> +
> +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> +                                 struct vdec_fb *fb, bool *res_chg)
> +{
> +       struct vdec_h264_slice_inst *inst =
> +               (struct vdec_h264_slice_inst *)h_vdec;
> +       struct vdec_vpu_inst *vpu = &inst->vpu;
> +       struct mtk_video_dec_buf *src_buf_info;
> +       int nal_start_idx = 0, err = 0;
> +       uint32_t nal_type, data[2];
> +       unsigned char *buf;
> +       uint64_t y_fb_dma;
> +       uint64_t c_fb_dma;
> +
> +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> +
> +       /* bs NULL means flush decoder */
> +       if (bs == NULL)
> +               return vpu_dec_reset(vpu);
> +
> +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +
> +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> +
> +       buf = (unsigned char *)bs->va;

I may be completely wrong, but this seems to be
where the CPU mapping is used.

> +       nal_start_idx = find_start_code(buf, bs->size);
> +       if (nal_start_idx < 0)
> +               goto err_free_fb_out;
> +
> +       data[0] = bs->size;
> +       data[1] = buf[nal_start_idx];
> +       nal_type = NAL_TYPE(buf[nal_start_idx]);

Which seems to be used to parse the NAL type. But shouldn't
you expect only VCL NALUs here?

I.e. you only get IDR or non-IDR frames, marked with
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.

> +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> +                        nal_type);
> +
> +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> +
> +       get_vdec_decode_parameters(inst);
> +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> +       if (*res_chg) {
> +               mtk_vcodec_debug(inst, "- resolution changed -");
> +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> +                       if (err)
> +                               goto err_free_fb_out;
> +               }
> +               *res_chg = false;
> +       }
> +
> +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> +       err = vpu_dec_start(vpu, data, 2);

Then it seems these two values (the bitstream size and the first NAL
header byte) are passed to the firmware. Maybe you could test whether
they can be derived without the CPU mapping.
That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
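
To make the point concrete, here is a minimal user-space sketch of what
the CPU mapping is used for in the quoted code: finding the Annex B start
code and reading nal_unit_type from the byte that follows it. The start
code test mirrors find_start_code() from the patch; the helper name
first_nal_type() and the macro NAL_UNIT_TYPE() are made up for this
illustration and are not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* nal_unit_type is the low five bits of the NAL header byte (H.264 7.3.1). */
#define NAL_UNIT_TYPE(byte) ((byte) & 0x1f)

/*
 * Same test as find_start_code() in the patch: return the length of the
 * Annex B start code at the beginning of data, or -1 if there is none.
 */
static int start_code_len(const uint8_t *data, size_t size)
{
	if (size > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (size > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/* Hypothetical helper: type of the first NAL unit, or -1 if not found. */
static int first_nal_type(const uint8_t *data, size_t size)
{
	int len = start_code_len(data, size);

	if (len < 0)
		return -1;

	return NAL_UNIT_TYPE(data[len]);
}
```

For slice data the result is either 5 (IDR slice) or 1 (non-IDR slice),
which is the same information V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC already
carries, so in principle no read of the bitstream is needed for it.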

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-03-03 21:30     ` Ezequiel Garcia
@ 2021-03-15 11:28       ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-15 11:28 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Ezequiel, thanks for the feedback!

On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hello Alex,
>
> Thanks for the patch.
>
> On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > From: Yunfei Dong <yunfei.dong@mediatek.com>
> >
> > Support the stateless codec API that will be used by MT8183.
> >
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > [acourbot: refactor, cleanup and split]
> > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > ---
> >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> >  5 files changed, 503 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> >
> [..]
>
> > +
> > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
>
> This "needed_in_request" is not really required, as controls
> are not volatile, and their value is stored per-context (per-fd).
>
> It's perfectly valid for an application to pass the SPS control
> at the beginning of the sequence, and then omit it
> in further requests.

If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
requests, this boolean only checks that the control has been provided
at least once, and not that it is provided with every request. Without
it we could send a frame to the firmware without e.g. setting an SPS,
which would be a problem.
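
For reference, the table-driven check in fops_media_request_validate()
boils down to the loop sketched here in plain user-space C. The types
and names (struct required_ctrl, id_present(), check_required_ctrls())
are hypothetical stand-ins, not the driver's. Whether the real lookup
matches only controls set in this request or any value already stored
for the context is exactly the open question, so the sketch leaves that
behind id_present().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mtk_stateless_control. */
struct required_ctrl {
	uint32_t id;
	uint32_t codec_type;
	bool needed_in_request;
};

/* Hypothetical stand-in for the lookup on the request's control handler. */
static bool id_present(const uint32_t *ids, size_t n_ids, uint32_t id)
{
	size_t i;

	for (i = 0; i < n_ids; i++)
		if (ids[i] == id)
			return true;

	return false;
}

/* Return 0 if every required control for this codec is present. */
static int check_required_ctrls(const struct required_ctrl *tbl, size_t n_tbl,
				uint32_t codec,
				const uint32_t *present, size_t n_present)
{
	size_t i;

	for (i = 0; i < n_tbl; i++) {
		/* Controls belonging to another codec are irrelevant. */
		if (tbl[i].codec_type != codec)
			continue;
		/* Optional controls never fail validation. */
		if (!tbl[i].needed_in_request)
			continue;

		if (!id_present(present, n_present, tbl[i].id))
			return -ENOENT;
	}

	return 0;
}
```

With this shape, dropping needed_in_request from an entry only changes
which rows are skipped; the rest of the validation stays the same.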

>
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > +                       .menu_skip_mask =
> > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +       },
> > +};
>
> Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> the driver supports. From a next patch, this case seems to be
> V4L2_STATELESS_H264_START_CODE_ANNEX_B.

Indeed - I've added the control, thanks for catching this!

>
> > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > +
> > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > +               .type = MTK_FMT_DEC,
> > +               .num_planes = 1,
> > +       },
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_MM21,
> > +               .type = MTK_FMT_FRAME,
> > +               .num_planes = 2,
> > +       },
> > +};
> > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > +#define DEFAULT_OUT_FMT_IDX    0
> > +#define DEFAULT_CAP_FMT_IDX    1
> > +
> > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > +               .stepwise = {
> > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > +               },
> > +       },
> > +};
> > +
> > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > +
> > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > +                                              struct vdec_fb *fb)
> > +{
> > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > +
> > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +               unsigned int cap_c_size =
> > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > +
> > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > +       }
> > +}
> > +
> > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > +{
> > +       struct mtk_video_dec_buf *framebuf =
> > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > +
> > +       pfb = &framebuf->frame_buffer;
> > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
>
> Are you sure you need a CPU mapping? It seems strange.
> I'll comment some more on the next patch(es).

I'll answer on the next patch since this is where that mapping is being used.

>
> > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > +
> > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > +               pfb->base_c.dma_addr =
> > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > +       }
> > +       mtk_v4l2_debug(1,
> > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > +               dst_buf->index, pfb,
> > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > +               ctx->decoded_frame_cnt);
> > +
> > +       return pfb;
> > +}
> > +
> > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +
> > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > +}
> > +
> > +static int fops_media_request_validate(struct media_request *mreq)
> > +{
> > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > +       struct mtk_vcodec_ctx *ctx = NULL;
> > +       struct media_request_object *req_obj;
> > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > +       struct v4l2_ctrl *ctrl;
> > +       unsigned int i;
> > +
> > +       switch (buffer_cnt) {
> > +       case 1:
> > +               /* We expect exactly one buffer with the request */
> > +               break;
> > +       case 0:
> > +               mtk_v4l2_err("No buffer provided with the request");
> > +               return -ENOENT;
> > +       default:
> > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > +                            buffer_cnt);
> > +               return -EINVAL;
> > +       }
> > +
> > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > +               struct vb2_buffer *vb;
> > +
> > +               if (vb2_request_object_is_buffer(req_obj)) {
> > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +                       break;
> > +               }
> > +       }
> > +
> > +       if (!ctx) {
> > +               mtk_v4l2_err("Cannot find buffer for request");
> > +               return -ENOENT;
> > +       }
> > +
> > +       parent_hdl = &ctx->ctrl_hdl;
> > +
> > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > +       if (!hdl) {
> > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > +               return -ENOENT;
> > +       }
> > +
> > +       for (i = 0; i < NUM_CTRLS; i++) {
> > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > +                       continue;
> > +               if (!mtk_stateless_controls[i].needed_in_request)
> > +                       continue;
> > +
> > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > +                                         mtk_stateless_controls[i].cfg.id);
> > +               if (!ctrl) {
> > +                       mtk_v4l2_err("Missing required codec control\n");
> > +                       return -ENOENT;
> > +               }
> > +       }
> > +
> > +       v4l2_ctrl_request_hdl_put(hdl);
> > +
> > +       return vb2_request_validate(mreq);
> > +}
> > +
> > +static void mtk_vdec_worker(struct work_struct *work)
> > +{
> > +       struct mtk_vcodec_ctx *ctx =
> > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > +       struct vb2_buffer *vb2_src;
> > +       struct mtk_vcodec_mem *bs_src;
> > +       struct mtk_video_dec_buf *dec_buf_src;
> > +       struct media_request *src_buf_req;
> > +       struct vdec_fb *dst_buf;
> > +       bool res_chg = false;
> > +       int ret;
> > +
> > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > +       if (vb2_v4l2_src == NULL) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > +               return;
> > +       }
> > +
> > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > +       if (vb2_v4l2_dst == NULL) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > +               return;
> > +       }
> > +
> > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > +                                  m2m_buf.vb);
> > +       bs_src = &dec_buf_src->bs_buffer;
> > +
> > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > +                       ctx->id, src_buf->vb2_queue->type,
> > +                       src_buf->index, src_buf, src_buf_info);
> > +
> > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > +       if (!bs_src->va) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > +                            vb2_src->index);
> > +               return;
> > +       }
> > +
> > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > +       /* Apply request controls. */
> > +       src_buf_req = vb2_src->req_obj.req;
> > +       if (src_buf_req)
> > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > +       else
> > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > +
> > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > +       if (ret) {
> > +               mtk_v4l2_err(
> > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > +                       ctx->id, vb2_src->index, bs_src->size,
> > +                       vb2_src->timestamp, ret, res_chg);
> > +               if (ret == -EIO) {
> > +                       mutex_lock(&ctx->lock);
> > +                       dec_buf_src->error = true;
> > +                       mutex_unlock(&ctx->lock);
> > +               }
> > +       }
> > +
> > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > +
> > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > +
> > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > +}
> > +
> > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > +
> > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > +                       ctx->id, vb->vb2_queue->type,
> > +                       vb->index, vb);
> > +
> > +       mutex_lock(&ctx->lock);
> > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > +       mutex_unlock(&ctx->lock);
> > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > +               return;
> > +
> > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > +               vb->vb2_queue->type, vb->index, src_buf);
> > +
> > +       /* If an OUTPUT buffer, we may need to update the state */
> > +       if (ctx->state == MTK_STATE_INIT) {
> > +               ctx->state = MTK_STATE_HEADER;
> > +               mtk_v4l2_debug(1, "Init driver from init to header.");
>
> This state thing seems just something to make the rest
> of the stateful-based driver happy, right?

Correct - if anything we should either use more of the state here
(i.e. set the error state when relevant) or move the state entirely into
the stateful part of the driver.

>
> Makes me wonder a bit if just splitting the stateless part to its
> own driver, wouldn't make your maintenance easier.
>
> What's the motivation for sharing the driver?

Technically you could do it either way. Separating the driver would
result in duplicated boilerplate code and buffer-management structs
(unless we keep the shared part under another module - but
in this case we are basically in the same situation as now). Also
despite using different userspace-facing ABIs, MT8173 and MT8183
follow a similar architecture and a similar firmware interface.
Considering these similarities it seems simpler from an architectural
point of view to have all the Mediatek codec support under the same
driver. It also probably results in less code.

That being said, the split can probably be improved as you pointed out
with this state variable. But the current split is not too bad IMHO,
at least not worse than how the code was originally.

>
> > +       } else {
> > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > +                               ctx->id, ctx->state);
> > +       }
> > +}
> > +
> > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       bool res_chg;
> > +
> > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > +}
> > +
> > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > +};
> > +
> > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct v4l2_ctrl *ctrl;
> > +       unsigned int i;
> > +
> > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > +       if (ctx->ctrl_hdl.error) {
> > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > +               return ctx->ctrl_hdl.error;
> > +       }
> > +
> > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > +                               &mtk_vcodec_dec_ctrl_ops,
> > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > +                               0, 32, 1, 1);
> > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
>
> Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> to return the DPB size. However, isn't this something userspace already knows?

True, but that's also a control the driver is supposed to provide per
the spec IIUC.

>
> > +
> > +       for (i = 0; i < NUM_CTRLS; i++) {
> > +               struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
> > +
> > +               v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
> > +               if (ctx->ctrl_hdl.error) {
> > +                       mtk_v4l2_err("Adding control %d failed %d",
> > +                                       i, ctx->ctrl_hdl.error);
> > +                       return ctx->ctrl_hdl.error;
> > +               }
> > +       }
> > +
> > +       v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
> > +
> > +       return 0;
> > +}
> > +
> > +const struct media_device_ops mtk_vcodec_media_ops = {
> > +       .req_validate   = fops_media_request_validate,
> > +       .req_queue      = v4l2_m2m_request_queue,
> > +};
> > +
> > +static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct vb2_queue *src_vq;
> > +
> > +       src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
> > +                                V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> > +
> > +       /* Support request api for output plane */
> > +       src_vq->supports_requests = true;
> > +       src_vq->requires_requests = true;
> > +}
> > +
> > +static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
> > +{
>
> I have to admit I do not remember exactly the reason,
> but this should set the buffer field to V4L2_FIELD_NONE.

Right, I see all other drivers are doing this. Done.
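
A rough sketch of the shape of that change, written as stand-alone C so
it can be compiled outside the kernel. The mock_* types are hypothetical
stand-ins for struct vb2_v4l2_buffer and enum v4l2_field; in the driver
the callback would get the buffer via to_vb2_v4l2_buffer(vb) and assign
V4L2_FIELD_NONE to its field member.

```c
/* Hypothetical stand-ins for enum v4l2_field and struct vb2_v4l2_buffer. */
enum mock_field {
	MOCK_FIELD_ANY = 0,
	MOCK_FIELD_NONE = 1,
};

struct mock_vbuf {
	enum mock_field field;
};

/*
 * Mock of an OUTPUT-queue buf_out_validate callback: progressive content
 * only, so force the field to "none" regardless of what userspace passed.
 */
static int out_buf_validate(struct mock_vbuf *vbuf)
{
	vbuf->field = MOCK_FIELD_NONE;
	return 0;
}
```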

Cheers,
Alex.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
@ 2021-03-15 11:28       ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-15 11:28 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Ezequiel, thanks for the feedback!

On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hello Alex,
>
> Thanks for the patch.
>
> On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > From: Yunfei Dong <yunfei.dong@mediatek.com>
> >
> > Support the stateless codec API that will be used by MT8183.
> >
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > [acourbot: refactor, cleanup and split]
> > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > ---
> >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> >  5 files changed, 503 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> >
> [..]
>
> > +
> > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
>
> This "needed_in_request" is not really required, as controls
> are not volatile, and their value is stored per-context (per-fd).
>
> It's perfectly valid for an application to pass the SPS control
> at the beginning of the sequence, and then omit it
> in further requests.

If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
requests, this boolean only checks that the control has been provided
at least once, and not that it is provided with every request. Without
it we could send a frame to the firmware without e.g. setting an SPS,
which would be a problem.

>
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +               .needed_in_request = true,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > +                       .menu_skip_mask =
> > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +       },
> > +       {
> > +               .cfg = {
> > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > +               },
> > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > +       },
> > +};
>
> Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> the driver supports. From a next patch, this case seems to be
> V4L2_STATELESS_H264_START_CODE_ANNEX_B.

Indeed - I've added the control, thanks for catching this!

>
> > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > +
> > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > +               .type = MTK_FMT_DEC,
> > +               .num_planes = 1,
> > +       },
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_MM21,
> > +               .type = MTK_FMT_FRAME,
> > +               .num_planes = 2,
> > +       },
> > +};
> > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > +#define DEFAULT_OUT_FMT_IDX    0
> > +#define DEFAULT_CAP_FMT_IDX    1
> > +
> > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > +       {
> > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > +               .stepwise = {
> > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > +               },
> > +       },
> > +};
> > +
> > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > +
> > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > +                                              struct vdec_fb *fb)
> > +{
> > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > +
> > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +               unsigned int cap_c_size =
> > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > +
> > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > +       }
> > +}
> > +
> > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > +{
> > +       struct mtk_video_dec_buf *framebuf =
> > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > +
> > +       pfb = &framebuf->frame_buffer;
> > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
>
> Are you sure you need a CPU mapping? It seems strange.
> I'll comment some more on the next patch(es).

I'll answer on the next patch since this is where that mapping is being used.

>
> > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > +
> > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > +               pfb->base_c.dma_addr =
> > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > +       }
> > +       mtk_v4l2_debug(1,
> > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > +               dst_buf->index, pfb,
> > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > +               ctx->decoded_frame_cnt);
> > +
> > +       return pfb;
> > +}
> > +
> > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +
> > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > +}
> > +
> > +static int fops_media_request_validate(struct media_request *mreq)
> > +{
> > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > +       struct mtk_vcodec_ctx *ctx = NULL;
> > +       struct media_request_object *req_obj;
> > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > +       struct v4l2_ctrl *ctrl;
> > +       unsigned int i;
> > +
> > +       switch (buffer_cnt) {
> > +       case 1:
> > +               /* We expect exactly one buffer with the request */
> > +               break;
> > +       case 0:
> > +               mtk_v4l2_err("No buffer provided with the request");
> > +               return -ENOENT;
> > +       default:
> > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > +                            buffer_cnt);
> > +               return -EINVAL;
> > +       }
> > +
> > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > +               struct vb2_buffer *vb;
> > +
> > +               if (vb2_request_object_is_buffer(req_obj)) {
> > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +                       break;
> > +               }
> > +       }
> > +
> > +       if (!ctx) {
> > +               mtk_v4l2_err("Cannot find buffer for request");
> > +               return -ENOENT;
> > +       }
> > +
> > +       parent_hdl = &ctx->ctrl_hdl;
> > +
> > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > +       if (!hdl) {
> > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > +               return -ENOENT;
> > +       }
> > +
> > +       for (i = 0; i < NUM_CTRLS; i++) {
> > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > +                       continue;
> > +               if (!mtk_stateless_controls[i].needed_in_request)
> > +                       continue;
> > +
> > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > +                                         mtk_stateless_controls[i].cfg.id);
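> > +               if (!ctrl) {
> > +                       mtk_v4l2_err("Missing required codec control\n");
> > +                       /* Drop the reference taken by v4l2_ctrl_request_hdl_find(). */
> > +                       v4l2_ctrl_request_hdl_put(hdl);
> > +                       return -ENOENT;
> > +               }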
> > +       }
> > +
> > +       v4l2_ctrl_request_hdl_put(hdl);
> > +
> > +       return vb2_request_validate(mreq);
> > +}
> > +
> > +static void mtk_vdec_worker(struct work_struct *work)
> > +{
> > +       struct mtk_vcodec_ctx *ctx =
> > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > +       struct vb2_buffer *vb2_src;
> > +       struct mtk_vcodec_mem *bs_src;
> > +       struct mtk_video_dec_buf *dec_buf_src;
> > +       struct media_request *src_buf_req;
> > +       struct vdec_fb *dst_buf;
> > +       bool res_chg = false;
> > +       int ret;
> > +
> > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > +       if (vb2_v4l2_src == NULL) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > +               return;
> > +       }
> > +
> > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > +       if (vb2_v4l2_dst == NULL) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > +               return;
> > +       }
> > +
> > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > +                                  m2m_buf.vb);
> > +       bs_src = &dec_buf_src->bs_buffer;
> > +
> > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > +                       ctx->id, vb2_src->vb2_queue->type,
> > +                       vb2_src->index, vb2_src, dec_buf_src);
> > +
> > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > +       if (!bs_src->va) {
> > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > +                            vb2_src->index);
> > +               return;
> > +       }
> > +
> > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > +                       ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size,
> > +                       vb2_src);
> > +       /* Apply request controls. */
> > +       src_buf_req = vb2_src->req_obj.req;
> > +       if (src_buf_req)
> > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > +       else
> > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > +
> > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > +       if (ret) {
> > +               mtk_v4l2_err(
> > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > +                       ctx->id, vb2_src->index, bs_src->size,
> > +                       vb2_src->timestamp, ret, res_chg);
> > +               if (ret == -EIO) {
> > +                       mutex_lock(&ctx->lock);
> > +                       dec_buf_src->error = true;
> > +                       mutex_unlock(&ctx->lock);
> > +               }
> > +       }
> > +
> > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > +
> > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > +
> > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > +}
> > +
> > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > +
> > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > +                       ctx->id, vb->vb2_queue->type,
> > +                       vb->index, vb);
> > +
> > +       mutex_lock(&ctx->lock);
> > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > +       mutex_unlock(&ctx->lock);
> > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > +               return;
> > +
> > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > +               vb->vb2_queue->type, vb->index, vb2_v4l2);
> > +
> > +       /* If an OUTPUT buffer, we may need to update the state */
> > +       if (ctx->state == MTK_STATE_INIT) {
> > +               ctx->state = MTK_STATE_HEADER;
> > +               mtk_v4l2_debug(1, "Init driver from init to header.");
>
> This state thing seems just something to make the rest
> of the stateful-based driver happy, right?

Correct - if anything we should either make more use of the state here
(e.g. set the error state when relevant) or move the state entirely
into the stateful part of the driver.

>
> Makes me wonder a bit if just splitting the stateless part to its
> own driver, wouldn't make your maintenance easier.
>
> What's the motivation for sharing the driver?

Technically you could do it both ways. Separating the driver would
mean duplicating some boilerplate code and buffer-management structs
(unless we keep the shared part in another module - but in that case
we are basically in the same situation as now). Also, despite using
different userspace-facing ABIs, MT8173 and MT8183 follow a similar
architecture and a similar firmware interface. Considering these
similarities, it seems simpler from an architectural point of view to
keep all the Mediatek codec support under the same driver. It also
probably results in less code.

That being said, the split can probably be improved as you pointed out
with this state variable. But the current split is not too bad IMHO,
at least not worse than how the code was originally.
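
To make the shape of that split concrete, here is a rough userspace
sketch (not the driver's actual code) of a shared core dispatching
through a per-flavor ops table. Every name in it (enum chip,
struct codec_ops, stateful_ops, stateless_ops, pick_ops,
needs_requests) is hypothetical and only meant for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chip identifiers, for illustration only. */
enum chip { CHIP_MT8173, CHIP_MT8183 };

/* Hypothetical per-flavor operations; the real driver's table differs. */
struct codec_ops {
	const char *name;
	int uses_requests;	/* 1 if OUTPUT buffers go through media requests */
};

static const struct codec_ops stateful_ops = {
	.name = "stateful",
	.uses_requests = 0,
};

static const struct codec_ops stateless_ops = {
	.name = "stateless",
	.uses_requests = 1,
};

/* The shared core picks a table once instead of branching on the chip everywhere. */
static const struct codec_ops *pick_ops(enum chip chip)
{
	return chip == CHIP_MT8183 ? &stateless_ops : &stateful_ops;
}

/* Example of shared code that only talks to the ops table. */
static int needs_requests(enum chip chip)
{
	const struct codec_ops *ops = pick_ops(chip);

	assert(ops != NULL);
	return ops->uses_requests;
}
```

With this layout the buffer-management code stays in one place, and
only the table contents differ between the two flavors.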

>
> > +       } else {
> > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > +                               ctx->id, ctx->state);
> > +       }
> > +}
> > +
> > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       bool res_chg;
> > +
> > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > +}
> > +
> > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > +};
> > +
> > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct v4l2_ctrl *ctrl;
> > +       unsigned int i;
> > +
> > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > +       if (ctx->ctrl_hdl.error) {
> > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > +               return ctx->ctrl_hdl.error;
> > +       }
> > +
> > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > +                               &mtk_vcodec_dec_ctrl_ops,
> > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > +                               0, 32, 1, 1);
> > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
>
> Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> to return the DPB size. However, isn't this something userspace already knows?

True, but that's also a control the driver is supposed to provide per
the spec IIUC.

>
> > +
> > +       for (i = 0; i < NUM_CTRLS; i++) {
> > +               struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
> > +
> > +               v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
> > +               if (ctx->ctrl_hdl.error) {
> > +                       mtk_v4l2_err("Adding control %d failed %d",
> > +                                       i, ctx->ctrl_hdl.error);
> > +                       return ctx->ctrl_hdl.error;
> > +               }
> > +       }
> > +
> > +       v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
> > +
> > +       return 0;
> > +}
> > +
> > +const struct media_device_ops mtk_vcodec_media_ops = {
> > +       .req_validate   = fops_media_request_validate,
> > +       .req_queue      = v4l2_m2m_request_queue,
> > +};
> > +
> > +static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct vb2_queue *src_vq;
> > +
> > +       src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
> > +                                V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> > +
> > +       /* Support request api for output plane */
> > +       src_vq->supports_requests = true;
> > +       src_vq->requires_requests = true;
> > +}
> > +
> > +static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
> > +{
>
> I have to admit I do not remember exactly the reason,
> but this should set the buffer field to V4L2_FIELD_NONE.

Right, I see all other drivers are doing this. Done.
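
For reference, the pattern looks roughly like the sketch below. It is
a userspace model with stand-in types (struct fake_buffer and the
FAKE_FIELD_* names are hypothetical); only the numeric values mirror
the V4L2_FIELD_* enum. In the driver the assignment would go into the
vb2 buf_out_validate callback.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for V4L2_FIELD_ANY / V4L2_FIELD_NONE / V4L2_FIELD_INTERLACED. */
enum fake_v4l2_field {
	FAKE_FIELD_ANY = 0,
	FAKE_FIELD_NONE = 1,
	FAKE_FIELD_INTERLACED = 4,
};

/* Stand-in for the buffer object handed to the validate callback. */
struct fake_buffer {
	enum fake_v4l2_field field;
};

/* Whatever userspace passed in, OUTPUT buffers are treated as progressive. */
static int out_buf_validate(struct fake_buffer *buf)
{
	assert(buf != NULL);
	buf->field = FAKE_FIELD_NONE;
	return 0;
}
```

The callback never rejects the buffer; it only normalizes the field.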

Cheers,
Alex.


* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-03-03 21:47     ` Ezequiel Garcia
@ 2021-03-15 11:28       ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-15 11:28 UTC (permalink / raw)
  To: Ezequiel Garcia, Yunfei Dong
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Mauro Carvalho Chehab,
	Hans Verkuil, linux-media, Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Ezequiel,

On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
>  Hi Alex,
>
> Thanks for the patch.
>
> On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > From: Yunfei Dong <yunfei.dong@mediatek.com>
> >
> > Add support for H.264 decoding using the stateless API, as supported by
> > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > builders.
> >
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > [acourbot: refactor, cleanup and split]
> > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > ---
> >  drivers/media/platform/Kconfig                |   1 +
> >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> >  5 files changed, 813 insertions(+)
> >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index fd1831e97b22..c27db5643712 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> >         select V4L2_MEM2MEM_DEV
> >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > +       select V4L2_H264
> >         help
> >           Mediatek video codec driver provides HW capability to
> >           encode and decode in a range of video formats on MT8173
> > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> >                 vdec/vdec_vp8_if.o \
> >                 vdec/vdec_vp9_if.o \
> > +               vdec/vdec_h264_req_if.o \
> >                 mtk_vcodec_dec_drv.o \
> >                 vdec_drv_if.o \
> >                 vdec_vpu_if.o \
> > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > new file mode 100644
> > index 000000000000..2fbbfbbcfbec
> > --- /dev/null
> > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > @@ -0,0 +1,807 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <media/v4l2-mem2mem.h>
> > +#include <media/v4l2-h264.h>
> > +#include <media/videobuf2-dma-contig.h>
> > +
> > +#include "../vdec_drv_if.h"
> > +#include "../mtk_vcodec_util.h"
> > +#include "../mtk_vcodec_dec.h"
> > +#include "../mtk_vcodec_intr.h"
> > +#include "../vdec_vpu_if.h"
> > +#include "../vdec_drv_base.h"
> > +
> > +#define NAL_NON_IDR_SLICE                      0x01
> > +#define NAL_IDR_SLICE                          0x05
> > +#define NAL_H264_PPS                           0x08
>
> Not used?
>
> > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > +
>
> I believe you may not need the NAL type.

True, removed this block of defines.

>
> > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > +#define MB_UNIT_LEN                            16
> > +
> > +/* get used parameters for sps/pps */
> > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > +#define GET_MTK_VDEC_PARAM(param) \
> > +       { dst_param->param = src_param->param; }
> > +/* motion vector size (bytes) for every macro block */
> > +#define HW_MB_STORE_SZ                         64
> > +
> > +#define H264_MAX_FB_NUM                                17
> > +#define H264_MAX_MV_NUM                                32
> > +#define HDR_PARSING_BUF_SZ                     1024
> > +
> > +/**
> > + * struct mtk_h264_dpb_info  - h264 dpb information
> > + * @y_dma_addr: Y bitstream physical address
> > + * @c_dma_addr: CbCr bitstream physical address
> > + * @reference_flag: reference picture flag (short/long term reference picture)
> > + * @field: field picture flag
> > + */
> > +struct mtk_h264_dpb_info {
> > +       dma_addr_t y_dma_addr;
> > +       dma_addr_t c_dma_addr;
> > +       int reference_flag;
> > +       int field;
> > +};
> > +
> > +/**
> > + * struct mtk_h264_sps_param  - parameters for sps
> > + */
> > +struct mtk_h264_sps_param {
> > +       unsigned char chroma_format_idc;
> > +       unsigned char bit_depth_luma_minus8;
> > +       unsigned char bit_depth_chroma_minus8;
> > +       unsigned char log2_max_frame_num_minus4;
> > +       unsigned char pic_order_cnt_type;
> > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > +       unsigned char max_num_ref_frames;
> > +       unsigned char separate_colour_plane_flag;
> > +       unsigned short pic_width_in_mbs_minus1;
> > +       unsigned short pic_height_in_map_units_minus1;
> > +       unsigned int max_frame_nums;
> > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > +       unsigned char delta_pic_order_always_zero_flag;
> > +       unsigned char frame_mbs_only_flag;
> > +       unsigned char mb_adaptive_frame_field_flag;
> > +       unsigned char direct_8x8_inference_flag;
> > +       unsigned char reserved[3];
> > +};
> > +
> > +/**
> > + * struct mtk_h264_pps_param  - parameters for pps
> > + */
> > +struct mtk_h264_pps_param {
> > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > +       unsigned char weighted_bipred_idc;
> > +       char pic_init_qp_minus26;
> > +       char chroma_qp_index_offset;
> > +       char second_chroma_qp_index_offset;
> > +       unsigned char entropy_coding_mode_flag;
> > +       unsigned char pic_order_present_flag;
> > +       unsigned char deblocking_filter_control_present_flag;
> > +       unsigned char constrained_intra_pred_flag;
> > +       unsigned char weighted_pred_flag;
> > +       unsigned char redundant_pic_cnt_present_flag;
> > +       unsigned char transform_8x8_mode_flag;
> > +       unsigned char scaling_matrix_present_flag;
> > +       unsigned char reserved[2];
> > +};
> > +
> > +struct slice_api_h264_scaling_matrix {
>
> Equal to v4l2_ctrl_h264_scaling_matrix ?
> Well I guess you don't want to mix a hardware-specific
> thing with the V4L2 API maybe.

That's the idea. Although the layouts match and the ABI is now stable,
I think this better communicates the fact that this is a firmware
structure.
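
If we ever want the compiler to back up the "layouts match" claim
while keeping the two types separate, something along these lines
could work. This is a standalone sketch, not part of the patch:
struct api_scaling_matrix is a stand-in for the V4L2 control payload,
struct fw_scaling_matrix is a stand-in for the firmware-side copy, and
in the kernel the checks would be static_assert() or BUILD_BUG_ON()
rather than plain C11 _Static_assert.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the V4L2 control payload (same two arrays as in the patch). */
struct api_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Firmware-side copy, deliberately kept as its own type. */
struct fw_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Fail the build if the two layouts ever drift apart. */
_Static_assert(sizeof(struct fw_scaling_matrix) ==
	       sizeof(struct api_scaling_matrix),
	       "firmware and API scaling matrix sizes differ");
_Static_assert(offsetof(struct fw_scaling_matrix, scaling_list_8x8) ==
	       offsetof(struct api_scaling_matrix, scaling_list_8x8),
	       "firmware and API scaling matrix offsets differ");
```

The memcpy()-based conversion then stays valid as long as the build
passes.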

>
> > +       unsigned char scaling_list_4x4[6][16];
> > +       unsigned char scaling_list_8x8[6][64];
> > +};
> > +
> > +struct slice_h264_dpb_entry {
> > +       unsigned long long reference_ts;
> > +       unsigned short frame_num;
> > +       unsigned short pic_num;
> > +       /* Note that field is indicated by v4l2_buffer.field */
> > +       int top_field_order_cnt;
> > +       int bottom_field_order_cnt;
> > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > +};
> > +
> > +/**
> > + * struct slice_api_h264_decode_param - parameters for decode.
> > + */
> > +struct slice_api_h264_decode_param {
> > +       struct slice_h264_dpb_entry dpb[16];
>
> V4L2_H264_NUM_DPB_ENTRIES?

For the same reason as above (this being a firmware structure), I
think it is clearer to not use the kernel definitions here.

>
> > +       unsigned short num_slices;
> > +       unsigned short nal_ref_idc;
> > +       unsigned char ref_pic_list_p0[32];
> > +       unsigned char ref_pic_list_b0[32];
> > +       unsigned char ref_pic_list_b1[32];
>
> V4L2_H264_REF_LIST_LEN?

Ditto.

>
> > +       int top_field_order_cnt;
> > +       int bottom_field_order_cnt;
> > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > +};
> > +
> > +/**
> > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > + */
> > +struct mtk_h264_dec_slice_param {
> > +       struct mtk_h264_sps_param                       sps;
> > +       struct mtk_h264_pps_param                       pps;
> > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > +       struct slice_api_h264_decode_param              decode_params;
> > +       struct mtk_h264_dpb_info h264_dpb_info[16];
>
> V4L2_H264_NUM_DPB_ENTRIES?

Ditto.

>
> > +};
> > +
> > +/**
> > + * struct h264_fb - h264 decode frame buffer information
> > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > + * @poc         : picture order count of frame buffer
> > + * @reserved    : for 8 bytes alignment
> > + */
> > +struct h264_fb {
> > +       uint64_t vdec_fb_va;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +       int32_t poc;
> > +       uint32_t reserved;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_dec_info - decode information
> > + * @dpb_sz             : decoding picture buffer size
> > + * @resolution_changed  : resolution change happened
> > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > + * @cap_num_planes     : number planes of capture buffer
> > + * @bs_dma             : Input bit-stream buffer dma address
> > + * @y_fb_dma           : Y frame buffer dma address
> > + * @c_fb_dma           : C frame buffer dma address
> > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > + */
> > +struct vdec_h264_dec_info {
> > +       uint32_t dpb_sz;
> > +       uint32_t resolution_changed;
> > +       uint32_t realloc_mv_buf;
> > +       uint32_t cap_num_planes;
> > +       uint64_t bs_dma;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +       uint64_t vdec_fb_va;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > + *                        between VPU and Host.
> > + *                        The memory is allocated by the VPU, then mapped to
> > + *                        the Host in vpu_dec_init() and freed in
> > + *                        vpu_dec_deinit() by the VPU.
> > + *                        AP-W/R : AP is writer/reader on this item
> > + *                        VPU-W/R: VPU is writer/reader on this item
> > + * @pred_buf_dma : HW working prediction buffer dma address (AP-W, VPU-R)
> > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > + * @dec          : decode information (AP-R, VPU-W)
> > + * @pic          : picture information (AP-R, VPU-W)
> > + * @crop         : crop information (AP-R, VPU-W)
> > + */
> > +struct vdec_h264_vsi {
> > +       uint64_t pred_buf_dma;
> > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > +       struct vdec_h264_dec_info dec;
> > +       struct vdec_pic_info pic;
> > +       struct v4l2_rect crop;
> > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_slice_inst - h264 decoder instance
> > + * @num_nalu : how many NALUs have been decoded
> > + * @ctx      : point to mtk_vcodec_ctx
> > + * @pred_buf : HW working prediction buffer
> > + * @mv_buf   : HW working motion vector buffer
> > + * @vpu      : VPU instance
> > + * @vsi_ctx  : Local VSI data for this decoding context
> > + */
> > +struct vdec_h264_slice_inst {
> > +       unsigned int num_nalu;
> > +       struct mtk_vcodec_ctx *ctx;
> > +       struct mtk_vcodec_mem pred_buf;
> > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > +       struct vdec_vpu_inst vpu;
> > +       struct vdec_h264_vsi vsi_ctx;
> > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > +
> > +       struct v4l2_h264_dpb_entry dpb[16];
> > +};
> > +
> > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > +                                int id)
> > +{
> > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > +
> > +       return ctrl->p_cur.p;
> > +}
> > +
> > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > +                             struct mtk_h264_dec_slice_param *slice_param)
> > +{
> > +       struct vb2_queue *vq;
> > +       struct vb2_buffer *vb;
> > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > +       u64 index;
> > +
> > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > +
> > +       for (index = 0; index < 16; index++) {
>
> Ditto, some macro instead of 16.

Changed this to use ARRAY_SIZE() which is appropriate here.
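
Roughly like this, shown as a standalone sketch. The ARRAY_SIZE()
below is a simplified userspace version (the kernel macro also rejects
pointers at build time), and struct dpb_info, struct slice_param and
clear_reference_flags() are hypothetical stand-ins for the structures
in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace ARRAY_SIZE(); the kernel version adds a type check. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-ins for the firmware structures in the patch. */
struct dpb_info {
	int reference_flag;
};

struct slice_param {
	struct dpb_info h264_dpb_info[16];
};

/* The loop bound follows the array declaration, so no literal 16 is repeated. */
static size_t clear_reference_flags(struct slice_param *p)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < ARRAY_SIZE(p->h264_dpb_info); i++)
		p->h264_dpb_info[i].reference_flag = 0;

	return i;	/* number of entries visited */
}
```

This way the firmware structure keeps its literal size and the loop
cannot go out of sync with it.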

>
> > +               const struct slice_h264_dpb_entry *dpb;
> > +               int vb2_index;
> > +
> > +               dpb = &slice_param->decode_params.dpb[index];
> > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > +                       continue;
> > +               }
> > +
> > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > +               if (vb2_index < 0) {
> > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > +                               index, dpb->reference_ts);
> > +                       continue;
> > +               }
> > +               /* 1 for short term reference, 2 for long term reference */
> > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > +               else
> > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > +
> > +               vb = vq->bufs[vb2_index];
> > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > +
> > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > +               }
> > +       }
> > +}
> > +
> > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > +       const struct v4l2_ctrl_h264_sps *src_param)
> > +{
> > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > +
> > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > +}
> > +
> > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > +       const struct v4l2_ctrl_h264_pps *src_param)
> > +{
> > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > +
> > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > +}
> > +
> > +static void
> > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > +{
> > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > +              sizeof(dst_matrix->scaling_list_4x4));
> > +
> > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > +              sizeof(dst_matrix->scaling_list_8x8));
> > +}
> > +
> > +static void get_h264_decode_parameters(
> > +       struct slice_api_h264_decode_param *dst_params,
> > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > +
> > +               dst_entry->reference_ts = src_entry->reference_ts;
> > +               dst_entry->frame_num = src_entry->frame_num;
> > +               dst_entry->pic_num = src_entry->pic_num;
> > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > +               dst_entry->bottom_field_order_cnt =
> > +                       src_entry->bottom_field_order_cnt;
> > +               dst_entry->flags = src_entry->flags;
> > +       }
> > +
> > +       /*
> > +        * num_slices is a leftover from the old H.264 support and is ignored
> > +        * by the firmware.
> > +        */
> > +       dst_params->num_slices = 0;
> > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > +       dst_params->flags = src_params->flags;
> > +}
> > +
> > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > +                           const struct v4l2_h264_dpb_entry *b)
> > +{
> > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > +}
> > +
> > +/*
> > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > + * into the already existing slot in dpb, and move other entries into new slots.
> > + *
> > + * This function is an adaptation of the similarly-named function in
> > + * hantro_h264.c.
> > + */
> > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > +                      struct v4l2_h264_dpb_entry *dpb)
> > +{
> > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       unsigned int i, j;
> > +
> > +       /* Disable all entries by default, and mark the ones in use. */
> > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > +                       set_bit(i, in_use);
> > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > +       }
> > +
> > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > +
> > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > +                       continue;
> > +
> > +               /*
> > +                * To cut off some comparisons, iterate only on target DPB
> > +                * entries that were already used.
> > +                */
> > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > +                       struct v4l2_h264_dpb_entry *cdpb;
> > +
> > +                       cdpb = &dpb[j];
> > +                       if (!dpb_entry_match(cdpb, ndpb))
> > +                               continue;
> > +
> > +                       *cdpb = *ndpb;
> > +                       set_bit(j, used);
> > +                       /* Don't reiterate on this one. */
> > +                       clear_bit(j, in_use);
> > +                       break;
> > +               }
> > +
> > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > +                       set_bit(i, new);
> > +       }
> > +
> > +       /* For entries that could not be matched, use remaining free slots. */
> > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > +               struct v4l2_h264_dpb_entry *cdpb;
> > +
> > +               /*
> > +                * Both arrays are of the same sizes, so there is no way
> > +                * we can end up with no space in target array, unless
> > +                * something is buggy.
> > +                */
> > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > +                       return;
> > +
> > +               cdpb = &dpb[j];
> > +               *cdpb = *ndpb;
> > +               set_bit(j, used);
> > +       }
> > +}
> > +
> > +/*
> > + * The firmware expects unused reflist entries to have the value 0x20.
> > + */
> > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > +{
> > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > +}
> > +
> > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > +{
> > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > +       const struct v4l2_ctrl_h264_sps *sps =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > +       const struct v4l2_ctrl_h264_pps *pps =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > +       struct v4l2_h264_reflist_builder reflist_builder;
> > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > +       int i;
> > +
> > +       update_dpb(dec_params, inst->dpb);
> > +
> > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > +                                  inst->dpb);
> > +       get_h264_dpb_list(inst, slice_param);
> > +
> > +       /* Prepare the fields for our reference lists */
> > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > +       /* Build the reference lists */
> > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > +                                      inst->dpb);
> > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > +       /* Adapt the built lists to the firmware's expectations */
> > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > +
> > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > +}
> > +
> > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > +{
> > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > +
> > +       return HW_MB_STORE_SZ * unit_size;
> > +}
> > +
> > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       int err = 0;
> > +
> > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > +       if (err) {
> > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > +               return err;
> > +       }
> > +
> > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > +       return 0;
> > +}
> > +
> > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +
> > +       mtk_vcodec_debug_enter(inst);
> > +
> > +       inst->vsi_ctx.pred_buf_dma = 0;
> > +       mem = &inst->pred_buf;
> > +       if (mem->va)
> > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > +}
> > +
> > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > +       struct vdec_pic_info *pic)
> > +{
> > +       int i;
> > +       int err;
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > +
> > +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > +               mem = &inst->mv_buf[i];
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > +               mem->size = buf_sz;
> > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > +               if (err) {
> > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > +                       return err;
> > +               }
> > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       int i;
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +
> > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > +               mem = &inst->mv_buf[i];
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > +       }
> > +}
> > +
> > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > +                        struct vdec_pic_info *pic)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > +
> > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > +       inst->vsi_ctx.dec.cap_num_planes =
> > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > +
> > +       pic = &ctx->picinfo;
> > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > +               ctx->picinfo.fb_sz[1]);
> > +
> > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > +               inst->vsi_ctx.dec.resolution_changed = true;
> > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > +
> > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > +                       inst->vsi_ctx.dec.resolution_changed,
> > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > +                       ctx->last_decoded_picinfo.pic_w,
> > +                       ctx->last_decoded_picinfo.pic_h,
> > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > +       }
> > +}
> > +
> > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > +       struct v4l2_rect *cr)
> > +{
> > +       cr->left = inst->vsi_ctx.crop.left;
> > +       cr->top = inst->vsi_ctx.crop.top;
> > +       cr->width = inst->vsi_ctx.crop.width;
> > +       cr->height = inst->vsi_ctx.crop.height;
> > +
> > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > +                        cr->left, cr->top, cr->width, cr->height);
> > +}
> > +
> > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > +       unsigned int *dpb_sz)
> > +{
> > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > +}
> > +
> > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct vdec_h264_slice_inst *inst = NULL;
> > +       int err;
> > +
> > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > +       if (!inst)
> > +               return -ENOMEM;
> > +
> > +       inst->ctx = ctx;
> > +
> > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > +       inst->vpu.ctx = ctx;
> > +
> > +       err = vpu_dec_init(&inst->vpu);
> > +       if (err) {
> > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > +               goto error_free_inst;
> > +       }
> > +
> > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > +       inst->vsi_ctx.dec.resolution_changed = true;
> > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > +
> > +       err = allocate_predication_buf(inst);
> > +       if (err)
> > +               goto error_deinit;
> > +
> > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > +               sizeof(struct mtk_h264_sps_param),
> > +               sizeof(struct mtk_h264_pps_param),
> > +               sizeof(struct mtk_h264_dec_slice_param),
> > +               sizeof(struct mtk_h264_dpb_info));
> > +
> > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > +
> > +       ctx->drv_handle = inst;
> > +       return 0;
> > +
> > +error_deinit:
> > +       vpu_dec_deinit(&inst->vpu);
> > +
> > +error_free_inst:
> > +       kfree(inst);
> > +       return err;
> > +}
> > +
> > +static void vdec_h264_slice_deinit(void *h_vdec)
> > +{
> > +       struct vdec_h264_slice_inst *inst =
> > +               (struct vdec_h264_slice_inst *)h_vdec;
> > +
> > +       mtk_vcodec_debug_enter(inst);
> > +
> > +       vpu_dec_deinit(&inst->vpu);
> > +       free_predication_buf(inst);
> > +       free_mv_buf(inst);
> > +
> > +       kfree(inst);
> > +}
> > +
> > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > +{
> > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > +               return 3;
> > +
> > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > +           data[3] == 1)
> > +               return 4;
> > +
> > +       return -1;
> > +}
> > +
> > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > +                                 struct vdec_fb *fb, bool *res_chg)
> > +{
> > +       struct vdec_h264_slice_inst *inst =
> > +               (struct vdec_h264_slice_inst *)h_vdec;
> > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > +       struct mtk_video_dec_buf *src_buf_info;
> > +       int nal_start_idx = 0, err = 0;
> > +       uint32_t nal_type, data[2];
> > +       unsigned char *buf;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +
> > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > +
> > +       /* bs NULL means flush decoder */
> > +       if (bs == NULL)
> > +               return vpu_dec_reset(vpu);
> > +
> > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > +
> > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > +
> > +       buf = (unsigned char *)bs->va;
>
> I can be completely wrong, but it would seem here
> is where the CPU mapping is used.

I think you're right. :)

>
> > +       nal_start_idx = find_start_code(buf, bs->size);
> > +       if (nal_start_idx < 0)
> > +               goto err_free_fb_out;
> > +
> > +       data[0] = bs->size;
> > +       data[1] = buf[nal_start_idx];
> > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
>
> Which seems to be used to parse the NAL type. But shouldn't
> you expect here VCL NALUs only?
>
> I.e. you only get IDR or non-IDR frames, marked with
> V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.

Yep, that's true. And as a matter of fact I can remove `nal_type` (and
the test using it below) and the driver is just as happy.
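
For reference, here is a minimal standalone sketch of the parsing under
discussion, which is the only reason the CPU mapping is touched here. It
mirrors find_start_code() and NAL_TYPE() from the quoted patch;
first_nal_type() is a hypothetical helper added only for illustration
and is not part of the driver.

```c
#include <stddef.h>

/* The low five bits of the NAL header byte hold nal_unit_type. */
#define NAL_TYPE(value) ((value) & 0x1F)

/*
 * Return the length of the Annex B start code at the beginning of data
 * (3 or 4 bytes), or -1 if there is none. Same logic as the quoted helper.
 */
static int find_start_code(const unsigned char *data, size_t data_sz)
{
	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/*
 * Hypothetical helper: return the NAL unit type of the first NAL unit
 * in the buffer, or -1 if the buffer does not begin with a start code.
 */
static int first_nal_type(const unsigned char *data, size_t data_sz)
{
	int idx = find_start_code(data, data_sz);

	if (idx < 0)
		return -1;

	return NAL_TYPE(data[idx]);
}
```

The byte read at data[idx] is the only bitstream content the CPU needs
in this path.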

>
> > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > +                        nal_type);
> > +
> > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > +
> > +       get_vdec_decode_parameters(inst);
> > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > +       if (*res_chg) {
> > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > +                       if (err)
> > +                               goto err_free_fb_out;
> > +               }
> > +               *res_chg = false;
> > +       }
> > +
> > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > +       err = vpu_dec_start(vpu, data, 2);
>
> Then it seems this 2-bytes are passed to the firmware. Maybe you
> could test if that can be derived without the CPU mapping.
> That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.

This one is a bit trickier. It seems the NAL type is passed to the
firmware as part of the decode request, which should not be needed at
all since the firmware can read it from the buffer itself. Just for fun
I tried setting this parameter unconditionally to 0x1 (non-IDR picture),
and all I get is green frames with seemingly random garbage. If I set it
to 0x5 (IDR picture) I also get green frames with a different kind of
garbage, and once in a while a properly rendered frame (presumably when
it is *really* an IDR frame).

So, mmm, I'm afraid we cannot decode properly without this information
and thus without the mapping, unless Yunfei can tell us of a way to
achieve this. Yunfei, do you have any idea?
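
One avenue that might be worth trying, sketched below under stated
assumptions: the quoted code passes the whole NAL header byte in
data[1], so forcing it to 0x1 or 0x5 also zeroes the two nal_ref_idc
bits. The decode parameters control already carries nal_ref_idc and the
IDR flag, so the byte could in principle be rebuilt without reading the
buffer. This is untested against the firmware, and whether the firmware
is satisfied with it is an open question. The struct and flag below are
local stand-ins for struct v4l2_ctrl_h264_decode_params and
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC, and rebuild_nal_header() is a
hypothetical name.

```c
#include <stdint.h>

/* Local stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC (assumption). */
#define SKETCH_DECODE_PARAM_FLAG_IDR_PIC	0x01

#define NAL_NON_IDR_SLICE			0x01
#define NAL_IDR_SLICE				0x05

/* Hypothetical subset of struct v4l2_ctrl_h264_decode_params. */
struct sketch_decode_params {
	uint16_t nal_ref_idc;
	uint32_t flags;
};

/*
 * Rebuild the first byte of the NAL unit from control data:
 * forbidden_zero_bit (0) | nal_ref_idc (2 bits) | nal_unit_type (5 bits).
 */
static uint32_t rebuild_nal_header(const struct sketch_decode_params *p)
{
	uint32_t type = (p->flags & SKETCH_DECODE_PARAM_FLAG_IDR_PIC) ?
			NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	return ((uint32_t)(p->nal_ref_idc & 0x3) << 5) | type;
}
```

If something along these lines works, data[1] would no longer depend on
bs->va, and only the start code check would still need the mapping.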

Cheers,
Alex.

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-15 11:28       ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-15 11:28 UTC (permalink / raw)
  To: Ezequiel Garcia, Yunfei Dong
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Mauro Carvalho Chehab,
	Hans Verkuil, linux-media, Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Ezequiel,

On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
>  Hi Alex,
>
> Thanks for the patch.
>
> On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > From: Yunfei Dong <yunfei.dong@mediatek.com>
> >
> > Add support for H.264 decoding using the stateless API, as supported by
> > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > builders.
> >
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > [acourbot: refactor, cleanup and split]
> > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > ---
> >  drivers/media/platform/Kconfig                |   1 +
> >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> >  5 files changed, 813 insertions(+)
> >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index fd1831e97b22..c27db5643712 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> >         select V4L2_MEM2MEM_DEV
> >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > +       select V4L2_H264
> >         help
> >           Mediatek video codec driver provides HW capability to
> >           encode and decode in a range of video formats on MT8173
> > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> >                 vdec/vdec_vp8_if.o \
> >                 vdec/vdec_vp9_if.o \
> > +               vdec/vdec_h264_req_if.o \
> >                 mtk_vcodec_dec_drv.o \
> >                 vdec_drv_if.o \
> >                 vdec_vpu_if.o \
> > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > new file mode 100644
> > index 000000000000..2fbbfbbcfbec
> > --- /dev/null
> > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > @@ -0,0 +1,807 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <media/v4l2-mem2mem.h>
> > +#include <media/v4l2-h264.h>
> > +#include <media/videobuf2-dma-contig.h>
> > +
> > +#include "../vdec_drv_if.h"
> > +#include "../mtk_vcodec_util.h"
> > +#include "../mtk_vcodec_dec.h"
> > +#include "../mtk_vcodec_intr.h"
> > +#include "../vdec_vpu_if.h"
> > +#include "../vdec_drv_base.h"
> > +
> > +#define NAL_NON_IDR_SLICE                      0x01
> > +#define NAL_IDR_SLICE                          0x05
> > +#define NAL_H264_PPS                           0x08
>
> Not used?
>
> > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > +
>
> I believe you may not need the NAL type.

True, removed this block of defines.

>
> > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > +#define MB_UNIT_LEN                            16
> > +
> > +/* get used parameters for sps/pps */
> > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > +#define GET_MTK_VDEC_PARAM(param) \
> > +       { dst_param->param = src_param->param; }
> > +/* motion vector size (bytes) for every macro block */
> > +#define HW_MB_STORE_SZ                         64
> > +
> > +#define H264_MAX_FB_NUM                                17
> > +#define H264_MAX_MV_NUM                                32
> > +#define HDR_PARSING_BUF_SZ                     1024
> > +
> > +/**
> > + * struct mtk_h264_dpb_info  - h264 dpb information
> > + * @y_dma_addr: Y bitstream physical address
> > + * @c_dma_addr: CbCr bitstream physical address
> > + * @reference_flag: reference picture flag (short/long term reference picture)
> > + * @field: field picture flag
> > + */
> > +struct mtk_h264_dpb_info {
> > +       dma_addr_t y_dma_addr;
> > +       dma_addr_t c_dma_addr;
> > +       int reference_flag;
> > +       int field;
> > +};
> > +
> > +/**
> > + * struct mtk_h264_sps_param  - parameters for sps
> > + */
> > +struct mtk_h264_sps_param {
> > +       unsigned char chroma_format_idc;
> > +       unsigned char bit_depth_luma_minus8;
> > +       unsigned char bit_depth_chroma_minus8;
> > +       unsigned char log2_max_frame_num_minus4;
> > +       unsigned char pic_order_cnt_type;
> > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > +       unsigned char max_num_ref_frames;
> > +       unsigned char separate_colour_plane_flag;
> > +       unsigned short pic_width_in_mbs_minus1;
> > +       unsigned short pic_height_in_map_units_minus1;
> > +       unsigned int max_frame_nums;
> > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > +       unsigned char delta_pic_order_always_zero_flag;
> > +       unsigned char frame_mbs_only_flag;
> > +       unsigned char mb_adaptive_frame_field_flag;
> > +       unsigned char direct_8x8_inference_flag;
> > +       unsigned char reserved[3];
> > +};
> > +
> > +/**
> > + * struct mtk_h264_pps_param  - parameters for pps
> > + */
> > +struct mtk_h264_pps_param {
> > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > +       unsigned char weighted_bipred_idc;
> > +       char pic_init_qp_minus26;
> > +       char chroma_qp_index_offset;
> > +       char second_chroma_qp_index_offset;
> > +       unsigned char entropy_coding_mode_flag;
> > +       unsigned char pic_order_present_flag;
> > +       unsigned char deblocking_filter_control_present_flag;
> > +       unsigned char constrained_intra_pred_flag;
> > +       unsigned char weighted_pred_flag;
> > +       unsigned char redundant_pic_cnt_present_flag;
> > +       unsigned char transform_8x8_mode_flag;
> > +       unsigned char scaling_matrix_present_flag;
> > +       unsigned char reserved[2];
> > +};
> > +
> > +struct slice_api_h264_scaling_matrix {
>
> Equal to v4l2_ctrl_h264_scaling_matrix ?
> Well I guess you don't want to mix a hardware-specific
> thing with the V4L2 API maybe.

That's the idea. Although the layouts match and the ABI is now stable,
I think this better communicates the fact that this is a firmware
structure.
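
If keeping a separate firmware structure, a build-time check can make
the "layouts match" assumption explicit. The sketch below is standalone
C11 and only illustrates the idea; sketch_v4l2_scaling_matrix is a local
stand-in for struct v4l2_ctrl_h264_scaling_matrix, and in the kernel one
would presumably use static_assert() or BUILD_BUG_ON() against the real
uAPI structure instead.

```c
#include <stddef.h>

/* Firmware-side structure, as declared in the quoted patch. */
struct slice_api_h264_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Local stand-in for struct v4l2_ctrl_h264_scaling_matrix (assumption). */
struct sketch_v4l2_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Fail the build if the two layouts ever drift apart. */
_Static_assert(sizeof(struct slice_api_h264_scaling_matrix) ==
	       sizeof(struct sketch_v4l2_scaling_matrix),
	       "firmware and V4L2 scaling matrix sizes differ");
_Static_assert(offsetof(struct slice_api_h264_scaling_matrix,
			scaling_list_8x8) ==
	       offsetof(struct sketch_v4l2_scaling_matrix,
			scaling_list_8x8),
	       "scaling_list_8x8 offsets differ");
```

This keeps the firmware structure visibly distinct while still catching
a mismatch at compile time.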

>
> > +       unsigned char scaling_list_4x4[6][16];
> > +       unsigned char scaling_list_8x8[6][64];
> > +};
> > +
> > +struct slice_h264_dpb_entry {
> > +       unsigned long long reference_ts;
> > +       unsigned short frame_num;
> > +       unsigned short pic_num;
> > +       /* Note that field is indicated by v4l2_buffer.field */
> > +       int top_field_order_cnt;
> > +       int bottom_field_order_cnt;
> > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > +};
> > +
> > +/**
> > + * struct slice_api_h264_decode_param - parameters for decode.
> > + */
> > +struct slice_api_h264_decode_param {
> > +       struct slice_h264_dpb_entry dpb[16];
>
> V4L2_H264_NUM_DPB_ENTRIES?

For the same reason as above (this being a firmware structure), I
think it is clearer to not use the kernel definitions here.

>
> > +       unsigned short num_slices;
> > +       unsigned short nal_ref_idc;
> > +       unsigned char ref_pic_list_p0[32];
> > +       unsigned char ref_pic_list_b0[32];
> > +       unsigned char ref_pic_list_b1[32];
>
> V4L2_H264_REF_LIST_LEN?

Ditto.

>
> > +       int top_field_order_cnt;
> > +       int bottom_field_order_cnt;
> > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > +};
> > +
> > +/**
> > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > + */
> > +struct mtk_h264_dec_slice_param {
> > +       struct mtk_h264_sps_param                       sps;
> > +       struct mtk_h264_pps_param                       pps;
> > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > +       struct slice_api_h264_decode_param              decode_params;
> > +       struct mtk_h264_dpb_info h264_dpb_info[16];
>
> V4L2_H264_NUM_DPB_ENTRIES?

Ditto.

>
> > +};
> > +
> > +/**
> > + * struct h264_fb - h264 decode frame buffer information
> > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > + * @poc         : picture order count of frame buffer
> > + * @reserved    : for 8 bytes alignment
> > + */
> > +struct h264_fb {
> > +       uint64_t vdec_fb_va;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +       int32_t poc;
> > +       uint32_t reserved;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_dec_info - decode information
> > + * @dpb_sz             : decoding picture buffer size
> > + * @resolution_changed  : resoltion change happen
> > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > + * @cap_num_planes     : number planes of capture buffer
> > + * @bs_dma             : Input bit-stream buffer dma address
> > + * @y_fb_dma           : Y frame buffer dma address
> > + * @c_fb_dma           : C frame buffer dma address
> > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > + */
> > +struct vdec_h264_dec_info {
> > +       uint32_t dpb_sz;
> > +       uint32_t resolution_changed;
> > +       uint32_t realloc_mv_buf;
> > +       uint32_t cap_num_planes;
> > +       uint64_t bs_dma;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +       uint64_t vdec_fb_va;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > + *                        between VPU and Host.
> > + *                        The memory is allocated by VPU then mapping to Host
> > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > + *                        by VPU.
> > + *                        AP-W/R : AP is writer/reader on this item
> > + *                        VPU-W/R: VPU is write/reader on this item
> > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > + * @dec          : decode information (AP-R, VPU-W)
> > + * @pic          : picture information (AP-R, VPU-W)
> > + * @crop         : crop information (AP-R, VPU-W)
> > + */
> > +struct vdec_h264_vsi {
> > +       uint64_t pred_buf_dma;
> > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > +       struct vdec_h264_dec_info dec;
> > +       struct vdec_pic_info pic;
> > +       struct v4l2_rect crop;
> > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > +};
> > +
> > +/**
> > + * struct vdec_h264_slice_inst - h264 decoder instance
> > + * @num_nalu : how many nalus be decoded
> > + * @ctx      : point to mtk_vcodec_ctx
> > + * @pred_buf : HW working predication buffer
> > + * @mv_buf   : HW working motion vector buffer
> > + * @vpu      : VPU instance
> > + * @vsi_ctx  : Local VSI data for this decoding context
> > + */
> > +struct vdec_h264_slice_inst {
> > +       unsigned int num_nalu;
> > +       struct mtk_vcodec_ctx *ctx;
> > +       struct mtk_vcodec_mem pred_buf;
> > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > +       struct vdec_vpu_inst vpu;
> > +       struct vdec_h264_vsi vsi_ctx;
> > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > +
> > +       struct v4l2_h264_dpb_entry dpb[16];
> > +};
> > +
> > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > +                                int id)
> > +{
> > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > +
> > +       return ctrl->p_cur.p;
> > +}
> > +
> > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > +                             struct mtk_h264_dec_slice_param *slice_param)
> > +{
> > +       struct vb2_queue *vq;
> > +       struct vb2_buffer *vb;
> > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > +       u64 index;
> > +
> > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > +
> > +       for (index = 0; index < 16; index++) {
>
> Ditto, some macro instead of 16.

Changed this to use ARRAY_SIZE() which is appropriate here.
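
A minimal standalone sketch of that kind of loop, for illustration only:
the bound comes from the firmware structure's own array, so the literal
16 appears once, in the declaration. ARRAY_SIZE() is redefined locally
so that the example builds outside the kernel, and the struct, flag and
count_active() names are hypothetical stand-ins rather than the driver's
definitions.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Local stand-in for V4L2_H264_DPB_ENTRY_FLAG_ACTIVE (assumption). */
#define SKETCH_DPB_ENTRY_FLAG_ACTIVE	0x02

/* Hypothetical, trimmed-down versions of the firmware structures. */
struct sketch_dpb_entry {
	unsigned int flags;
};

struct sketch_decode_param {
	struct sketch_dpb_entry dpb[16];
};

/* Count active DPB entries, bounded by the array's declared size. */
static size_t count_active(const struct sketch_decode_param *param)
{
	size_t i, n = 0;

	for (i = 0; i < ARRAY_SIZE(param->dpb); i++) {
		if (param->dpb[i].flags & SKETCH_DPB_ENTRY_FLAG_ACTIVE)
			n++;
	}

	return n;
}
```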

>
> > +               const struct slice_h264_dpb_entry *dpb;
> > +               int vb2_index;
> > +
> > +               dpb = &slice_param->decode_params.dpb[index];
> > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > +                       continue;
> > +               }
> > +
> > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > +               if (vb2_index < 0) {
> > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > +                               index, dpb->reference_ts);
> > +                       continue;
> > +               }
> > +               /* 1 for short term reference, 2 for long term reference */
> > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > +               else
> > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > +
> > +               vb = vq->bufs[vb2_index];
> > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > +
> > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > +               }
> > +       }
> > +}
> > +
> > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > +       const struct v4l2_ctrl_h264_sps *src_param)
> > +{
> > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > +
> > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > +}
> > +
> > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > +       const struct v4l2_ctrl_h264_pps *src_param)
> > +{
> > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > +
> > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > +}
> > +
> > +static void
> > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > +{
> > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > +              sizeof(dst_matrix->scaling_list_4x4));
> > +
> > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > +              sizeof(dst_matrix->scaling_list_8x8));
> > +}
> > +
> > +static void get_h264_decode_parameters(
> > +       struct slice_api_h264_decode_param *dst_params,
> > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > +
> > +               dst_entry->reference_ts = src_entry->reference_ts;
> > +               dst_entry->frame_num = src_entry->frame_num;
> > +               dst_entry->pic_num = src_entry->pic_num;
> > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > +               dst_entry->bottom_field_order_cnt =
> > +                       src_entry->bottom_field_order_cnt;
> > +               dst_entry->flags = src_entry->flags;
> > +       }
> > +
> > +       // num_slices is a leftover from the old H.264 support and is ignored
> > +       // by the firmware.
> > +       dst_params->num_slices = 0;
> > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > +       dst_params->flags = src_params->flags;
> > +}
> > +
> > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > +                           const struct v4l2_h264_dpb_entry *b)
> > +{
> > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > +}
> > +
> > +/*
> > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > + * into the already existing slot in dpb, and move other entries into new slots.
> > + *
> > + * This function is an adaptation of the similarly-named function in
> > + * hantro_h264.c.
> > + */
> > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > +                      struct v4l2_h264_dpb_entry *dpb)
> > +{
> > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > +       unsigned int i, j;
> > +
> > +       /* Disable all entries by default, and mark the ones in use. */
> > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > +                       set_bit(i, in_use);
> > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > +       }
> > +
> > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > +
> > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > +                       continue;
> > +
> > +               /*
> > +                * To cut off some comparisons, iterate only on target DPB
> > +                * entries which are already used.
> > +                */
> > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > +                       struct v4l2_h264_dpb_entry *cdpb;
> > +
> > +                       cdpb = &dpb[j];
> > +                       if (!dpb_entry_match(cdpb, ndpb))
> > +                               continue;
> > +
> > +                       *cdpb = *ndpb;
> > +                       set_bit(j, used);
> > +                       /* Don't reiterate on this one. */
> > +                       clear_bit(j, in_use);
> > +                       break;
> > +               }
> > +
> > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > +                       set_bit(i, new);
> > +       }
> > +
> > +       /* For entries that could not be matched, use remaining free slots. */
> > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > +               struct v4l2_h264_dpb_entry *cdpb;
> > +
> > +               /*
> > +                * Both arrays are of the same sizes, so there is no way
> > +                * we can end up with no space in target array, unless
> > +                * something is buggy.
> > +                */
> > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > +                       return;
> > +
> > +               cdpb = &dpb[j];
> > +               *cdpb = *ndpb;
> > +               set_bit(j, used);
> > +       }
> > +}
> > +
> > +/*
> > + * The firmware expects unused reflist entries to have the value 0x20.
> > + */
> > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > +{
> > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > +}
> > +
> > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > +{
> > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > +       const struct v4l2_ctrl_h264_sps *sps =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > +       const struct v4l2_ctrl_h264_pps *pps =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > +       struct v4l2_h264_reflist_builder reflist_builder;
> > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > +       int i;
> > +
> > +       update_dpb(dec_params, inst->dpb);
> > +
> > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > +                                  inst->dpb);
> > +       get_h264_dpb_list(inst, slice_param);
> > +
> > +       /* Prepare the fields for our reference lists */
> > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > +       /* Build the reference lists */
> > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > +                                      inst->dpb);
> > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > +       /* Adapt the built lists to the firmware's expectations */
> > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > +
> > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > +}
> > +
> > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > +{
> > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > +
> > +       return HW_MB_STORE_SZ * unit_size;
> > +}
> > +
> > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       int err = 0;
> > +
> > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > +       if (err) {
> > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > +               return err;
> > +       }
> > +
> > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > +       return 0;
> > +}
> > +
> > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +
> > +       mtk_vcodec_debug_enter(inst);
> > +
> > +       inst->vsi_ctx.pred_buf_dma = 0;
> > +       mem = &inst->pred_buf;
> > +       if (mem->va)
> > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > +}
> > +
> > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > +       struct vdec_pic_info *pic)
> > +{
> > +       int i;
> > +       int err;
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > +
> > +       mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > +               mem = &inst->mv_buf[i];
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > +               mem->size = buf_sz;
> > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > +               if (err) {
> > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > +                       return err;
> > +               }
> > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > +{
> > +       int i;
> > +       struct mtk_vcodec_mem *mem = NULL;
> > +
> > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > +               mem = &inst->mv_buf[i];
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > +       }
> > +}
> > +
> > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > +                        struct vdec_pic_info *pic)
> > +{
> > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > +
> > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > +       inst->vsi_ctx.dec.cap_num_planes =
> > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > +
> > +       pic = &ctx->picinfo;
> > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > +               ctx->picinfo.fb_sz[1]);
> > +
> > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > +               inst->vsi_ctx.dec.resolution_changed = true;
> > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > +
> > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > +                       inst->vsi_ctx.dec.resolution_changed,
> > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > +                       ctx->last_decoded_picinfo.pic_w,
> > +                       ctx->last_decoded_picinfo.pic_h,
> > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > +       }
> > +}
> > +
> > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > +       struct v4l2_rect *cr)
> > +{
> > +       cr->left = inst->vsi_ctx.crop.left;
> > +       cr->top = inst->vsi_ctx.crop.top;
> > +       cr->width = inst->vsi_ctx.crop.width;
> > +       cr->height = inst->vsi_ctx.crop.height;
> > +
> > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > +                        cr->left, cr->top, cr->width, cr->height);
> > +}
> > +
> > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > +       unsigned int *dpb_sz)
> > +{
> > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > +}
> > +
> > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > +{
> > +       struct vdec_h264_slice_inst *inst = NULL;
> > +       int err;
> > +
> > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > +       if (!inst)
> > +               return -ENOMEM;
> > +
> > +       inst->ctx = ctx;
> > +
> > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > +       inst->vpu.ctx = ctx;
> > +
> > +       err = vpu_dec_init(&inst->vpu);
> > +       if (err) {
> > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > +               goto error_free_inst;
> > +       }
> > +
> > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > +       inst->vsi_ctx.dec.resolution_changed = true;
> > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > +
> > +       err = allocate_predication_buf(inst);
> > +       if (err)
> > +               goto error_deinit;
> > +
> > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > +               sizeof(struct mtk_h264_sps_param),
> > +               sizeof(struct mtk_h264_pps_param),
> > +               sizeof(struct mtk_h264_dec_slice_param),
> > +               sizeof(struct mtk_h264_dpb_info));
> > +
> > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > +
> > +       ctx->drv_handle = inst;
> > +       return 0;
> > +
> > +error_deinit:
> > +       vpu_dec_deinit(&inst->vpu);
> > +
> > +error_free_inst:
> > +       kfree(inst);
> > +       return err;
> > +}
> > +
> > +static void vdec_h264_slice_deinit(void *h_vdec)
> > +{
> > +       struct vdec_h264_slice_inst *inst =
> > +               (struct vdec_h264_slice_inst *)h_vdec;
> > +
> > +       mtk_vcodec_debug_enter(inst);
> > +
> > +       vpu_dec_deinit(&inst->vpu);
> > +       free_predication_buf(inst);
> > +       free_mv_buf(inst);
> > +
> > +       kfree(inst);
> > +}
> > +
> > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > +{
> > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > +               return 3;
> > +
> > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > +           data[3] == 1)
> > +               return 4;
> > +
> > +       return -1;
> > +}
> > +
> > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > +                                 struct vdec_fb *fb, bool *res_chg)
> > +{
> > +       struct vdec_h264_slice_inst *inst =
> > +               (struct vdec_h264_slice_inst *)h_vdec;
> > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > +       struct mtk_video_dec_buf *src_buf_info;
> > +       int nal_start_idx = 0, err = 0;
> > +       uint32_t nal_type, data[2];
> > +       unsigned char *buf;
> > +       uint64_t y_fb_dma;
> > +       uint64_t c_fb_dma;
> > +
> > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > +
> > +       /* bs NULL means flush decoder */
> > +       if (bs == NULL)
> > +               return vpu_dec_reset(vpu);
> > +
> > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > +
> > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > +
> > +       buf = (unsigned char *)bs->va;
>
> I can be completely wrong, but it would seem here
> is where the CPU mapping is used.

I think you're right. :)

>
> > +       nal_start_idx = find_start_code(buf, bs->size);
> > +       if (nal_start_idx < 0)
> > +               goto err_free_fb_out;
> > +
> > +       data[0] = bs->size;
> > +       data[1] = buf[nal_start_idx];
> > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
>
> Which seems to be used to parse the NAL type. But shouldn't
> you expect only VCL NALUs here?
>
> I.e. you only get IDR or non-IDR frames, marked with
> V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.

Yep, that's true. And as a matter of fact I can remove `nal_type` (and
the test using it below) and the driver is just as happy.

>
> > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > +                        nal_type);
> > +
> > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > +
> > +       get_vdec_decode_parameters(inst);
> > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > +       if (*res_chg) {
> > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > +                       if (err)
> > +                               goto err_free_fb_out;
> > +               }
> > +               *res_chg = false;
> > +       }
> > +
> > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > +       err = vpu_dec_start(vpu, data, 2);
>
> Then it seems these 2 bytes are passed to the firmware. Maybe you
> could test if that can be derived without the CPU mapping.
> That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.

This one is a bit trickier. It seems the NAL type is passed as part of
the decode request to the firmware. Which should be absolutely not
needed since the firmware can check this from the buffer itself. Just
for fun I have tried setting this parameter unconditionally to 0x1
(non-IDR picture) and all I get is green frames with seemingly random
garbage. If I set it to 0x5 (IDR picture) I also get green frames with
a different kind of garbage, and once in a while a properly rendered
frame (presumably when it is *really* an IDR frame).

So, mmm, I'm afraid we cannot decode properly without this information
and thus without the mapping, unless Yunfei can tell us of a way to
achieve this. Yunfei, do you have any idea?

Cheers,
Alex.


* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-03-15 11:28       ` Alexandre Courbot
@ 2021-03-15 15:16         ` Nicolas Dufresne
  -1 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-15 15:16 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Monday, 15 March 2021 at 20:28 +0900, Alexandre Courbot wrote:
> Hi Ezequiel, thanks for the feedback!
> 
> On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> > Hello Alex,
> > 
> > Thanks for the patch.
> > 
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > 
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > 
> > > Support the stateless codec API that will be used by MT8183.
> > > 
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > 
> > [..]
> > 
> > > +
> > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > 
> > This "needed_in_request" is not really required, as controls
> > are not volatile, and their value is stored per-context (per-fd).
> > 
> > It's perfectly valid for an application to pass the SPS control
> > at the beginning of the sequence, and then omit it
> > in further requests.
> 
> If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> requests, this boolean only checks that the control has been provided
> at least once, and not that it is provided with every request. Without
> it we could send a frame to the firmware without e.g. setting an SPS,
> which would be a problem.

In other drivers (Cedrus and RKVDEC) this was actually checking whether the
control was part of the request. I doubt the framework has a state for "being
set once", as controls have no set/unset state. Did you write this code and
test this intended behaviour, or borrow that code from somewhere else?

> 
> > 
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > +                       .menu_skip_mask =
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +};
> > 
> > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > the driver supports. From a next patch, this case seems to be
> > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> 
> Indeed - I've added the control, thanks for catching this!
> 
> > 
> > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > +
> > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .type = MTK_FMT_DEC,
> > > +               .num_planes = 1,
> > > +       },
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > +               .type = MTK_FMT_FRAME,
> > > +               .num_planes = 2,
> > > +       },
> > > +};
> > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > +#define DEFAULT_OUT_FMT_IDX    0
> > > +#define DEFAULT_CAP_FMT_IDX    1
> > > +
> > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .stepwise = {
> > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > +               },
> > > +       },
> > > +};
> > > +
> > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > +
> > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > +                                              struct vdec_fb *fb)
> > > +{
> > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               unsigned int cap_c_size =
> > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +
> > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > +       }
> > > +}
> > > +
> > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > +{
> > > +       struct mtk_video_dec_buf *framebuf =
> > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > +
> > > +       pfb = &framebuf->frame_buffer;
> > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > 
> > Are you sure you need a CPU mapping? It seems strange.
> > I'll comment some more on the next patch(es).
> 
> I'll answer on the next patch since this is where that mapping is being used.
> 
> > 
> > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > +               pfb->base_c.dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +       }
> > > +       mtk_v4l2_debug(1,
> > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > +               dst_buf->index, pfb,
> > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > +               ctx->decoded_frame_cnt);
> > > +
> > > +       return pfb;
> > > +}
> > > +
> > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +
> > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static int fops_media_request_validate(struct media_request *mreq)
> > > +{
> > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > +       struct media_request_object *req_obj;
> > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       switch (buffer_cnt) {
> > > +       case 1:
> > > +               /* We expect exactly one buffer with the request */
> > > +               break;
> > > +       case 0:
> > > +               mtk_v4l2_err("No buffer provided with the request");
> > > +               return -ENOENT;
> > > +       default:
> > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > +                            buffer_cnt);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > +               struct vb2_buffer *vb;
> > > +
> > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +                       break;
> > > +               }
> > > +       }
> > > +
> > > +       if (!ctx) {
> > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       parent_hdl = &ctx->ctrl_hdl;
> > > +
> > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > +       if (!hdl) {
> > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > +                       continue;
> > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > +                       continue;
> > > +
> > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > +                                        mtk_stateless_controls[i].cfg.id);
> > > +               if (!ctrl) {
> > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > +                       return -ENOENT;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > +
> > > +       return vb2_request_validate(mreq);
> > > +}
> > > +
> > > +static void mtk_vdec_worker(struct work_struct *work)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx =
> > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > +       struct vb2_buffer *vb2_src;
> > > +       struct mtk_vcodec_mem *bs_src;
> > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > +       struct media_request *src_buf_req;
> > > +       struct vdec_fb *dst_buf;
> > > +       bool res_chg = false;
> > > +       int ret;
> > > +
> > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_src == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_dst == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > +                                  m2m_buf.vb);
> > > +       bs_src = &dec_buf_src->bs_buffer;
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > +                       ctx->id, src_buf->vb2_queue->type,
> > > +                       src_buf->index, src_buf, src_buf_info);
> > > +
> > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > +       if (!bs_src->va) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > +                            vb2_src->index);
> > > +               return;
> > > +       }
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > +       /* Apply request controls. */
> > > +       src_buf_req = vb2_src->req_obj.req;
> > > +       if (src_buf_req)
> > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > +       else
> > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > +
> > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > +       if (ret) {
> > > +               mtk_v4l2_err(
> > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > +                       vb2_src->timestamp, ret, res_chg);
> > > +               if (ret == -EIO) {
> > > +                       mutex_lock(&ctx->lock);
> > > +                       dec_buf_src->error = true;
> > > +                       mutex_unlock(&ctx->lock);
> > > +               }
> > > +       }
> > > +
> > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > +
> > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > +
> > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > +                       ctx->id, vb->vb2_queue->type,
> > > +                       vb->index, vb);
> > > +
> > > +       mutex_lock(&ctx->lock);
> > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > +       mutex_unlock(&ctx->lock);
> > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > +               return;
> > > +
> > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > +
> > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > +       if (ctx->state == MTK_STATE_INIT) {
> > > +               ctx->state = MTK_STATE_HEADER;
> > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > 
> > This state thing seems just something to make the rest
> > of the stateful-based driver happy, right?
> 
> Correct - if anything we should either use more of the state here
> (i.e. set the error state when relevant) or move the state entirely in
> the stateful part of the driver.
> 
> > 
> > Makes me wonder a bit if just splitting the stateless part to its
> > own driver, wouldn't make your maintenance easier.
> > 
> > What's the motivation for sharing the driver?
> 
> Technically you could do it both ways. Separating the driver would
> result in some boilerplate code and buffer-management structs
> duplication (unless we keep the shared part under another module - but
> in this case we are basically in the same situation as now). Also
> despite using different userspace-facing ABIs, MT8173 and MT8183
> follow a similar architecture and a similar firmware interface.
> Considering these similarities it seems simpler from an architectural
> point of view to have all the Mediatek codec support under the same
> driver. It also probably results in less code.
> 
> That being said, the split can probably be improved as you pointed out
> with this state variable. But the current split is not too bad IMHO,
> at least not worse than how the code was originally.
> 
> > 
> > > +       } else {
> > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > +                               ctx->id, ctx->state);
> > > +       }
> > > +}
> > > +
> > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       bool res_chg;
> > > +
> > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > +}
> > > +
> > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > +};
> > > +
> > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > +       if (ctx->ctrl_hdl.error) {
> > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > +               return ctx->ctrl_hdl.error;
> > > +       }
> > > +
> > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > +                               0, 32, 1, 1);
> > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > 
> > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > to return the DPB size. However, isn't this something userspace already
> > knows?
> 
> True, but that's also a control the driver is supposed to provide per
> the spec IIUC.
> 
> > 
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
> > > +
> > > +               v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
> > > +               if (ctx->ctrl_hdl.error) {
> > > +                       mtk_v4l2_err("Adding control %d failed %d",
> > > +                                       i, ctx->ctrl_hdl.error);
> > > +                       return ctx->ctrl_hdl.error;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +const struct media_device_ops mtk_vcodec_media_ops = {
> > > +       .req_validate   = fops_media_request_validate,
> > > +       .req_queue      = v4l2_m2m_request_queue,
> > > +};
> > > +
> > > +static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vb2_queue *src_vq;
> > > +
> > > +       src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
> > > +                                V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> > > +
> > > +       /* Support request api for output plane */
> > > +       src_vq->supports_requests = true;
> > > +       src_vq->requires_requests = true;
> > > +}
> > > +
> > > +static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
> > > +{
> > 
> > I have to admit I do not remember exactly the reason,
> > but this should set the buffer field to V4L2_FIELD_NONE.
> 
> Right, I see all other drivers are doing this. Done.
> 
> Cheers,
> Alex.



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
@ 2021-03-15 15:16         ` Nicolas Dufresne
  0 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-15 15:16 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Le lundi 15 mars 2021 à 20:28 +0900, Alexandre Courbot a écrit :
> Hi Ezequiel, thanks for the feedback!
> 
> On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> > Hello Alex,
> > 
> > Thanks for the patch.
> > 
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > 
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > 
> > > Support the stateless codec API that will be used by MT8183.
> > > 
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > 
> > [..]
> > 
> > > +
> > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > 
> > This "needed_in_request" is not really required, as controls
> > are not volatile, and their value is stored per-context (per-fd).
> > 
> > It's perfectly valid for an application to pass the SPS control
> > at the beginning of the sequence, and then omit it
> > in further requests.
> 
> If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> requests, this boolean only checks that the control has been provided
> at least once, and not that it is provided with every request. Without
> it we could send a frame to the firmware without e.g. setting an SPS,
> which would be a problem.

In other drivers (Cedrus and RKVDEC) this was actually checking whether the
control was part of the request. I doubt the framework has a state for "being
set once", as controls have no set/unset state. Did you write this code and test
this intended behaviour, or borrow that code from somewhere else?
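
To make the two readings concrete, here is a minimal userspace sketch of the
two validation policies being discussed. It is only a model: the types and
functions below (struct request, struct context, validate_per_request,
validate_once_per_context, apply_request) are hypothetical stand-ins and not
kernel APIs, and the sketch does not claim which policy
v4l2_ctrl_request_hdl_ctrl_find() actually implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical control identifiers; not the real V4L2 control IDs. */
enum ctrl_id { CTRL_SPS, CTRL_PPS, CTRL_DECODE_PARAMS, CTRL_COUNT };

/* Hypothetical model of a request: which controls are attached to it. */
struct request {
	bool has_ctrl[CTRL_COUNT];
};

/* Hypothetical model of a context: which controls were ever applied. */
struct context {
	bool ever_set[CTRL_COUNT];
};

/* Policy A: every required control must be attached to this request. */
static bool validate_per_request(const struct request *req,
				 const enum ctrl_id *required, size_t n)
{
	size_t i;

	assert(req != NULL);
	for (i = 0; i < n; i++)
		if (!req->has_ctrl[required[i]])
			return false;
	return true;
}

/* Policy B: a required control may come from this request or an earlier one. */
static bool validate_once_per_context(const struct context *ctx,
				      const struct request *req,
				      const enum ctrl_id *required, size_t n)
{
	size_t i;

	assert(ctx != NULL && req != NULL);
	for (i = 0; i < n; i++) {
		enum ctrl_id id = required[i];

		if (!req->has_ctrl[id] && !ctx->ever_set[id])
			return false;
	}
	return true;
}

/* Applying a request records its controls in the context. */
static void apply_request(struct context *ctx, const struct request *req)
{
	size_t i;

	assert(ctx != NULL && req != NULL);
	for (i = 0; i < CTRL_COUNT; i++)
		if (req->has_ctrl[i])
			ctx->ever_set[i] = true;
}
```

Under policy A, a second request that omits the SPS is rejected even though
the SPS was applied earlier; under policy B it is accepted. The question
above is which of the two the driver code really gets.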

> 
> > 
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > +                       .menu_skip_mask =
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +};
> > 
> > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > the driver supports. From a next patch, this case seems to be
> > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> 
> Indeed - I've added the control, thanks for catching this!
> 
> > 
> > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > +
> > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .type = MTK_FMT_DEC,
> > > +               .num_planes = 1,
> > > +       },
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > +               .type = MTK_FMT_FRAME,
> > > +               .num_planes = 2,
> > > +       },
> > > +};
> > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > +#define DEFAULT_OUT_FMT_IDX    0
> > > +#define DEFAULT_CAP_FMT_IDX    1
> > > +
> > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .stepwise = {
> > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > +               },
> > > +       },
> > > +};
> > > +
> > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > +
> > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > +                                              struct vdec_fb *fb)
> > > +{
> > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               unsigned int cap_c_size =
> > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +
> > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > +       }
> > > +}
> > > +
> > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > +{
> > > +       struct mtk_video_dec_buf *framebuf =
> > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > +
> > > +       pfb = &framebuf->frame_buffer;
> > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > 
> > Are you sure you need a CPU mapping? It seems strange.
> > I'll comment some more on the next patch(es).
> 
> I'll answer on the next patch since this is where that mapping is being used.
> 
> > 
> > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > +               pfb->base_c.dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +       }
> > > +       mtk_v4l2_debug(1,
> > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > +               dst_buf->index, pfb,
> > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > +               ctx->decoded_frame_cnt);
> > > +
> > > +       return pfb;
> > > +}
> > > +
> > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +
> > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static int fops_media_request_validate(struct media_request *mreq)
> > > +{
> > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > +       struct media_request_object *req_obj;
> > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       switch (buffer_cnt) {
> > > +       case 1:
> > > +               /* We expect exactly one buffer with the request */
> > > +               break;
> > > +       case 0:
> > > +               mtk_v4l2_err("No buffer provided with the request");
> > > +               return -ENOENT;
> > > +       default:
> > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > +                            buffer_cnt);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > +               struct vb2_buffer *vb;
> > > +
> > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +                       break;
> > > +               }
> > > +       }
> > > +
> > > +       if (!ctx) {
> > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       parent_hdl = &ctx->ctrl_hdl;
> > > +
> > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > +       if (!hdl) {
> > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > +                       continue;
> > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > +                       continue;
> > > +
> > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > +                                        mtk_stateless_controls[i].cfg.id);
> > > +               if (!ctrl) {
> > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > +                       return -ENOENT;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > +
> > > +       return vb2_request_validate(mreq);
> > > +}
> > > +
> > > +static void mtk_vdec_worker(struct work_struct *work)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx =
> > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > +       struct vb2_buffer *vb2_src;
> > > +       struct mtk_vcodec_mem *bs_src;
> > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > +       struct media_request *src_buf_req;
> > > +       struct vdec_fb *dst_buf;
> > > +       bool res_chg = false;
> > > +       int ret;
> > > +
> > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_src == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_dst == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > +                                  m2m_buf.vb);
> > > +       bs_src = &dec_buf_src->bs_buffer;
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > +                       ctx->id, src_buf->vb2_queue->type,
> > > +                       src_buf->index, src_buf, src_buf_info);
> > > +
> > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > +       if (!bs_src->va) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > +                            vb2_src->index);
> > > +               return;
> > > +       }
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > +       /* Apply request controls. */
> > > +       src_buf_req = vb2_src->req_obj.req;
> > > +       if (src_buf_req)
> > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > +       else
> > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > +
> > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > +       if (ret) {
> > > +               mtk_v4l2_err(
> > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > +                       vb2_src->timestamp, ret, res_chg);
> > > +               if (ret == -EIO) {
> > > +                       mutex_lock(&ctx->lock);
> > > +                       dec_buf_src->error = true;
> > > +                       mutex_unlock(&ctx->lock);
> > > +               }
> > > +       }
> > > +
> > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > +
> > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > +
> > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > +                       ctx->id, vb->vb2_queue->type,
> > > +                       vb->index, vb);
> > > +
> > > +       mutex_lock(&ctx->lock);
> > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > +       mutex_unlock(&ctx->lock);
> > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > +               return;
> > > +
> > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > +
> > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > +       if (ctx->state == MTK_STATE_INIT) {
> > > +               ctx->state = MTK_STATE_HEADER;
> > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > 
> > This state thing seems just something to make the rest
> > of the stateful-based driver happy, right?
> 
> Correct - if anything we should either use more of the state here
> (i.e. set the error state when relevant) or move the state entirely in
> the stateful part of the driver.
> 
> > 
> > Makes me wonder a bit if just splitting the stateless part to its
> > own driver, wouldn't make your maintenance easier.
> > 
> > What's the motivation for sharing the driver?
> 
> Technically you could do it both ways. Separating the driver would
> result in some boilerplate code and buffer-management structs
> duplication (unless we keep the shared part under another module - but
> in this case we are basically in the same situation as now). Also
> despite using different userspace-facing ABIs, MT8173 and MT8183
> follow a similar architecture and a similar firmware interface.
> Considering these similarities it seems simpler from an architectural
> point of view to have all the Mediatek codec support under the same
> driver. It also probably results in less code.
> 
> That being said, the split can probably be improved as you pointed out
> with this state variable. But the current split is not too bad IMHO,
> at least not worse than how the code was originally.
> 
> > 
> > > +       } else {
> > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > +                               ctx->id, ctx->state);
> > > +       }
> > > +}
> > > +
> > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       bool res_chg;
> > > +
> > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > +}
> > > +
> > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > +};
> > > +
> > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > +       if (ctx->ctrl_hdl.error) {
> > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > +               return ctx->ctrl_hdl.error;
> > > +       }
> > > +
> > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > +                               0, 32, 1, 1);
> > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > 
> > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > to return the DPB size. However, isn't this something userspace already
> > knows?
> 
> True, but that's also a control the driver is supposed to provide per
> the spec IIUC.
> 
> > 
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               struct v4l2_ctrl_config cfg = mtk_stateless_controls[i].cfg;
> > > +
> > > +               v4l2_ctrl_new_custom(&ctx->ctrl_hdl, &cfg, NULL);
> > > +               if (ctx->ctrl_hdl.error) {
> > > +                       mtk_v4l2_err("Adding control %d failed %d",
> > > +                                       i, ctx->ctrl_hdl.error);
> > > +                       return ctx->ctrl_hdl.error;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_handler_setup(&ctx->ctrl_hdl);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +const struct media_device_ops mtk_vcodec_media_ops = {
> > > +       .req_validate   = fops_media_request_validate,
> > > +       .req_queue      = v4l2_m2m_request_queue,
> > > +};
> > > +
> > > +static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vb2_queue *src_vq;
> > > +
> > > +       src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
> > > +                                V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
> > > +
> > > +       /* Support request api for output plane */
> > > +       src_vq->supports_requests = true;
> > > +       src_vq->requires_requests = true;
> > > +}
> > > +
> > > +static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
> > > +{
> > 
> > I have to admit I do not remember exactly the reason,
> > but this should set the buffer field to V4L2_FIELD_NONE.
> 
> Right, I see all other drivers are doing this. Done.
> 
> Cheers,
> Alex.
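
For reference, the V4L2_FIELD_NONE change agreed to above could look roughly
like the sketch below. This is an assumption, not the code from the next
revision of the patch: only the function name comes from the quoted diff, and
the structs and enum are minimal userspace stand-ins (the real definitions
live in the kernel's videobuf2 and videodev2 headers) so that the snippet
compiles on its own. The thread does not state why the field must be set, and
the sketch does not try to explain it either.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel enum; only the two values used here are modelled. */
enum v4l2_field {
	V4L2_FIELD_ANY = 0,
	V4L2_FIELD_NONE = 1,
};

/* Minimal stand-ins for the vb2 buffer structs (assumption, heavily reduced). */
struct vb2_buffer {
	unsigned int index;
};

struct vb2_v4l2_buffer {
	struct vb2_buffer vb2_buf;
	enum v4l2_field field;
};

/* Userspace equivalent of the kernel's container_of() based helper. */
static struct vb2_v4l2_buffer *to_vb2_v4l2_buffer(struct vb2_buffer *vb)
{
	return (struct vb2_v4l2_buffer *)
		((char *)vb - offsetof(struct vb2_v4l2_buffer, vb2_buf));
}

/* Force the field of every OUTPUT buffer to V4L2_FIELD_NONE. */
static int vb2ops_vdec_out_buf_validate(struct vb2_buffer *vb)
{
	struct vb2_v4l2_buffer *vbuf;

	assert(vb != NULL);
	vbuf = to_vb2_v4l2_buffer(vb);
	vbuf->field = V4L2_FIELD_NONE;

	return 0;
}
```

Whatever field value userspace passes in is overwritten, and the callback
always reports success.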



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-03-15 11:28       ` Alexandre Courbot
@ 2021-03-15 15:21         ` Nicolas Dufresne
  -1 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-15 15:21 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia, Yunfei Dong
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Mauro Carvalho Chehab,
	Hans Verkuil, linux-media, Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Le lundi 15 mars 2021 à 20:28 +0900, Alexandre Courbot a écrit :
> Hi Ezequiel,
> 
> On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> >  Hi Alex,
> > 
> > Thanks for the patch.
> > 
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > 
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > 
> > > Add support for H.264 decoding using the stateless API, as supported by
> > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > builders.
> > > 
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/Kconfig                |   1 +
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > >  5 files changed, 813 insertions(+)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > 
> > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > index fd1831e97b22..c27db5643712 100644
> > > --- a/drivers/media/platform/Kconfig
> > > +++ b/drivers/media/platform/Kconfig
> > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > >         select V4L2_MEM2MEM_DEV
> > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > +       select V4L2_H264
> > >         help
> > >           Mediatek video codec driver provides HW capability to
> > >           encode and decode in a range of video formats on MT8173
> > > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > >                 vdec/vdec_vp8_if.o \
> > >                 vdec/vdec_vp9_if.o \
> > > +               vdec/vdec_h264_req_if.o \
> > >                 mtk_vcodec_dec_drv.o \
> > >                 vdec_drv_if.o \
> > >                 vdec_vpu_if.o \
> > > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > new file mode 100644
> > > index 000000000000..2fbbfbbcfbec
> > > --- /dev/null
> > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > @@ -0,0 +1,807 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/slab.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +#include <media/v4l2-h264.h>
> > > +#include <media/videobuf2-dma-contig.h>
> > > +
> > > +#include "../vdec_drv_if.h"
> > > +#include "../mtk_vcodec_util.h"
> > > +#include "../mtk_vcodec_dec.h"
> > > +#include "../mtk_vcodec_intr.h"
> > > +#include "../vdec_vpu_if.h"
> > > +#include "../vdec_drv_base.h"
> > > +
> > > +#define NAL_NON_IDR_SLICE                      0x01
> > > +#define NAL_IDR_SLICE                          0x05
> > > +#define NAL_H264_PPS                           0x08
> > 
> > Not used?
> > 
> > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > +
> > 
> > I believe you may not need the NAL type.
> 
> True, removed this block of defines.
> 
> > 
> > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > +#define MB_UNIT_LEN                            16
> > > +
> > > +/* get used parameters for sps/pps */
> > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > +#define GET_MTK_VDEC_PARAM(param) \
> > > +       { dst_param->param = src_param->param; }
> > > +/* motion vector size (bytes) for every macro block */
> > > +#define HW_MB_STORE_SZ                         64
> > > +
> > > +#define H264_MAX_FB_NUM                                17
> > > +#define H264_MAX_MV_NUM                                32
> > > +#define HDR_PARSING_BUF_SZ                     1024
> > > +
> > > +/**
> > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > + * @y_dma_addr: Y bitstream physical address
> > > + * @c_dma_addr: CbCr bitstream physical address
> > > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > + * @field: field picture flag
> > > + */
> > > +struct mtk_h264_dpb_info {
> > > +       dma_addr_t y_dma_addr;
> > > +       dma_addr_t c_dma_addr;
> > > +       int reference_flag;
> > > +       int field;
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_sps_param  - parameters for sps
> > > + */
> > > +struct mtk_h264_sps_param {
> > > +       unsigned char chroma_format_idc;
> > > +       unsigned char bit_depth_luma_minus8;
> > > +       unsigned char bit_depth_chroma_minus8;
> > > +       unsigned char log2_max_frame_num_minus4;
> > > +       unsigned char pic_order_cnt_type;
> > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > +       unsigned char max_num_ref_frames;
> > > +       unsigned char separate_colour_plane_flag;
> > > +       unsigned short pic_width_in_mbs_minus1;
> > > +       unsigned short pic_height_in_map_units_minus1;
> > > +       unsigned int max_frame_nums;
> > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > +       unsigned char delta_pic_order_always_zero_flag;
> > > +       unsigned char frame_mbs_only_flag;
> > > +       unsigned char mb_adaptive_frame_field_flag;
> > > +       unsigned char direct_8x8_inference_flag;
> > > +       unsigned char reserved[3];
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_pps_param  - parameters for pps
> > > + */
> > > +struct mtk_h264_pps_param {
> > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > +       unsigned char weighted_bipred_idc;
> > > +       char pic_init_qp_minus26;
> > > +       char chroma_qp_index_offset;
> > > +       char second_chroma_qp_index_offset;
> > > +       unsigned char entropy_coding_mode_flag;
> > > +       unsigned char pic_order_present_flag;
> > > +       unsigned char deblocking_filter_control_present_flag;
> > > +       unsigned char constrained_intra_pred_flag;
> > > +       unsigned char weighted_pred_flag;
> > > +       unsigned char redundant_pic_cnt_present_flag;
> > > +       unsigned char transform_8x8_mode_flag;
> > > +       unsigned char scaling_matrix_present_flag;
> > > +       unsigned char reserved[2];
> > > +};
> > > +
> > > +struct slice_api_h264_scaling_matrix {
> > 
> > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > Well I guess you don't want to mix a hardware-specific
> > thing with the V4L2 API maybe.
> 
> That's the idea. Although the layout match and the ABI is now stable,
> I think this communicates better the fact that this is a firmware
> structure.
> 
> > 
> > > +       unsigned char scaling_list_4x4[6][16];
> > > +       unsigned char scaling_list_8x8[6][64];
> > > +};
> > > +
> > > +struct slice_h264_dpb_entry {
> > > +       unsigned long long reference_ts;
> > > +       unsigned short frame_num;
> > > +       unsigned short pic_num;
> > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > + */
> > > +struct slice_api_h264_decode_param {
> > > +       struct slice_h264_dpb_entry dpb[16];
> > 
> > V4L2_H264_NUM_DPB_ENTRIES?
> 
> For the same reason as above (this being a firmware structure), I
> think it is clearer to not use the kernel definitions here.
> 
> > 
> > > +       unsigned short num_slices;
> > > +       unsigned short nal_ref_idc;
> > > +       unsigned char ref_pic_list_p0[32];
> > > +       unsigned char ref_pic_list_b0[32];
> > > +       unsigned char ref_pic_list_b1[32];
> > 
> > V4L2_H264_REF_LIST_LEN?
> 
> Ditto.
> 
> > 
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > + */
> > > +struct mtk_h264_dec_slice_param {
> > > +       struct mtk_h264_sps_param                       sps;
> > > +       struct mtk_h264_pps_param                       pps;
> > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > +       struct slice_api_h264_decode_param              decode_params;
> > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > 
> > V4L2_H264_NUM_DPB_ENTRIES?
> 
> Ditto.
> 
> > 
> > > +};
> > > +
> > > +/**
> > > + * struct h264_fb - h264 decode frame buffer information
> > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > + * @poc         : picture order count of frame buffer
> > > + * @reserved    : for 8 bytes alignment
> > > + */
> > > +struct h264_fb {
> > > +       uint64_t vdec_fb_va;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       int32_t poc;
> > > +       uint32_t reserved;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_dec_info - decode information
> > > + * @dpb_sz             : decoding picture buffer size
> > > + * @resolution_changed  : resoltion change happen
> > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > + * @cap_num_planes     : number planes of capture buffer
> > > + * @bs_dma             : Input bit-stream buffer dma address
> > > + * @y_fb_dma           : Y frame buffer dma address
> > > + * @c_fb_dma           : C frame buffer dma address
> > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > + */
> > > +struct vdec_h264_dec_info {
> > > +       uint32_t dpb_sz;
> > > +       uint32_t resolution_changed;
> > > +       uint32_t realloc_mv_buf;
> > > +       uint32_t cap_num_planes;
> > > +       uint64_t bs_dma;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       uint64_t vdec_fb_va;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > + *                        between VPU and Host.
> > > + *                        The memory is allocated by VPU then mapping to
> > > Host
> > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > + *                        by VPU.
> > > + *                        AP-W/R : AP is writer/reader on this item
> > > + *                        VPU-W/R: VPU is write/reader on this item
> > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-
> > > R)
> > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W,
> > > VPU-R)
> > > + * @dec          : decode information (AP-R, VPU-W)
> > > + * @pic          : picture information (AP-R, VPU-W)
> > > + * @crop         : crop information (AP-R, VPU-W)
> > > + */
> > > +struct vdec_h264_vsi {
> > > +       uint64_t pred_buf_dma;
> > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > +       struct vdec_h264_dec_info dec;
> > > +       struct vdec_pic_info pic;
> > > +       struct v4l2_rect crop;
> > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > + * @num_nalu : how many nalus be decoded
> > > + * @ctx      : point to mtk_vcodec_ctx
> > > + * @pred_buf : HW working predication buffer
> > > + * @mv_buf   : HW working motion vector buffer
> > > + * @vpu      : VPU instance
> > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > + */
> > > +struct vdec_h264_slice_inst {
> > > +       unsigned int num_nalu;
> > > +       struct mtk_vcodec_ctx *ctx;
> > > +       struct mtk_vcodec_mem pred_buf;
> > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > +       struct vdec_vpu_inst vpu;
> > > +       struct vdec_h264_vsi vsi_ctx;
> > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > +
> > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > +};
> > > +
> > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > +                                int id)
> > > +{
> > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > +
> > > +       return ctrl->p_cur.p;
> > > +}
> > > +
> > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > +                             struct mtk_h264_dec_slice_param
> > > *slice_param)
> > > +{
> > > +       struct vb2_queue *vq;
> > > +       struct vb2_buffer *vb;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > +       u64 index;
> > > +
> > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > +
> > > +       for (index = 0; index < 16; index++) {
> > 
> > Ditto, some macro instead of 16.
> 
> Changed this to use ARRAY_SIZE() which is appropriate here.
> 
> > 
> > > +               const struct slice_h264_dpb_entry *dpb;
> > > +               int vb2_index;
> > > +
> > > +               dpb = &slice_param->decode_params.dpb[index];
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > 0;
> > > +                       continue;
> > > +               }
> > > +
> > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > +               if (vb2_index < 0) {
> > > +                       mtk_vcodec_err(inst, "Reference invalid:
> > > dpb_index(%lld) reference_ts(%lld)",
> > > +                               index, dpb->reference_ts);
> > > +                       continue;
> > > +               }
> > > +               /* 1 for short term reference, 2 for long term reference
> > > */
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > 1;
> > > +               else
> > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > 2;
> > > +
> > > +               vb = vq->bufs[vb2_index];
> > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer,
> > > vb2_buf);
> > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > +
> > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes ==
> > > 2) {
> > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > +
> > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > +}
> > > +
> > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > +
> > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > +              
> > > V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > +}
> > > +
> > > +static void
> > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > +                       const struct v4l2_ctrl_h264_scaling_matrix
> > > *src_matrix)
> > > +{
> > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > +
> > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > +}
> > > +
> > > +static void get_h264_decode_parameters(
> > > +       struct slice_api_h264_decode_param *dst_params,
> > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params-
> > > >dpb[i];
> > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > +
> > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > +               dst_entry->frame_num = src_entry->frame_num;
> > > +               dst_entry->pic_num = src_entry->pic_num;
> > > +               dst_entry->top_field_order_cnt = src_entry-
> > > >top_field_order_cnt;
> > > +               dst_entry->bottom_field_order_cnt =
> > > +                       src_entry->bottom_field_order_cnt;
> > > +               dst_entry->flags = src_entry->flags;
> > > +       }
> > > +
> > > +       // num_slices is a leftover from the old H.264 support and is
> > > ignored
> > > +       // by the firmware.
> > > +       dst_params->num_slices = 0;
> > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > +       dst_params->bottom_field_order_cnt = src_params-
> > > >bottom_field_order_cnt;
> > > +       dst_params->flags = src_params->flags;
> > > +}
> > > +
> > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > +                           const struct v4l2_h264_dpb_entry *b)
> > > +{
> > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > +}
> > > +
> > > +/*
> > > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > + *
> > > + * This function is an adaptation of the similarly-named function in
> > > + * hantro_h264.c.
> > > + */
> > > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > +{
> > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       unsigned int i, j;
> > > +
> > > +       /* Disable all entries by default, and mark the ones in use. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > +                       set_bit(i, in_use);
> > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > +       }
> > > +
> > > +       /* Try to match new DPB entries with existing ones by their POCs.
> > > */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param-
> > > >dpb[i];
> > > +
> > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > +                       continue;
> > > +
> > > +               /*
> > > +                * To cut off some comparisons, iterate only on target DPB
> > > +                * entries were already used.
> > > +                */
> > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +                       cdpb = &dpb[j];
> > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > +                               continue;
> > > +
> > > +                       *cdpb = *ndpb;
> > > +                       set_bit(j, used);
> > > +                       /* Don't reiterate on this one. */
> > > +                       clear_bit(j, in_use);
> > > +                       break;
> > > +               }
> > > +
> > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > +                       set_bit(i, new);
> > > +       }
> > > +
> > > +       /* For entries that could not be matched, use remaining free
> > > slots. */
> > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param-
> > > >dpb[i];
> > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +               /*
> > > +                * Both arrays are of the same sizes, so there is no way
> > > +                * we can end up with no space in target array, unless
> > > +                * something is buggy.
> > > +                */
> > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > +                       return;
> > > +
> > > +               cdpb = &dpb[j];
> > > +               *cdpb = *ndpb;
> > > +               set_bit(j, used);
> > > +       }
> > > +}
> > > +
> > > +/*
> > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > + */
> > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > +{
> > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > +}
> > > +
> > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > +               get_ctrl_ptr(inst->ctx,
> > > V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > +               get_ctrl_ptr(inst->ctx,
> > > V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > +       struct mtk_h264_dec_slice_param *slice_param = &inst-
> > > >h264_slice_param;
> > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > +       int i;
> > > +
> > > +       update_dpb(dec_params, inst->dpb);
> > > +
> > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix,
> > > scaling_matrix);
> > > +       get_h264_decode_parameters(&slice_param->decode_params,
> > > dec_params,
> > > +                                  inst->dpb);
> > > +       get_h264_dpb_list(inst, slice_param);
> > > +
> > > +       /* Prepare the fields for our reference lists */
> > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > +       /* Build the reference lists */
> > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > +                                      inst->dpb);
> > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist,
> > > b1_reflist);
> > > +       /* Adapt the built lists to the firmware's expectations */
> > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > +
> > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > +}
> > > +
> > > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > > +{
> > > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > +
> > > +       return HW_MB_STORE_SZ * unit_size;
> > > +}
> > > +
> > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int err = 0;
> > > +
> > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > +               return err;
> > > +       }
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > +       mem = &inst->pred_buf;
> > > +       if (mem->va)
> > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > +}
> > > +
> > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > +       struct vdec_pic_info *pic)
> > > +{
> > > +       int i;
> > > +       int err;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > +
> > > +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +               mem->size = buf_sz;
> > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > +               if (err) {
> > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > +                       return err;
> > > +               }
> > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int i;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +       }
> > > +}
> > > +
> > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > +                        struct vdec_pic_info *pic)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > +
> > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > +
> > > +       pic = &ctx->picinfo;
> > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > +               ctx->picinfo.fb_sz[1]);
> > > +
> > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx-
> > > >picinfo.buf_w) ||
> > > +                       (ctx->last_decoded_picinfo.buf_h != ctx-
> > > >picinfo.buf_h))
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) ->
> > > new(%d, %d)",
> > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > +                       ctx->last_decoded_picinfo.pic_w,
> > > +                       ctx->last_decoded_picinfo.pic_h,
> > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > +       }
> > > +}
> > > +
> > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > +       struct v4l2_rect *cr)
> > > +{
> > > +       cr->left = inst->vsi_ctx.crop.left;
> > > +       cr->top = inst->vsi_ctx.crop.top;
> > > +       cr->width = inst->vsi_ctx.crop.width;
> > > +       cr->height = inst->vsi_ctx.crop.height;
> > > +
> > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > +                        cr->left, cr->top, cr->width, cr->height);
> > > +}
> > > +
> > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > +       unsigned int *dpb_sz)
> > > +{
> > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > +}
> > > +
> > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > +       int err;
> > > +
> > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > +       if (!inst)
> > > +               return -ENOMEM;
> > > +
> > > +       inst->ctx = ctx;
> > > +
> > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > +       inst->vpu.ctx = ctx;
> > > +
> > > +       err = vpu_dec_init(&inst->vpu);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > +               goto error_free_inst;
> > > +       }
> > > +
> > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +       err = allocate_predication_buf(inst);
> > > +       if (err)
> > > +               goto error_deinit;
> > > +
> > > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > > +               sizeof(struct mtk_h264_sps_param),
> > > +               sizeof(struct mtk_h264_pps_param),
> > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > +               sizeof(struct mtk_h264_dpb_info));
> > > +
> > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > +
> > > +       ctx->drv_handle = inst;
> > > +       return 0;
> > > +
> > > +error_deinit:
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +
> > > +error_free_inst:
> > > +       kfree(inst);
> > > +       return err;
> > > +}
> > > +
> > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +       free_predication_buf(inst);
> > > +       free_mv_buf(inst);
> > > +
> > > +       kfree(inst);
> > > +}
> > > +
> > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > +{
> > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > +               return 3;
> > > +
> > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > +           data[3] == 1)
> > > +               return 4;
> > > +
> > > +       return -1;
> > > +}
> > > +
> > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > +       struct mtk_video_dec_buf *src_buf_info;
> > > +       int nal_start_idx = 0, err = 0;
> > > +       uint32_t nal_type, data[2];
> > > +       unsigned char *buf;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +
> > > +       /* bs NULL means flush decoder */
> > > +       if (bs == NULL)
> > > +               return vpu_dec_reset(vpu);
> > > +
> > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > +
> > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > +
> > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > +
> > > +       buf = (unsigned char *)bs->va;
> > 
> > I may be completely wrong, but this seems to be
> > where the CPU mapping is used.
> 
> I think you're right. :)
> 
> > 
> > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > +       if (nal_start_idx < 0)
> > > +               goto err_free_fb_out;
> > > +
> > > +       data[0] = bs->size;
> > > +       data[1] = buf[nal_start_idx];
> > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > 
> > Which seems to be used to parse the NAL type. But shouldn't
> > you expect only VCL NALUs here?
> > 
> > I.e. you only get IDR or non-IDR frames, marked with
> > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> 
> Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> the test using it below) and the driver is just as happy.
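
For reference, the byte that find_start_code() lands on is the one-byte
NAL unit header. Below is a minimal userspace sketch of how it splits
into fields. struct nal_header and parse_nal_header() are names made up
for this illustration and are not part of the patch; start_code_len()
simply mirrors find_start_code().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the one-byte H.264 NAL unit header. */
struct nal_header {
	uint8_t forbidden_zero_bit;	/* bit 7, must be 0 */
	uint8_t nal_ref_idc;		/* bits 6..5 */
	uint8_t nal_unit_type;		/* bits 4..0, what NAL_TYPE() extracts */
};

/*
 * Return the length of the Annex B start code at the beginning of data
 * (3 or 4), or -1 if there is none. Mirrors find_start_code().
 */
static int start_code_len(const uint8_t *data, size_t size)
{
	if (size > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;
	if (size > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;
	return -1;
}

/* Hypothetical helper: decode the header byte following the start code. */
static int parse_nal_header(const uint8_t *data, size_t size,
			    struct nal_header *hdr)
{
	int idx;

	assert(data != NULL && hdr != NULL);

	idx = start_code_len(data, size);
	if (idx < 0)
		return -1;

	hdr->forbidden_zero_bit = data[idx] >> 7;
	hdr->nal_ref_idc = (data[idx] >> 5) & 0x3;
	hdr->nal_unit_type = data[idx] & 0x1f;
	return 0;
}
```

For a 4-byte start code followed by 0x65, this yields nal_ref_idc = 3
and nal_unit_type = 5 (IDR slice).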
> 
> > 
> > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > +                        nal_type);
> > > +
> > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > +
> > > +       get_vdec_decode_parameters(inst);
> > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > +       if (*res_chg) {
> > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > +                       if (err)
> > > +                               goto err_free_fb_out;
> > > +               }
> > > +               *res_chg = false;
> > > +       }
> > > +
> > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > +       err = vpu_dec_start(vpu, data, 2);
> > 
> > Then it seems these two values are passed to the firmware. Maybe you
> > could test whether they can be derived without the CPU mapping.
> > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> 
> This one is a bit trickier. It seems the NAL type is passed to the
> firmware as part of the decode request, which should not be needed at
> all since the firmware can read it from the buffer itself. Just for
> fun I tried setting this parameter unconditionally to 0x1 (non-IDR
> picture), and all I get is green frames with seemingly random garbage.
> If I set it to 0x5 (IDR picture) I also get green frames with a
> different kind of garbage, and every once in a while a properly
> rendered frame (presumably when it is *really* an IDR frame).

Can't you deduce this from v4l2_ctrl_h264_slice_params.slice_type?

> 
> So, mmm, I'm afraid we cannot decode properly without this information
> and thus without the mapping, unless Yunfei can tell us of a way to
> achieve this. Yunfei, do you have any idea?
> 
> Cheers,
> Alex.
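
If the firmware really needs that byte, it might be possible to rebuild
it from the controls rather than read it from the bitstream. Here is a
rough sketch, assuming the firmware only cares about nal_ref_idc and
about IDR vs. non-IDR (this has not been verified against the MT8183
firmware). rebuild_nal_header() is a hypothetical helper, and
DECODE_PARAM_FLAG_IDR_PIC is a local stand-in whose value is assumed to
match V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC from the uAPI. */
#define DECODE_PARAM_FLAG_IDR_PIC	0x01

#define NAL_NON_IDR_SLICE		0x01
#define NAL_IDR_SLICE			0x05

/*
 * Hypothetical helper: rebuild the first byte of the slice NAL unit from
 * the nal_ref_idc and flags fields of the decode parameters, so that the
 * CPU does not have to read the bitstream buffer.
 */
static uint32_t rebuild_nal_header(uint16_t nal_ref_idc, uint32_t flags)
{
	uint32_t nal_type = (flags & DECODE_PARAM_FLAG_IDR_PIC) ?
			    NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	/* nal_ref_idc is a two-bit field in the stream. */
	assert(nal_ref_idc <= 3);

	/* forbidden_zero_bit (bit 7) stays 0. */
	return ((uint32_t)nal_ref_idc << 5) | nal_type;
}
```

data[1] could then be filled with this value instead of
buf[nal_start_idx]. Whether the firmware is happy with that would need
to be tested on hardware.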




* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-15 15:21         ` Nicolas Dufresne
  0 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-15 15:21 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia, Yunfei Dong
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Mauro Carvalho Chehab,
	Hans Verkuil, linux-media, Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Monday, 15 March 2021 at 20:28 +0900, Alexandre Courbot wrote:
> Hi Ezequiel,
> 
> On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> >  Hi Alex,
> > 
> > Thanks for the patch.
> > 
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org>
> > wrote:
> > > 
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > 
> > > Add support for H.264 decoding using the stateless API, as supported by
> > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > builders.
> > > 
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/Kconfig                |   1 +
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > >  5 files changed, 813 insertions(+)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > 
> > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > index fd1831e97b22..c27db5643712 100644
> > > --- a/drivers/media/platform/Kconfig
> > > +++ b/drivers/media/platform/Kconfig
> > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > >         select V4L2_MEM2MEM_DEV
> > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > +       select V4L2_H264
> > >         help
> > >           Mediatek video codec driver provides HW capability to
> > >           encode and decode in a range of video formats on MT8173
> > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > >                 vdec/vdec_vp8_if.o \
> > >                 vdec/vdec_vp9_if.o \
> > > +               vdec/vdec_h264_req_if.o \
> > >                 mtk_vcodec_dec_drv.o \
> > >                 vdec_drv_if.o \
> > >                 vdec_vpu_if.o \
> > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > new file mode 100644
> > > index 000000000000..2fbbfbbcfbec
> > > --- /dev/null
> > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > @@ -0,0 +1,807 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/slab.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +#include <media/v4l2-h264.h>
> > > +#include <media/videobuf2-dma-contig.h>
> > > +
> > > +#include "../vdec_drv_if.h"
> > > +#include "../mtk_vcodec_util.h"
> > > +#include "../mtk_vcodec_dec.h"
> > > +#include "../mtk_vcodec_intr.h"
> > > +#include "../vdec_vpu_if.h"
> > > +#include "../vdec_drv_base.h"
> > > +
> > > +#define NAL_NON_IDR_SLICE                      0x01
> > > +#define NAL_IDR_SLICE                          0x05
> > > +#define NAL_H264_PPS                           0x08
> > 
> > Not used?
> > 
> > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > +
> > 
> > I believe you may not need the NAL type.
> 
> True, removed this block of defines.
> 
> > 
> > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > +#define MB_UNIT_LEN                            16
> > > +
> > > +/* get used parameters for sps/pps */
> > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > +#define GET_MTK_VDEC_PARAM(param) \
> > > +       { dst_param->param = src_param->param; }
> > > +/* motion vector size (bytes) for every macro block */
> > > +#define HW_MB_STORE_SZ                         64
> > > +
> > > +#define H264_MAX_FB_NUM                                17
> > > +#define H264_MAX_MV_NUM                                32
> > > +#define HDR_PARSING_BUF_SZ                     1024
> > > +
> > > +/**
> > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > + * @y_dma_addr: Y bitstream physical address
> > > + * @c_dma_addr: CbCr bitstream physical address
> > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > + * @field: field picture flag
> > > + */
> > > +struct mtk_h264_dpb_info {
> > > +       dma_addr_t y_dma_addr;
> > > +       dma_addr_t c_dma_addr;
> > > +       int reference_flag;
> > > +       int field;
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_sps_param  - parameters for sps
> > > + */
> > > +struct mtk_h264_sps_param {
> > > +       unsigned char chroma_format_idc;
> > > +       unsigned char bit_depth_luma_minus8;
> > > +       unsigned char bit_depth_chroma_minus8;
> > > +       unsigned char log2_max_frame_num_minus4;
> > > +       unsigned char pic_order_cnt_type;
> > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > +       unsigned char max_num_ref_frames;
> > > +       unsigned char separate_colour_plane_flag;
> > > +       unsigned short pic_width_in_mbs_minus1;
> > > +       unsigned short pic_height_in_map_units_minus1;
> > > +       unsigned int max_frame_nums;
> > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > +       unsigned char delta_pic_order_always_zero_flag;
> > > +       unsigned char frame_mbs_only_flag;
> > > +       unsigned char mb_adaptive_frame_field_flag;
> > > +       unsigned char direct_8x8_inference_flag;
> > > +       unsigned char reserved[3];
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_pps_param  - parameters for pps
> > > + */
> > > +struct mtk_h264_pps_param {
> > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > +       unsigned char weighted_bipred_idc;
> > > +       char pic_init_qp_minus26;
> > > +       char chroma_qp_index_offset;
> > > +       char second_chroma_qp_index_offset;
> > > +       unsigned char entropy_coding_mode_flag;
> > > +       unsigned char pic_order_present_flag;
> > > +       unsigned char deblocking_filter_control_present_flag;
> > > +       unsigned char constrained_intra_pred_flag;
> > > +       unsigned char weighted_pred_flag;
> > > +       unsigned char redundant_pic_cnt_present_flag;
> > > +       unsigned char transform_8x8_mode_flag;
> > > +       unsigned char scaling_matrix_present_flag;
> > > +       unsigned char reserved[2];
> > > +};
> > > +
> > > +struct slice_api_h264_scaling_matrix {
> > 
> > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > Well I guess you don't want to mix a hardware-specific
> > thing with the V4L2 API maybe.
> 
> That's the idea. Although the layouts match and the ABI is now stable,
> I think this better communicates the fact that this is a firmware
> structure.
> 
> > 
> > > +       unsigned char scaling_list_4x4[6][16];
> > > +       unsigned char scaling_list_8x8[6][64];
> > > +};
> > > +
> > > +struct slice_h264_dpb_entry {
> > > +       unsigned long long reference_ts;
> > > +       unsigned short frame_num;
> > > +       unsigned short pic_num;
> > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > + */
> > > +struct slice_api_h264_decode_param {
> > > +       struct slice_h264_dpb_entry dpb[16];
> > 
> > V4L2_H264_NUM_DPB_ENTRIES?
> 
> For the same reason as above (this being a firmware structure), I
> think it is clearer to not use the kernel definitions here.
> 
> > 
> > > +       unsigned short num_slices;
> > > +       unsigned short nal_ref_idc;
> > > +       unsigned char ref_pic_list_p0[32];
> > > +       unsigned char ref_pic_list_b0[32];
> > > +       unsigned char ref_pic_list_b1[32];
> > 
> > V4L2_H264_REF_LIST_LEN?
> 
> Ditto.
> 
> > 
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > + */
> > > +struct mtk_h264_dec_slice_param {
> > > +       struct mtk_h264_sps_param                       sps;
> > > +       struct mtk_h264_pps_param                       pps;
> > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > +       struct slice_api_h264_decode_param              decode_params;
> > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > 
> > V4L2_H264_NUM_DPB_ENTRIES?
> 
> Ditto.
> 
> > 
> > > +};
> > > +
> > > +/**
> > > + * struct h264_fb - h264 decode frame buffer information
> > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > + * @poc         : picture order count of frame buffer
> > > + * @reserved    : for 8 bytes alignment
> > > + */
> > > +struct h264_fb {
> > > +       uint64_t vdec_fb_va;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       int32_t poc;
> > > +       uint32_t reserved;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_dec_info - decode information
> > > + * @dpb_sz             : decoding picture buffer size
> > > + * @resolution_changed  : resolution change happened
> > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > + * @cap_num_planes     : number planes of capture buffer
> > > + * @bs_dma             : Input bit-stream buffer dma address
> > > + * @y_fb_dma           : Y frame buffer dma address
> > > + * @c_fb_dma           : C frame buffer dma address
> > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > + */
> > > +struct vdec_h264_dec_info {
> > > +       uint32_t dpb_sz;
> > > +       uint32_t resolution_changed;
> > > +       uint32_t realloc_mv_buf;
> > > +       uint32_t cap_num_planes;
> > > +       uint64_t bs_dma;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       uint64_t vdec_fb_va;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > + *                        between VPU and Host.
> > > + *                        The memory is allocated by VPU then mapped to Host
> > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > + *                        by VPU.
> > > + *                        AP-W/R : AP is writer/reader on this item
> > > + *                        VPU-W/R: VPU is writer/reader on this item
> > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > > + * @dec          : decode information (AP-R, VPU-W)
> > > + * @pic          : picture information (AP-R, VPU-W)
> > > + * @crop         : crop information (AP-R, VPU-W)
> > > + */
> > > +struct vdec_h264_vsi {
> > > +       uint64_t pred_buf_dma;
> > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > +       struct vdec_h264_dec_info dec;
> > > +       struct vdec_pic_info pic;
> > > +       struct v4l2_rect crop;
> > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > + * @num_nalu : number of NALUs decoded so far
> > > + * @ctx      : pointer to mtk_vcodec_ctx
> > > + * @pred_buf : HW working predication buffer
> > > + * @mv_buf   : HW working motion vector buffer
> > > + * @vpu      : VPU instance
> > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > + */
> > > +struct vdec_h264_slice_inst {
> > > +       unsigned int num_nalu;
> > > +       struct mtk_vcodec_ctx *ctx;
> > > +       struct mtk_vcodec_mem pred_buf;
> > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > +       struct vdec_vpu_inst vpu;
> > > +       struct vdec_h264_vsi vsi_ctx;
> > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > +
> > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > +};
> > > +
> > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > +                                int id)
> > > +{
> > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > +
> > > +       return ctrl->p_cur.p;
> > > +}
> > > +
> > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > +                             struct mtk_h264_dec_slice_param *slice_param)
> > > +{
> > > +       struct vb2_queue *vq;
> > > +       struct vb2_buffer *vb;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > +       u64 index;
> > > +
> > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > +
> > > +       for (index = 0; index < 16; index++) {
> > 
> > Ditto, some macro instead of 16.
> 
> Changed this to use ARRAY_SIZE() which is appropriate here.
> 
> > 
> > > +               const struct slice_h264_dpb_entry *dpb;
> > > +               int vb2_index;
> > > +
> > > +               dpb = &slice_param->decode_params.dpb[index];
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > > +                       continue;
> > > +               }
> > > +
> > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > +               if (vb2_index < 0) {
> > > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > > +                               index, dpb->reference_ts);
> > > +                       continue;
> > > +               }
> > > +               /* 1 for short term reference, 2 for long term reference */
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > > +               else
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > > +
> > > +               vb = vq->bufs[vb2_index];
> > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > +
> > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > +
> > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > +}
> > > +
> > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > +
> > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > +}
> > > +
> > > +static void
> > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > > +{
> > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > +
> > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > +}
> > > +
> > > +static void get_h264_decode_parameters(
> > > +       struct slice_api_h264_decode_param *dst_params,
> > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > +
> > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > +               dst_entry->frame_num = src_entry->frame_num;
> > > +               dst_entry->pic_num = src_entry->pic_num;
> > > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > > +               dst_entry->bottom_field_order_cnt =
> > > +                       src_entry->bottom_field_order_cnt;
> > > +               dst_entry->flags = src_entry->flags;
> > > +       }
> > > +
> > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > +       // by the firmware.
> > > +       dst_params->num_slices = 0;
> > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > +       dst_params->flags = src_params->flags;
> > > +}
> > > +
> > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > +                           const struct v4l2_h264_dpb_entry *b)
> > > +{
> > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > +}
> > > +
> > > +/*
> > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > + *
> > > + * This function is an adaptation of the similarly-named function in
> > > + * hantro_h264.c.
> > > + */
> > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > +{
> > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       unsigned int i, j;
> > > +
> > > +       /* Disable all entries by default, and mark the ones in use. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > +                       set_bit(i, in_use);
> > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > +       }
> > > +
> > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +
> > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > +                       continue;
> > > +
> > > +               /*
> > > +                * To cut off some comparisons, iterate only on target DPB
> > > +                * entries that were already in use.
> > > +                */
> > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +                       cdpb = &dpb[j];
> > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > +                               continue;
> > > +
> > > +                       *cdpb = *ndpb;
> > > +                       set_bit(j, used);
> > > +                       /* Don't reiterate on this one. */
> > > +                       clear_bit(j, in_use);
> > > +                       break;
> > > +               }
> > > +
> > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > +                       set_bit(i, new);
> > > +       }
> > > +
> > > +       /* For entries that could not be matched, use remaining free slots. */
> > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +               /*
> > > +                * Both arrays are of the same sizes, so there is no way
> > > +                * we can end up with no space in target array, unless
> > > +                * something is buggy.
> > > +                */
> > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > +                       return;
> > > +
> > > +               cdpb = &dpb[j];
> > > +               *cdpb = *ndpb;
> > > +               set_bit(j, used);
> > > +       }
> > > +}
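
To illustrate what update_dpb() is meant to achieve (a picture keeps
the same slot for as long as it stays in the DPB, whatever order
userspace lists it in), here is a reduced userspace model. struct entry
and merge_dpb() are illustrative names, and plain bool arrays stand in
for the kernel bitmap helpers; this is a sketch of the idea, not a
replacement for the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 16

/* Reduced stand-in for struct v4l2_h264_dpb_entry (illustration only). */
struct entry {
	int top_poc;
	int bottom_poc;
	bool active;
};

static bool entry_match(const struct entry *a, const struct entry *b)
{
	return a->top_poc == b->top_poc && a->bottom_poc == b->bottom_poc;
}

/*
 * Merge the DPB sent by userspace (incoming) into the driver's copy (dpb):
 * pictures already present keep their slot, the others take free slots.
 */
static void merge_dpb(const struct entry *incoming, struct entry *dpb)
{
	bool was_active[NUM_SLOTS], used[NUM_SLOTS] = { false };
	bool placed[NUM_SLOTS] = { false };
	size_t i, j;

	/* Deactivate everything, remembering which slots held a picture. */
	for (i = 0; i < NUM_SLOTS; i++) {
		was_active[i] = dpb[i].active;
		dpb[i].active = false;
	}

	/* Pass 1: pictures that are already known keep their slot. */
	for (i = 0; i < NUM_SLOTS; i++) {
		if (!incoming[i].active)
			continue;
		for (j = 0; j < NUM_SLOTS; j++) {
			if (!was_active[j] || used[j] ||
			    !entry_match(&dpb[j], &incoming[i]))
				continue;
			dpb[j] = incoming[i];
			used[j] = placed[i] = true;
			break;
		}
	}

	/* Pass 2: new pictures take the first slot not claimed so far. */
	for (i = 0; i < NUM_SLOTS; i++) {
		if (!incoming[i].active || placed[i])
			continue;
		for (j = 0; j < NUM_SLOTS && used[j]; j++)
			;
		/* Both arrays have NUM_SLOTS entries, so a slot is free. */
		assert(j < NUM_SLOTS);
		dpb[j] = incoming[i];
		used[j] = true;
	}
}
```

The two passes matter: assigning free slots while still matching could
hand a new picture the slot of an existing picture that appears later
in the incoming list.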
> > > +
> > > +/*
> > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > + */
> > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > +{
> > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > +}
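
A small standalone sketch of the padding done by fixup_ref_list(), for
illustration. pad_ref_list(), REF_LIST_LEN and REF_UNUSED are names
invented here. The assert is an addition of this sketch: in the patch,
32 - num_valid is computed in size_t and would wrap around if num_valid
ever exceeded 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REF_LIST_LEN	32	/* size of ref_pic_list_* in the patch */
#define REF_UNUSED	0x20	/* filler value expected by the firmware */

/* Same idea as fixup_ref_list(), with a guard on num_valid. */
static void pad_ref_list(uint8_t *ref_list, size_t num_valid)
{
	assert(num_valid <= REF_LIST_LEN);
	memset(&ref_list[num_valid], REF_UNUSED, REF_LIST_LEN - num_valid);
}
```

With two valid entries, bytes 0 and 1 are left alone and bytes 2 to 31
are set to 0x20.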
> > > +
> > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > +       int i;
> > > +
> > > +       update_dpb(dec_params, inst->dpb);
> > > +
> > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > +                                  inst->dpb);
> > > +       get_h264_dpb_list(inst, slice_param);
> > > +
> > > +       /* Prepare the fields for our reference lists */
> > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > +       /* Build the reference lists */
> > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > +                                      inst->dpb);
> > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > +       /* Adapt the built lists to the firmware's expectations */
> > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > +
> > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > +}
> > > +
> > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > +{
> > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > +
> > > +       return HW_MB_STORE_SZ * unit_size;
> > > +}
> > > +
> > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int err = 0;
> > > +
> > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > +               return err;
> > > +       }
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > +       mem = &inst->pred_buf;
> > > +       if (mem->va)
> > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > +}
> > > +
> > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > +       struct vdec_pic_info *pic)
> > > +{
> > > +       int i;
> > > +       int err;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > +
> > > +       mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +               mem->size = buf_sz;
> > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > +               if (err) {
> > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > +                       return err;
> > > +               }
> > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int i;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +       }
> > > +}
> > > +
> > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > +                        struct vdec_pic_info *pic)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > +
> > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > +
> > > +       pic = &ctx->picinfo;
> > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > +               ctx->picinfo.fb_sz[1]);
> > > +
> > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > +                       ctx->last_decoded_picinfo.pic_w,
> > > +                       ctx->last_decoded_picinfo.pic_h,
> > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > +       }
> > > +}
> > > +
> > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > +       struct v4l2_rect *cr)
> > > +{
> > > +       cr->left = inst->vsi_ctx.crop.left;
> > > +       cr->top = inst->vsi_ctx.crop.top;
> > > +       cr->width = inst->vsi_ctx.crop.width;
> > > +       cr->height = inst->vsi_ctx.crop.height;
> > > +
> > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > +                        cr->left, cr->top, cr->width, cr->height);
> > > +}
> > > +
> > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > +       unsigned int *dpb_sz)
> > > +{
> > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > +}
> > > +
> > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > +       int err;
> > > +
> > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > +       if (!inst)
> > > +               return -ENOMEM;
> > > +
> > > +       inst->ctx = ctx;
> > > +
> > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > +       inst->vpu.ctx = ctx;
> > > +
> > > +       err = vpu_dec_init(&inst->vpu);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > +               goto error_free_inst;
> > > +       }
> > > +
> > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +       err = allocate_predication_buf(inst);
> > > +       if (err)
> > > +               goto error_deinit;
> > > +
> > > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > > +               sizeof(struct mtk_h264_sps_param),
> > > +               sizeof(struct mtk_h264_pps_param),
> > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > +               sizeof(struct mtk_h264_dpb_info));
> > > +
> > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > +
> > > +       ctx->drv_handle = inst;
> > > +       return 0;
> > > +
> > > +error_deinit:
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +
> > > +error_free_inst:
> > > +       kfree(inst);
> > > +       return err;
> > > +}
> > > +
> > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +       free_predication_buf(inst);
> > > +       free_mv_buf(inst);
> > > +
> > > +       kfree(inst);
> > > +}
> > > +
> > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > +{
> > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > +               return 3;
> > > +
> > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > +           data[3] == 1)
> > > +               return 4;
> > > +
> > > +       return -1;
> > > +}
> > > +
> > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > +       struct mtk_video_dec_buf *src_buf_info;
> > > +       int nal_start_idx = 0, err = 0;
> > > +       uint32_t nal_type, data[2];
> > > +       unsigned char *buf;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +
> > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > +
> > > +       /* bs NULL means flush decoder */
> > > +       if (bs == NULL)
> > > +               return vpu_dec_reset(vpu);
> > > +
> > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > +
> > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > +
> > > +       buf = (unsigned char *)bs->va;
> > 
> > I could be completely wrong, but this seems to be
> > where the CPU mapping is used.
> 
> I think you're right. :)
> 
> > 
> > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > +       if (nal_start_idx < 0)
> > > +               goto err_free_fb_out;
> > > +
> > > +       data[0] = bs->size;
> > > +       data[1] = buf[nal_start_idx];
> > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > 
> > Which seems to be used to parse the NAL type. But shouldn't
> > you expect only VCL NALUs here?
> > 
> > I.e. you only get IDR or non-IDR frames, marked with
> > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> 
> Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> the test using it below) and the driver is just as happy.
> 
> > 
> > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > +                        nal_type);
> > > +
> > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > +
> > > +       get_vdec_decode_parameters(inst);
> > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > +       if (*res_chg) {
> > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > +                       if (err)
> > > +                               goto err_free_fb_out;
> > > +               }
> > > +               *res_chg = false;
> > > +       }
> > > +
> > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > +       err = vpu_dec_start(vpu, data, 2);
> > 
> > Then it seems these two values are passed to the firmware. Maybe you
> > could test if that can be derived without the CPU mapping.
> > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> 
> This one is a bit trickier. It seems the NAL type is passed as part of
> the decode request to the firmware, which should absolutely not be
> needed since the firmware can check this from the buffer itself. Just
> for fun I have tried setting this parameter unconditionally to 0x1
> (non-IDR picture) and all I get is green frames with seemingly random
> garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> a different kind of garbage, and every once in a while a properly
> rendered frame (presumably when it is *really* an IDR frame).

Can't you deduce this from v4l2_ctrl_h264_slice_params.slice_type?
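
For illustration, here is a rough sketch of rebuilding the NAL header byte
from control data instead of reading it through the CPU mapping. It assumes
the IDR flag and nal_ref_idc from the decode params are all the firmware
really needs, which is untested. Every name below is a stand-in made up for
this example, not a driver or uAPI symbol.

```c
#include <stdbool.h>
#include <stdint.h>

/* H.264 nal_unit_type values for coded slices (non-IDR = 1, IDR = 5). */
#define EX_NAL_NON_IDR_SLICE 1u
#define EX_NAL_IDR_SLICE     5u

/*
 * Hypothetical helper: build the one-byte NAL unit header.
 * Layout: forbidden_zero_bit (bit 7, always 0), nal_ref_idc (bits 6..5),
 * nal_unit_type (bits 4..0).
 */
static inline uint8_t ex_nal_header(bool is_idr, unsigned int nal_ref_idc)
{
	uint8_t type = is_idr ? EX_NAL_IDR_SLICE : EX_NAL_NON_IDR_SLICE;

	return (uint8_t)(((nal_ref_idc & 0x3u) << 5) | type);
}
```

In the driver, is_idr would presumably come from
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC, and the result would take the place of
buf[nal_start_idx] in data[1].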

> 
> So, mmm, I'm afraid we cannot decode properly without this information
> and thus without the mapping, unless Yunfei can tell us of a way to
> achieve this. Yunfei, do you have any idea?
> 
> Cheers,
> Alex.




^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-03-15 11:28       ` Alexandre Courbot
@ 2021-03-15 21:45         ` Ezequiel Garcia
  -1 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-15 21:45 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alexandre,

On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Ezequiel, thanks for the feedback!
>
> On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> > Hello Alex,
> >
> > Thanks for the patch.
> >
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > >
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > >
> > > Support the stateless codec API that will be used by MT8183.
> > >
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > >
> > [..]
> >
> > > +
> > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> >
> > This "needed_in_request" is not really required, as controls
> > are not volatile, and their value is stored per-context (per-fd).
> >
> > It's perfectly valid for an application to pass the SPS control
> > at the beginning of the sequence, and then omit it
> > in further requests.
>
> If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> requests, this boolean only checks that the control has been provided
> at least once, and not that it is provided with every request. Without
> it we could send a frame to the firmware without e.g. setting an SPS,
> which would be a problem.
>

As Nicolas points out, V4L2 controls always have an initial value,
so no control can be unset.

> >
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > +                       .menu_skip_mask =
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +};
> >
> > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > the driver supports. From a next patch, this case seems to be
> > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
>
> Indeed - I've added the control, thanks for catching this!
>
> >
> > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > +
> > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .type = MTK_FMT_DEC,
> > > +               .num_planes = 1,
> > > +       },
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > +               .type = MTK_FMT_FRAME,
> > > +               .num_planes = 2,
> > > +       },
> > > +};
> > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > +#define DEFAULT_OUT_FMT_IDX    0
> > > +#define DEFAULT_CAP_FMT_IDX    1
> > > +
> > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .stepwise = {
> > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > +               },
> > > +       },
> > > +};
> > > +
> > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > +
> > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > +                                              struct vdec_fb *fb)
> > > +{
> > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               unsigned int cap_c_size =
> > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +
> > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > +       }
> > > +}
> > > +
> > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > +{
> > > +       struct mtk_video_dec_buf *framebuf =
> > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > +
> > > +       pfb = &framebuf->frame_buffer;
> > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> >
> > Are you sure you need a CPU mapping? It seems strange.
> > I'll comment some more on the next patch(es).
>
> I'll answer on the next patch since this is where that mapping is being used.
>
> >
> > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > +               pfb->base_c.dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +       }
> > > +       mtk_v4l2_debug(1,
> > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > +               dst_buf->index, pfb,
> > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > +               ctx->decoded_frame_cnt);
> > > +
> > > +       return pfb;
> > > +}
> > > +
> > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +
> > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static int fops_media_request_validate(struct media_request *mreq)
> > > +{
> > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > +       struct media_request_object *req_obj;
> > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       switch (buffer_cnt) {
> > > +       case 1:
> > > +               /* We expect exactly one buffer with the request */
> > > +               break;
> > > +       case 0:
> > > +               mtk_v4l2_err("No buffer provided with the request");
> > > +               return -ENOENT;
> > > +       default:
> > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > +                            buffer_cnt);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > +               struct vb2_buffer *vb;
> > > +
> > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +                       break;
> > > +               }
> > > +       }
> > > +
> > > +       if (!ctx) {
> > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       parent_hdl = &ctx->ctrl_hdl;
> > > +
> > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > +       if (!hdl) {
> > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > +                       continue;
> > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > +                       continue;
> > > +
> > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > +                                         mtk_stateless_controls[i].cfg.id);
> > > +               if (!ctrl) {
> > > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > > +                       v4l2_ctrl_request_hdl_put(hdl);
> > > > +                       return -ENOENT;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > +
> > > +       return vb2_request_validate(mreq);
> > > +}
> > > +
> > > +static void mtk_vdec_worker(struct work_struct *work)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx =
> > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > +       struct vb2_buffer *vb2_src;
> > > +       struct mtk_vcodec_mem *bs_src;
> > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > +       struct media_request *src_buf_req;
> > > +       struct vdec_fb *dst_buf;
> > > +       bool res_chg = false;
> > > +       int ret;
> > > +
> > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_src == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_dst == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > +                                  m2m_buf.vb);
> > > +       bs_src = &dec_buf_src->bs_buffer;
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > > +                       ctx->id, vb2_src->vb2_queue->type,
> > > > +                       vb2_src->index, vb2_src, dec_buf_src);
> > > +
> > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > +       if (!bs_src->va) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > +                            vb2_src->index);
> > > +               return;
> > > +       }
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > > +                       ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size, vb2_src);
> > > +       /* Apply request controls. */
> > > +       src_buf_req = vb2_src->req_obj.req;
> > > +       if (src_buf_req)
> > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > +       else
> > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > +
> > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > +       if (ret) {
> > > +               mtk_v4l2_err(
> > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > +                       vb2_src->timestamp, ret, res_chg);
> > > +               if (ret == -EIO) {
> > > +                       mutex_lock(&ctx->lock);
> > > +                       dec_buf_src->error = true;
> > > +                       mutex_unlock(&ctx->lock);
> > > +               }
> > > +       }
> > > +
> > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > +
> > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > +
> > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > +                       ctx->id, vb->vb2_queue->type,
> > > +                       vb->index, vb);
> > > +
> > > +       mutex_lock(&ctx->lock);
> > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > +       mutex_unlock(&ctx->lock);
> > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > +               return;
> > > +
> > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > > +               vb->vb2_queue->type, vb->index, vb);
> > > +
> > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > +       if (ctx->state == MTK_STATE_INIT) {
> > > +               ctx->state = MTK_STATE_HEADER;
> > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> >
> > This state thing seems just something to make the rest
> > of the stateful-based driver happy, right?
>
> Correct - if anything we should either use more of the state here
> (i.e. set the error state when relevant) or move the state entirely into
> the stateful part of the driver.
>
> >
> > Makes me wonder a bit if just splitting the stateless part to its
> > own driver, wouldn't make your maintenance easier.
> >
> > What's the motivation for sharing the driver?
>
> Technically you could do it both ways. Separating the driver would
> result in some boilerplate code and buffer-management structs
> duplication (unless we keep the shared part under another module - but
> in this case we are basically in the same situation as now). Also
> despite using different userspace-facing ABIs, MT8173 and MT8183
> follow a similar architecture and a similar firmware interface.
> Considering these similarities it seems simpler from an architectural
> point of view to have all the Mediatek codec support under the same
> driver. It also probably results in less code.
>
> That being said, the split can probably be improved as you pointed out
> with this state variable. But the current split is not too bad IMHO,
> at least not worse than how the code was originally.
>
> >
> > > +       } else {
> > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > +                               ctx->id, ctx->state);
> > > +       }
> > > +}
> > > +
> > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       bool res_chg;
> > > +
> > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > +}
> > > +
> > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > +};
> > > +
> > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > +       if (ctx->ctrl_hdl.error) {
> > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > +               return ctx->ctrl_hdl.error;
> > > +       }
> > > +
> > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > +                               0, 32, 1, 1);
> > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> >
> > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > to return the DPB size. However, isn't this something userspace already knows?
>
> True, but that's also a control the driver is supposed to provide per
> the spec IIUC.
>

I don't see the specification requiring this control. TBH, I'd just drop it
and if needed fix the application to support this as an optional
control.

In any case, stateless devices should just need 1 output and 1 capture buffer.

You might dislike this redundancy, but note that you can also get the minimum
number of required buffers through VIDIOC_REQBUFS: the
v4l2_requestbuffers.count field is returned to userspace with the
number of allocated buffers.

If you request just 1 buffer and your driver needs 3, you should
get a 3 there (vb2_ops.queue_setup takes care of that).
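
To illustrate, here is a simplified model of that count adjustment. It is
only a sketch: ex_adjust_count() and its parameters are made up for this
example and do not mirror the actual vb2 code paths.

```c
/*
 * Hypothetical model of the REQBUFS count adjustment: userspace asks for
 * `requested` buffers, the driver needs at least `min_needed`, and the
 * returned value is what userspace would read back in the count field.
 */
static inline unsigned int ex_adjust_count(unsigned int requested,
					   unsigned int min_needed)
{
	return requested < min_needed ? min_needed : requested;
}
```

So an application that asks for 1 buffer learns the driver's minimum from
the value it reads back, without needing a separate control.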

Thanks,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
@ 2021-03-15 21:45         ` Ezequiel Garcia
  0 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-15 21:45 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alexandre,

On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Ezequiel, thanks for the feedback!
>
> On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> > Hello Alex,
> >
> > Thanks for the patch.
> >
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > >
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > >
> > > Support the stateless codec API that will be used by MT8183.
> > >
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > >
> > [..]
> >
> > > +
> > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> >
> > This "needed_in_request" is not really required, as controls
> > are not volatile, and their value is stored per-context (per-fd).
> >
> > It's perfectly valid for an application to pass the SPS control
> > at the beginning of the sequence, and then omit it
> > in further requests.
>
> If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> requests, this boolean only checks that the control has been provided
> at least once, and not that it is provided with every request. Without
> it we could send a frame to the firmware without e.g. setting an SPS,
> which would be a problem.
>

As Nicolas points out, V4L2 controls always have an initial value,
so no control can be unset.
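
To illustrate the point, here is a toy model of per-context control
state. It is not the V4L2 control framework, and every name in it is made
up. A control is created already holding its default value, and applying
a request only overwrites the controls that the request carries, so a
later request can omit the SPS and still decode against the last value
set.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these types are hypothetical, not kernel structures. */
struct toy_ctrl {
	int id;
	int cur;	/* current value, kept per context */
};

struct toy_request_item {
	int id;
	int val;
};

/* A control starts out holding its default, so it always has a value. */
static void toy_ctrl_init(struct toy_ctrl *c, int id, int def)
{
	c->id = id;
	c->cur = def;
}

/* Apply only the controls present in the request; others keep their value. */
static void toy_request_apply(struct toy_ctrl *ctrls, size_t n_ctrls,
			      const struct toy_request_item *items,
			      size_t n_items)
{
	size_t i, j;

	for (i = 0; i < n_items; i++)
		for (j = 0; j < n_ctrls; j++)
			if (ctrls[j].id == items[i].id)
				ctrls[j].cur = items[i].val;
}
```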

> >
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +               .needed_in_request = true,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > +                       .menu_skip_mask =
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +       {
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > +               },
> > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > +       },
> > > +};
> >
> > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > the driver supports. From a next patch, this case seems to be
> > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
>
> Indeed - I've added the control, thanks for catching this!
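
For context, V4L2_STATELESS_H264_START_CODE_ANNEX_B means the slice data
queued on the OUTPUT buffer is expected to begin with an Annex B start
code, which is either 00 00 01 or 00 00 00 01. A minimal sketch of how an
application might check this before queueing follows; the function name
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of the Annex B start code at the beginning of buf
 * (3 for 00 00 01, 4 for 00 00 00 01), or 0 if there is none.
 */
static size_t annexb_start_code_len(const uint8_t *buf, size_t len)
{
	if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
		return 3;
	if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 &&
	    buf[3] == 1)
		return 4;
	return 0;
}
```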
>
> >
> > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > +
> > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .type = MTK_FMT_DEC,
> > > +               .num_planes = 1,
> > > +       },
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > +               .type = MTK_FMT_FRAME,
> > > +               .num_planes = 2,
> > > +       },
> > > +};
> > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > +#define DEFAULT_OUT_FMT_IDX    0
> > > +#define DEFAULT_CAP_FMT_IDX    1
> > > +
> > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > +       {
> > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > +               .stepwise = {
> > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > +               },
> > > +       },
> > > +};
> > > +
> > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > +
> > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > +                                              struct vdec_fb *fb)
> > > +{
> > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               unsigned int cap_c_size =
> > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +
> > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > +       }
> > > +}
> > > +
> > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > +{
> > > +       struct mtk_video_dec_buf *framebuf =
> > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > +
> > > +       pfb = &framebuf->frame_buffer;
> > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> >
> > Are you sure you need a CPU mapping? It seems strange.
> > I'll comment some more on the next patch(es).
>
> I'll answer on the next patch since this is where that mapping is being used.
>
> >
> > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > +
> > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > +               pfb->base_c.dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > +       }
> > > +       mtk_v4l2_debug(1,
> > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > +               dst_buf->index, pfb,
> > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > +               ctx->decoded_frame_cnt);
> > > +
> > > +       return pfb;
> > > +}
> > > +
> > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +
> > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static int fops_media_request_validate(struct media_request *mreq)
> > > +{
> > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > +       struct media_request_object *req_obj;
> > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       switch (buffer_cnt) {
> > > +       case 1:
> > > +               /* We expect exactly one buffer with the request */
> > > +               break;
> > > +       case 0:
> > > +               mtk_v4l2_err("No buffer provided with the request");
> > > +               return -ENOENT;
> > > +       default:
> > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > +                            buffer_cnt);
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > +               struct vb2_buffer *vb;
> > > +
> > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +                       break;
> > > +               }
> > > +       }
> > > +
> > > +       if (!ctx) {
> > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       parent_hdl = &ctx->ctrl_hdl;
> > > +
> > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > +       if (!hdl) {
> > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > +               return -ENOENT;
> > > +       }
> > > +
> > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > +                       continue;
> > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > +                       continue;
> > > +
> > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > +                                         mtk_stateless_controls[i].cfg.id);
> > > +               if (!ctrl) {
> > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > +                       return -ENOENT;
> > > +               }
> > > +       }
> > > +
> > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > +
> > > +       return vb2_request_validate(mreq);
> > > +}
> > > +
> > > +static void mtk_vdec_worker(struct work_struct *work)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx =
> > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > +       struct vb2_buffer *vb2_src;
> > > +       struct mtk_vcodec_mem *bs_src;
> > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > +       struct media_request *src_buf_req;
> > > +       struct vdec_fb *dst_buf;
> > > +       bool res_chg = false;
> > > +       int ret;
> > > +
> > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_src == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > +       if (vb2_v4l2_dst == NULL) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > +               return;
> > > +       }
> > > +
> > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > +                                  m2m_buf.vb);
> > > +       bs_src = &dec_buf_src->bs_buffer;
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > +                       ctx->id, src_buf->vb2_queue->type,
> > > +                       src_buf->index, src_buf, src_buf_info);
> > > +
> > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > +       if (!bs_src->va) {
> > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > +                            vb2_src->index);
> > > +               return;
> > > +       }
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > +       /* Apply request controls. */
> > > +       src_buf_req = vb2_src->req_obj.req;
> > > +       if (src_buf_req)
> > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > +       else
> > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > +
> > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > +       if (ret) {
> > > +               mtk_v4l2_err(
> > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > +                       vb2_src->timestamp, ret, res_chg);
> > > +               if (ret == -EIO) {
> > > +                       mutex_lock(&ctx->lock);
> > > +                       dec_buf_src->error = true;
> > > +                       mutex_unlock(&ctx->lock);
> > > +               }
> > > +       }
> > > +
> > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > +
> > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > +
> > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > +}
> > > +
> > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > +
> > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > +                       ctx->id, vb->vb2_queue->type,
> > > +                       vb->index, vb);
> > > +
> > > +       mutex_lock(&ctx->lock);
> > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > +       mutex_unlock(&ctx->lock);
> > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > +               return;
> > > +
> > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > +
> > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > +       if (ctx->state == MTK_STATE_INIT) {
> > > +               ctx->state = MTK_STATE_HEADER;
> > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> >
> > This state thing seems just something to make the rest
> > of the stateful-based driver happy, right?
>
> Correct - if anything we should either use more of the state here
> (i.e. set the error state when relevant) or move the state entirely into
> the stateful part of the driver.
>
> >
> > Makes me wonder whether splitting the stateless part into its
> > own driver wouldn't make your maintenance easier.
> >
> > What's the motivation for sharing the driver?
>
> Technically you could do it both ways. Separating the driver would
> result in some boilerplate code and buffer-management structs
> duplication (unless we keep the shared part under another module - but
> in this case we are basically in the same situation as now). Also
> despite using different userspace-facing ABIs, MT8173 and MT8183
> follow a similar architecture and a similar firmware interface.
> Considering these similarities it seems simpler from an architectural
> point of view to have all the Mediatek codec support under the same
> driver. It also probably results in less code.
>
> That being said, the split can probably be improved as you pointed out
> with this state variable. But the current split is not too bad IMHO,
> at least not worse than how the code was originally.
>
> >
> > > +       } else {
> > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > +                               ctx->id, ctx->state);
> > > +       }
> > > +}
> > > +
> > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       bool res_chg;
> > > +
> > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > +}
> > > +
> > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > +};
> > > +
> > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct v4l2_ctrl *ctrl;
> > > +       unsigned int i;
> > > +
> > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > +       if (ctx->ctrl_hdl.error) {
> > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > +               return ctx->ctrl_hdl.error;
> > > +       }
> > > +
> > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > +                               0, 32, 1, 1);
> > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> >
> > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > to return the DPB size. However, isn't this something userspace already knows?
>
> True, but that's also a control the driver is supposed to provide per
> the spec IIUC.
>

I don't see the specification requiring this control. TBH, I'd just drop it
and if needed fix the application to support this as an optional
control.

In any case, stateless devices should just need 1 output and 1 capture buffer.

You might dislike this redundancy, but note that you can also get the minimum
number of required buffers through VIDIOC_REQBUFS, where the
v4l2_requestbuffers.count field is returned to userspace with the
number of allocated buffers.

If you request just 1 buffer, and your driver needed 3, you should
get a 3 there (vb2_ops.queue_setup takes care of that).

Thanks,
Ezequiel



* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-03-15 11:28       ` Alexandre Courbot
@ 2021-03-15 22:08         ` Ezequiel Garcia
  -1 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-15 22:08 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Yunfei Dong, Tiffany Lin, Andrew-CT Chen, Rob Herring,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alex,

On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Ezequiel,
>
> On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> >  Hi Alex,
> >
> > Thanks for the patch.
> >
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > >
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > >
> > > Add support for H.264 decoding using the stateless API, as supported by
> > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > builders.
> > >
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/Kconfig                |   1 +
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > >  5 files changed, 813 insertions(+)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > >
> > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > index fd1831e97b22..c27db5643712 100644
> > > --- a/drivers/media/platform/Kconfig
> > > +++ b/drivers/media/platform/Kconfig
> > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > >         select V4L2_MEM2MEM_DEV
> > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > +       select V4L2_H264
> > >         help
> > >           Mediatek video codec driver provides HW capability to
> > >           encode and decode in a range of video formats on MT8173
> > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > >                 vdec/vdec_vp8_if.o \
> > >                 vdec/vdec_vp9_if.o \
> > > +               vdec/vdec_h264_req_if.o \
> > >                 mtk_vcodec_dec_drv.o \
> > >                 vdec_drv_if.o \
> > >                 vdec_vpu_if.o \
> > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > new file mode 100644
> > > index 000000000000..2fbbfbbcfbec
> > > --- /dev/null
> > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > @@ -0,0 +1,807 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/slab.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +#include <media/v4l2-h264.h>
> > > +#include <media/videobuf2-dma-contig.h>
> > > +
> > > +#include "../vdec_drv_if.h"
> > > +#include "../mtk_vcodec_util.h"
> > > +#include "../mtk_vcodec_dec.h"
> > > +#include "../mtk_vcodec_intr.h"
> > > +#include "../vdec_vpu_if.h"
> > > +#include "../vdec_drv_base.h"
> > > +
> > > +#define NAL_NON_IDR_SLICE                      0x01
> > > +#define NAL_IDR_SLICE                          0x05
> > > +#define NAL_H264_PPS                           0x08
> >
> > Not used?
> >
> > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > +
> >
> > I believe you may not need the NAL type.
>
> True, removed this block of defines.
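
For reference, the removed NAL_TYPE() macro extracted the low five bits
of the first NAL header byte. With the stateless API the application
parses the bitstream itself, so this kind of helper would more likely
live in userspace. A small sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* nal_unit_type is the low 5 bits of the first NAL header byte. */
static unsigned int nal_unit_type(uint8_t header)
{
	return header & 0x1F;
}

/* nal_ref_idc is the 2 bits just above nal_unit_type in the same byte. */
static unsigned int nal_ref_idc(uint8_t header)
{
	return (header >> 5) & 0x3;
}
```

A header byte of 0x65 therefore decodes as type 5 (IDR slice, the value of
NAL_IDR_SLICE in the quoted defines) with nal_ref_idc 3.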
>
> >
> > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > +#define MB_UNIT_LEN                            16
> > > +
> > > +/* get used parameters for sps/pps */
> > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > +#define GET_MTK_VDEC_PARAM(param) \
> > > +       { dst_param->param = src_param->param; }
> > > +/* motion vector size (bytes) for every macro block */
> > > +#define HW_MB_STORE_SZ                         64
> > > +
> > > +#define H264_MAX_FB_NUM                                17
> > > +#define H264_MAX_MV_NUM                                32
> > > +#define HDR_PARSING_BUF_SZ                     1024
> > > +
> > > +/**
> > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > + * @y_dma_addr: Y bitstream physical address
> > > + * @c_dma_addr: CbCr bitstream physical address
> > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > + * @field: field picture flag
> > > + */
> > > +struct mtk_h264_dpb_info {
> > > +       dma_addr_t y_dma_addr;
> > > +       dma_addr_t c_dma_addr;
> > > +       int reference_flag;
> > > +       int field;
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_sps_param  - parameters for sps
> > > + */
> > > +struct mtk_h264_sps_param {
> > > +       unsigned char chroma_format_idc;
> > > +       unsigned char bit_depth_luma_minus8;
> > > +       unsigned char bit_depth_chroma_minus8;
> > > +       unsigned char log2_max_frame_num_minus4;
> > > +       unsigned char pic_order_cnt_type;
> > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > +       unsigned char max_num_ref_frames;
> > > +       unsigned char separate_colour_plane_flag;
> > > +       unsigned short pic_width_in_mbs_minus1;
> > > +       unsigned short pic_height_in_map_units_minus1;
> > > +       unsigned int max_frame_nums;
> > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > +       unsigned char delta_pic_order_always_zero_flag;
> > > +       unsigned char frame_mbs_only_flag;
> > > +       unsigned char mb_adaptive_frame_field_flag;
> > > +       unsigned char direct_8x8_inference_flag;
> > > +       unsigned char reserved[3];
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_pps_param  - parameters for pps
> > > + */
> > > +struct mtk_h264_pps_param {
> > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > +       unsigned char weighted_bipred_idc;
> > > +       char pic_init_qp_minus26;
> > > +       char chroma_qp_index_offset;
> > > +       char second_chroma_qp_index_offset;
> > > +       unsigned char entropy_coding_mode_flag;
> > > +       unsigned char pic_order_present_flag;
> > > +       unsigned char deblocking_filter_control_present_flag;
> > > +       unsigned char constrained_intra_pred_flag;
> > > +       unsigned char weighted_pred_flag;
> > > +       unsigned char redundant_pic_cnt_present_flag;
> > > +       unsigned char transform_8x8_mode_flag;
> > > +       unsigned char scaling_matrix_present_flag;
> > > +       unsigned char reserved[2];
> > > +};
> > > +
> > > +struct slice_api_h264_scaling_matrix {
> >
> > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > Well I guess you don't want to mix a hardware-specific
> > thing with the V4L2 API maybe.
>
> That's the idea. Although the layouts match and the ABI is now stable,
> I think this better communicates the fact that this is a firmware
> structure.
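
One way to make the firmware-structure intent explicit is to pin the
layout with compile-time assertions. This is offered as an assumption,
not something the patch does. The sketch below uses C11 _Static_assert in
a standalone file; in-kernel code would more likely use BUILD_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Copy of the firmware-facing layout from the quoted patch. */
struct slice_api_h264_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* 6*16 + 6*64 = 480 bytes, with the 8x8 lists starting at byte 96. */
_Static_assert(sizeof(struct slice_api_h264_scaling_matrix) == 480,
	       "firmware scaling matrix must be 480 bytes");
_Static_assert(offsetof(struct slice_api_h264_scaling_matrix,
			scaling_list_8x8) == 96,
	       "8x8 lists must directly follow the 4x4 lists");
```

If the structure ever changed size or field order, the build would fail
instead of the firmware silently reading misaligned data.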
>
> >
> > > +       unsigned char scaling_list_4x4[6][16];
> > > +       unsigned char scaling_list_8x8[6][64];
> > > +};
> > > +
> > > +struct slice_h264_dpb_entry {
> > > +       unsigned long long reference_ts;
> > > +       unsigned short frame_num;
> > > +       unsigned short pic_num;
> > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > + */
> > > +struct slice_api_h264_decode_param {
> > > +       struct slice_h264_dpb_entry dpb[16];
> >
> > V4L2_H264_NUM_DPB_ENTRIES?
>
> For the same reason as above (this being a firmware structure), I
> think it is clearer to not use the kernel definitions here.
>
> >
> > > +       unsigned short num_slices;
> > > +       unsigned short nal_ref_idc;
> > > +       unsigned char ref_pic_list_p0[32];
> > > +       unsigned char ref_pic_list_b0[32];
> > > +       unsigned char ref_pic_list_b1[32];
> >
> > V4L2_H264_REF_LIST_LEN?
>
> Ditto.
>
> >
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > + */
> > > +struct mtk_h264_dec_slice_param {
> > > +       struct mtk_h264_sps_param                       sps;
> > > +       struct mtk_h264_pps_param                       pps;
> > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > +       struct slice_api_h264_decode_param              decode_params;
> > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> >
> > V4L2_H264_NUM_DPB_ENTRIES?
>
> Ditto.
>
> >
> > > +};
> > > +
> > > +/**
> > > + * struct h264_fb - h264 decode frame buffer information
> > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > + * @poc         : picture order count of frame buffer
> > > + * @reserved    : for 8 bytes alignment
> > > + */
> > > +struct h264_fb {
> > > +       uint64_t vdec_fb_va;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       int32_t poc;
> > > +       uint32_t reserved;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_dec_info - decode information
> > > + * @dpb_sz             : decoding picture buffer size
> > > + * @resolution_changed  : resolution change happened
> > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > + * @cap_num_planes     : number planes of capture buffer
> > > + * @bs_dma             : Input bit-stream buffer dma address
> > > + * @y_fb_dma           : Y frame buffer dma address
> > > + * @c_fb_dma           : C frame buffer dma address
> > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > + */
> > > +struct vdec_h264_dec_info {
> > > +       uint32_t dpb_sz;
> > > +       uint32_t resolution_changed;
> > > +       uint32_t realloc_mv_buf;
> > > +       uint32_t cap_num_planes;
> > > +       uint64_t bs_dma;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       uint64_t vdec_fb_va;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > + *                        between VPU and Host.
> > > + *                        The memory is allocated by VPU then mapping to Host
> > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > + *                        by VPU.
> > > + *                        AP-W/R : AP is writer/reader on this item
> > > + *                        VPU-W/R: VPU is writer/reader on this item
> > > + * @pred_buf_dma : HW working prediction buffer dma address (AP-W, VPU-R)
> > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > > + * @dec          : decode information (AP-R, VPU-W)
> > > + * @pic          : picture information (AP-R, VPU-W)
> > > + * @crop         : crop information (AP-R, VPU-W)
> > > + */
> > > +struct vdec_h264_vsi {
> > > +       uint64_t pred_buf_dma;
> > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > +       struct vdec_h264_dec_info dec;
> > > +       struct vdec_pic_info pic;
> > > +       struct v4l2_rect crop;
> > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > + * @num_nalu : how many nalus be decoded
> > > + * @ctx      : point to mtk_vcodec_ctx
> > > + * @pred_buf : HW working predication buffer
> > > + * @mv_buf   : HW working motion vector buffer
> > > + * @vpu      : VPU instance
> > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > + */
> > > +struct vdec_h264_slice_inst {
> > > +       unsigned int num_nalu;
> > > +       struct mtk_vcodec_ctx *ctx;
> > > +       struct mtk_vcodec_mem pred_buf;
> > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > +       struct vdec_vpu_inst vpu;
> > > +       struct vdec_h264_vsi vsi_ctx;
> > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > +
> > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > +};
> > > +
> > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > +                                int id)
> > > +{
> > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > +
> > > +       return ctrl->p_cur.p;
> > > +}
> > > +
> > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > +                             struct mtk_h264_dec_slice_param *slice_param)
> > > +{
> > > +       struct vb2_queue *vq;
> > > +       struct vb2_buffer *vb;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > +       u64 index;
> > > +
> > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > +
> > > +       for (index = 0; index < 16; index++) {
> >
> > Ditto, some macro instead of 16.
>
> Changed this to use ARRAY_SIZE() which is appropriate here.
>
> >
> > > +               const struct slice_h264_dpb_entry *dpb;
> > > +               int vb2_index;
> > > +
> > > +               dpb = &slice_param->decode_params.dpb[index];
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > > +                       continue;
> > > +               }
> > > +
> > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > +               if (vb2_index < 0) {
> > > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > > +                               index, dpb->reference_ts);
> > > +                       continue;
> > > +               }
> > > +               /* 1 for short term reference, 2 for long term reference */
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > > +               else
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > > +
> > > +               vb = vq->bufs[vb2_index];
> > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > +
> > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > +
> > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > +}
> > > +
> > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > +
> > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > +}
> > > +
> > > +static void
> > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > > +{
> > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > +
> > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > +}
> > > +
> > > +static void get_h264_decode_parameters(
> > > +       struct slice_api_h264_decode_param *dst_params,
> > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > +
> > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > +               dst_entry->frame_num = src_entry->frame_num;
> > > +               dst_entry->pic_num = src_entry->pic_num;
> > > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > > +               dst_entry->bottom_field_order_cnt =
> > > +                       src_entry->bottom_field_order_cnt;
> > > +               dst_entry->flags = src_entry->flags;
> > > +       }
> > > +
> > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > +       // by the firmware.
> > > +       dst_params->num_slices = 0;
> > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > +       dst_params->flags = src_params->flags;
> > > +}
> > > +
> > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > +                           const struct v4l2_h264_dpb_entry *b)
> > > +{
> > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > +}
> > > +
> > > +/*
> > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > + *
> > > + * This function is an adaptation of the similarly-named function in
> > > + * hantro_h264.c.
> > > + */
> > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > +{
> > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       unsigned int i, j;
> > > +
> > > +       /* Disable all entries by default, and mark the ones in use. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > +                       set_bit(i, in_use);
> > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > +       }
> > > +
> > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +
> > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > +                       continue;
> > > +
> > > +               /*
> > > +                * To cut off some comparisons, iterate only on target DPB
> > > +                * entries were already used.
> > > +                */
> > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +                       cdpb = &dpb[j];
> > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > +                               continue;
> > > +
> > > +                       *cdpb = *ndpb;
> > > +                       set_bit(j, used);
> > > +                       /* Don't reiterate on this one. */
> > > +                       clear_bit(j, in_use);
> > > +                       break;
> > > +               }
> > > +
> > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > +                       set_bit(i, new);
> > > +       }
> > > +
> > > +       /* For entries that could not be matched, use remaining free slots. */
> > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +               /*
> > > +                * Both arrays are of the same sizes, so there is no way
> > > +                * we can end up with no space in target array, unless
> > > +                * something is buggy.
> > > +                */
> > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > +                       return;
> > > +
> > > +               cdpb = &dpb[j];
> > > +               *cdpb = *ndpb;
> > > +               set_bit(j, used);
> > > +       }
> > > +}
> > > +
> > > +/*
> > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > + */
> > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > +{
> > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > +}
> > > +
> > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > +       int i;
> > > +
> > > +       update_dpb(dec_params, inst->dpb);
> > > +
> > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > +                                  inst->dpb);
> > > +       get_h264_dpb_list(inst, slice_param);
> > > +
> > > +       /* Prepare the fields for our reference lists */
> > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > +       /* Build the reference lists */
> > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > +                                      inst->dpb);
> > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > +       /* Adapt the built lists to the firmware's expectations */
> > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > +
> > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > +}
> > > +
> > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > +{
> > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > +
> > > +       return HW_MB_STORE_SZ * unit_size;
> > > +}
> > > +
> > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int err = 0;
> > > +
> > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > +               return err;
> > > +       }
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > +       mem = &inst->pred_buf;
> > > +       if (mem->va)
> > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > +}
> > > +
> > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > +       struct vdec_pic_info *pic)
> > > +{
> > > +       int i;
> > > +       int err;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > +
> > > +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +               mem->size = buf_sz;
> > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > +               if (err) {
> > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > +                       return err;
> > > +               }
> > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int i;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +       }
> > > +}
> > > +
> > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > +                        struct vdec_pic_info *pic)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > +
> > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > +
> > > +       pic = &ctx->picinfo;
> > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > +               ctx->picinfo.fb_sz[1]);
> > > +
> > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > +                       ctx->last_decoded_picinfo.pic_w,
> > > +                       ctx->last_decoded_picinfo.pic_h,
> > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > +       }
> > > +}
> > > +
> > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > +       struct v4l2_rect *cr)
> > > +{
> > > +       cr->left = inst->vsi_ctx.crop.left;
> > > +       cr->top = inst->vsi_ctx.crop.top;
> > > +       cr->width = inst->vsi_ctx.crop.width;
> > > +       cr->height = inst->vsi_ctx.crop.height;
> > > +
> > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > +                        cr->left, cr->top, cr->width, cr->height);
> > > +}
> > > +
> > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > +       unsigned int *dpb_sz)
> > > +{
> > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > +}
> > > +
> > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > +       int err;
> > > +
> > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > +       if (!inst)
> > > +               return -ENOMEM;
> > > +
> > > +       inst->ctx = ctx;
> > > +
> > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > +       inst->vpu.ctx = ctx;
> > > +
> > > +       err = vpu_dec_init(&inst->vpu);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > +               goto error_free_inst;
> > > +       }
> > > +
> > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +       err = allocate_predication_buf(inst);
> > > +       if (err)
> > > +               goto error_deinit;
> > > +
> > > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > > +               sizeof(struct mtk_h264_sps_param),
> > > +               sizeof(struct mtk_h264_pps_param),
> > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > +               sizeof(struct mtk_h264_dpb_info));
> > > +
> > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > +
> > > +       ctx->drv_handle = inst;
> > > +       return 0;
> > > +
> > > +error_deinit:
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +
> > > +error_free_inst:
> > > +       kfree(inst);
> > > +       return err;
> > > +}
> > > +
> > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +       free_predication_buf(inst);
> > > +       free_mv_buf(inst);
> > > +
> > > +       kfree(inst);
> > > +}
> > > +
> > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > +{
> > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > +               return 3;
> > > +
> > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > +           data[3] == 1)
> > > +               return 4;
> > > +
> > > +       return -1;
> > > +}
> > > +
> > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > +       struct mtk_video_dec_buf *src_buf_info;
> > > +       int nal_start_idx = 0, err = 0;
> > > +       uint32_t nal_type, data[2];
> > > +       unsigned char *buf;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +
> > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > +
> > > +       /* bs NULL means flush decoder */
> > > +       if (bs == NULL)
> > > +               return vpu_dec_reset(vpu);
> > > +
> > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > +
> > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > +
> > > +       buf = (unsigned char *)bs->va;
> >
> > I can be completely wrong, but it would seem here
> > is where the CPU mapping is used.
>
> I think you're right. :)
>
> >
> > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > +       if (nal_start_idx < 0)
> > > +               goto err_free_fb_out;
> > > +
> > > +       data[0] = bs->size;
> > > +       data[1] = buf[nal_start_idx];
> > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> >
> > Which seems to be used to parse the NAL type. But shouldn't
> > you expect here VLC NALUs only?
> >
> > I.e. you only get IDR or non-IDR frames, marked with
> > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
>
> Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> the test using it below) and the driver is just as happy.
>
> >
> > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > +                        nal_type);
> > > +
> > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > +
> > > +       get_vdec_decode_parameters(inst);
> > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > +       if (*res_chg) {
> > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > +                       if (err)
> > > +                               goto err_free_fb_out;
> > > +               }
> > > +               *res_chg = false;
> > > +       }
> > > +
> > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > +       err = vpu_dec_start(vpu, data, 2);
> >
> > Then it seems this 2-bytes are passed to the firmware. Maybe you
> > could test if that can be derived without the CPU mapping.
> > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
>
> This one is a bit trickier. It seems the NAL type is passed as part of
> the decode request to the firmware. Which should be absolutely not
> needed since the firmware can check this from the buffer itself. Just
> for fun I have tried setting this parameter unconditionally to 0x1
> (non-IDR picture) and all I get is green frames with seemingly random
> garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> a different kind of garbage, and once every while a properly rendered
> frame (presumably when it is *really* an IDR frame).
>
> So, mmm, I'm afraid we cannot decode properly without this information
> and thus without the mapping, unless Yunfei can tell us of a way to
> achieve this. Yunfei, do you have any idea?
>

Sorry, I wasn't clear with my suggestion. I didn't mean that you should stop
passing this information to the firmware, only that you should stop deriving
it from the buffers.

Along these lines:

        data[0] = bs->size;
-       data[1] = buf[nal_start_idx];
-       nal_type = NAL_TYPE(buf[nal_start_idx]);
+       data[1] = nal_type =
+               (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) ?
+               NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

(or slice type as Nicolas was suggesting)

This should allow you to not allocate your buffers coherently,
as you are doing now (i.e. not requiring a CPU mapping).
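
To make the idea concrete outside the driver, here is a minimal standalone
sketch of deriving the value from the decode-params flags instead of from the
bitstream. The NAL_* values are the ones from the patch; the flag constant
and the helper name nal_type_from_flags() are stand-ins I made up for
illustration (the real flag is V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC from the
uAPI header), so treat this as a sketch rather than a drop-in change:

```c
#include <stdint.h>

/* NAL unit types as defined in the patch under review. */
#define NAL_NON_IDR_SLICE	0x01
#define NAL_IDR_SLICE		0x05

/*
 * Illustrative stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
 * The driver should use the real definition from the uAPI header.
 */
#define EXAMPLE_DECODE_PARAM_FLAG_IDR_PIC	0x01

/*
 * Hypothetical helper: pick the NAL type to report to the firmware from
 * the decode-params flags, without touching the bitstream buffer.
 */
static uint32_t nal_type_from_flags(uint32_t flags)
{
	return (flags & EXAMPLE_DECODE_PARAM_FLAG_IDR_PIC) ?
		NAL_IDR_SLICE : NAL_NON_IDR_SLICE;
}
```

With something like this, data[1] no longer depends on bs->va, which is the
only reason the decode path needs the CPU mapping as far as I can see.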

I don't know what the performance gains would be on your platform,
but they might be worth it (note that this depends on the platform's
support for non-coherent DMA mappings).

In any case, this is just a suggestion, mostly because a CPU mapping
means the kernel is parsing the buffers, which is a bit unexpected.
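
For reference, the parsing I am referring to boils down to roughly the
following. This is a standalone rendition of find_start_code() and NAL_TYPE()
from the patch; first_nal_type() is a hypothetical wrapper I added only to
show the two used together, it is not in the driver:

```c
#include <stddef.h>

/* Low five bits of the NAL header byte hold nal_unit_type. */
#define NAL_TYPE(value)	((value) & 0x1F)

/*
 * Return the length of the Annex B start code at the head of the buffer
 * (3 or 4 bytes), or -1 if the buffer does not begin with one. The size
 * checks also guarantee that the byte following the start code exists.
 */
static int find_start_code(const unsigned char *data, size_t data_sz)
{
	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/* Hypothetical wrapper: NAL type of the first NALU, or -1 if none found. */
static int first_nal_type(const unsigned char *data, size_t data_sz)
{
	int idx = find_start_code(data, data_sz);

	if (idx < 0)
		return -1;

	return NAL_TYPE(data[idx]);
}
```

Every byte read here goes through the kernel mapping of the OUTPUT buffer,
which is what the flag-based approach would avoid.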

Regards,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-15 22:08         ` Ezequiel Garcia
  0 siblings, 0 replies; 56+ messages in thread
From: Ezequiel Garcia @ 2021-03-15 22:08 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Yunfei Dong, Tiffany Lin, Andrew-CT Chen, Rob Herring,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

Hi Alex,

On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Ezequiel,
>
> On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> >
> >  Hi Alex,
> >
> > Thanks for the patch.
> >
> > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > >
> > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > >
> > > Add support for H.264 decoding using the stateless API, as supported by
> > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > builders.
> > >
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > [acourbot: refactor, cleanup and split]
> > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > ---
> > >  drivers/media/platform/Kconfig                |   1 +
> > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > >  5 files changed, 813 insertions(+)
> > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > >
> > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > index fd1831e97b22..c27db5643712 100644
> > > --- a/drivers/media/platform/Kconfig
> > > +++ b/drivers/media/platform/Kconfig
> > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > >         select V4L2_MEM2MEM_DEV
> > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > +       select V4L2_H264
> > >         help
> > >           Mediatek video codec driver provides HW capability to
> > >           encode and decode in a range of video formats on MT8173
> > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > >                 vdec/vdec_vp8_if.o \
> > >                 vdec/vdec_vp9_if.o \
> > > +               vdec/vdec_h264_req_if.o \
> > >                 mtk_vcodec_dec_drv.o \
> > >                 vdec_drv_if.o \
> > >                 vdec_vpu_if.o \
> > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > new file mode 100644
> > > index 000000000000..2fbbfbbcfbec
> > > --- /dev/null
> > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > @@ -0,0 +1,807 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/slab.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +#include <media/v4l2-h264.h>
> > > +#include <media/videobuf2-dma-contig.h>
> > > +
> > > +#include "../vdec_drv_if.h"
> > > +#include "../mtk_vcodec_util.h"
> > > +#include "../mtk_vcodec_dec.h"
> > > +#include "../mtk_vcodec_intr.h"
> > > +#include "../vdec_vpu_if.h"
> > > +#include "../vdec_drv_base.h"
> > > +
> > > +#define NAL_NON_IDR_SLICE                      0x01
> > > +#define NAL_IDR_SLICE                          0x05
> > > +#define NAL_H264_PPS                           0x08
> >
> > Not used?
> >
> > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > +
> >
> > I believe you may not need the NAL type.
>
> True, removed this block of defines.
>
> >
> > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > +#define MB_UNIT_LEN                            16
> > > +
> > > +/* get used parameters for sps/pps */
> > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > +#define GET_MTK_VDEC_PARAM(param) \
> > > +       { dst_param->param = src_param->param; }
> > > +/* motion vector size (bytes) for every macro block */
> > > +#define HW_MB_STORE_SZ                         64
> > > +
> > > +#define H264_MAX_FB_NUM                                17
> > > +#define H264_MAX_MV_NUM                                32
> > > +#define HDR_PARSING_BUF_SZ                     1024
> > > +
> > > +/**
> > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > + * @y_dma_addr: Y bitstream physical address
> > > + * @c_dma_addr: CbCr bitstream physical address
> > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > + * @field: field picture flag
> > > + */
> > > +struct mtk_h264_dpb_info {
> > > +       dma_addr_t y_dma_addr;
> > > +       dma_addr_t c_dma_addr;
> > > +       int reference_flag;
> > > +       int field;
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_sps_param  - parameters for sps
> > > + */
> > > +struct mtk_h264_sps_param {
> > > +       unsigned char chroma_format_idc;
> > > +       unsigned char bit_depth_luma_minus8;
> > > +       unsigned char bit_depth_chroma_minus8;
> > > +       unsigned char log2_max_frame_num_minus4;
> > > +       unsigned char pic_order_cnt_type;
> > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > +       unsigned char max_num_ref_frames;
> > > +       unsigned char separate_colour_plane_flag;
> > > +       unsigned short pic_width_in_mbs_minus1;
> > > +       unsigned short pic_height_in_map_units_minus1;
> > > +       unsigned int max_frame_nums;
> > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > +       unsigned char delta_pic_order_always_zero_flag;
> > > +       unsigned char frame_mbs_only_flag;
> > > +       unsigned char mb_adaptive_frame_field_flag;
> > > +       unsigned char direct_8x8_inference_flag;
> > > +       unsigned char reserved[3];
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_pps_param  - parameters for pps
> > > + */
> > > +struct mtk_h264_pps_param {
> > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > +       unsigned char weighted_bipred_idc;
> > > +       char pic_init_qp_minus26;
> > > +       char chroma_qp_index_offset;
> > > +       char second_chroma_qp_index_offset;
> > > +       unsigned char entropy_coding_mode_flag;
> > > +       unsigned char pic_order_present_flag;
> > > +       unsigned char deblocking_filter_control_present_flag;
> > > +       unsigned char constrained_intra_pred_flag;
> > > +       unsigned char weighted_pred_flag;
> > > +       unsigned char redundant_pic_cnt_present_flag;
> > > +       unsigned char transform_8x8_mode_flag;
> > > +       unsigned char scaling_matrix_present_flag;
> > > +       unsigned char reserved[2];
> > > +};
> > > +
> > > +struct slice_api_h264_scaling_matrix {
> >
> > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > Well I guess you don't want to mix a hardware-specific
> > thing with the V4L2 API maybe.
>
> That's the idea. Although the layouts match and the ABI is now stable,
> I think this better communicates the fact that this is a firmware
> structure.
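
If the two definitions are kept separate, a compile-time check could
document that their layouts are expected to stay identical. Below is a
minimal sketch, not taken from the patch: "struct api_scaling_matrix" is
a hypothetical stand-in for the V4L2 control structure, and in the kernel
one would compare against the real uAPI type instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the V4L2 scaling matrix control structure. */
struct api_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Firmware-side structure, with the fields declared in the patch. */
struct slice_api_h264_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Fail the build if the two layouts ever diverge. */
static_assert(sizeof(struct api_scaling_matrix) ==
	      sizeof(struct slice_api_h264_scaling_matrix),
	      "scaling matrix size mismatch");
static_assert(offsetof(struct api_scaling_matrix, scaling_list_8x8) ==
	      offsetof(struct slice_api_h264_scaling_matrix, scaling_list_8x8),
	      "scaling_list_8x8 offset mismatch");

/* Runtime mirror of the checks above, only useful for a quick test. */
static int scaling_matrix_layouts_match(void)
{
	return sizeof(struct api_scaling_matrix) ==
	       sizeof(struct slice_api_h264_scaling_matrix);
}
```

This keeps the firmware structure visually distinct while still catching
an accidental layout change on either side.
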
>
> >
> > > +       unsigned char scaling_list_4x4[6][16];
> > > +       unsigned char scaling_list_8x8[6][64];
> > > +};
> > > +
> > > +struct slice_h264_dpb_entry {
> > > +       unsigned long long reference_ts;
> > > +       unsigned short frame_num;
> > > +       unsigned short pic_num;
> > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > + */
> > > +struct slice_api_h264_decode_param {
> > > +       struct slice_h264_dpb_entry dpb[16];
> >
> > V4L2_H264_NUM_DPB_ENTRIES?
>
> For the same reason as above (this being a firmware structure), I
> think it is clearer to not use the kernel definitions here.
>
> >
> > > +       unsigned short num_slices;
> > > +       unsigned short nal_ref_idc;
> > > +       unsigned char ref_pic_list_p0[32];
> > > +       unsigned char ref_pic_list_b0[32];
> > > +       unsigned char ref_pic_list_b1[32];
> >
> > V4L2_H264_REF_LIST_LEN?
>
> Ditto.
>
> >
> > > +       int top_field_order_cnt;
> > > +       int bottom_field_order_cnt;
> > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > +};
> > > +
> > > +/**
> > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > + */
> > > +struct mtk_h264_dec_slice_param {
> > > +       struct mtk_h264_sps_param                       sps;
> > > +       struct mtk_h264_pps_param                       pps;
> > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > +       struct slice_api_h264_decode_param              decode_params;
> > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> >
> > V4L2_H264_NUM_DPB_ENTRIES?
>
> Ditto.
>
> >
> > > +};
> > > +
> > > +/**
> > > + * struct h264_fb - h264 decode frame buffer information
> > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > + * @poc         : picture order count of frame buffer
> > > + * @reserved    : for 8 bytes alignment
> > > + */
> > > +struct h264_fb {
> > > +       uint64_t vdec_fb_va;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       int32_t poc;
> > > +       uint32_t reserved;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_dec_info - decode information
> > > + * @dpb_sz             : decoding picture buffer size
> > > + * @resolution_changed  : resolution change happened
> > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > + * @cap_num_planes     : number planes of capture buffer
> > > + * @bs_dma             : Input bit-stream buffer dma address
> > > + * @y_fb_dma           : Y frame buffer dma address
> > > + * @c_fb_dma           : C frame buffer dma address
> > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > + */
> > > +struct vdec_h264_dec_info {
> > > +       uint32_t dpb_sz;
> > > +       uint32_t resolution_changed;
> > > +       uint32_t realloc_mv_buf;
> > > +       uint32_t cap_num_planes;
> > > +       uint64_t bs_dma;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +       uint64_t vdec_fb_va;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > + *                        between VPU and Host.
> > > + *                        The memory is allocated by VPU then mapping to Host
> > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > + *                        by VPU.
> > > + *                        AP-W/R : AP is writer/reader on this item
> > > + *                        VPU-W/R: VPU is writer/reader on this item
> > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > > + * @dec          : decode information (AP-R, VPU-W)
> > > + * @pic          : picture information (AP-R, VPU-W)
> > > + * @crop         : crop information (AP-R, VPU-W)
> > > + */
> > > +struct vdec_h264_vsi {
> > > +       uint64_t pred_buf_dma;
> > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > +       struct vdec_h264_dec_info dec;
> > > +       struct vdec_pic_info pic;
> > > +       struct v4l2_rect crop;
> > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > +};
> > > +
> > > +/**
> > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > + * @num_nalu : number of NALUs decoded
> > > + * @ctx      : point to mtk_vcodec_ctx
> > > + * @pred_buf : HW working predication buffer
> > > + * @mv_buf   : HW working motion vector buffer
> > > + * @vpu      : VPU instance
> > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > + */
> > > +struct vdec_h264_slice_inst {
> > > +       unsigned int num_nalu;
> > > +       struct mtk_vcodec_ctx *ctx;
> > > +       struct mtk_vcodec_mem pred_buf;
> > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > +       struct vdec_vpu_inst vpu;
> > > +       struct vdec_h264_vsi vsi_ctx;
> > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > +
> > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > +};
> > > +
> > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > +                                int id)
> > > +{
> > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > +
> > > +       return ctrl->p_cur.p;
> > > +}
> > > +
> > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > +                             struct mtk_h264_dec_slice_param *slice_param)
> > > +{
> > > +       struct vb2_queue *vq;
> > > +       struct vb2_buffer *vb;
> > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > +       u64 index;
> > > +
> > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > +
> > > +       for (index = 0; index < 16; index++) {
> >
> > Ditto, some macro instead of 16.
>
> Changed this to use ARRAY_SIZE() which is appropriate here.
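
For reference, the ARRAY_SIZE() pattern mentioned here could look like
the sketch below. The structure definitions are trimmed-down,
hypothetical copies for illustration, and the macro is redefined locally
only so that the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the kernel helper, for building outside the kernel. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Trimmed-down, hypothetical versions of the structures in the patch. */
struct dpb_info {
	int reference_flag;
};

struct slice_param {
	struct dpb_info h264_dpb_info[16];
};

/* Walk the array using its declared size instead of a literal 16. */
static size_t clear_reference_flags(struct slice_param *p)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < ARRAY_SIZE(p->h264_dpb_info); i++)
		p->h264_dpb_info[i].reference_flag = 0;

	/* Number of entries visited. */
	return i;
}
```

The loop bound then follows the array declaration automatically if the
firmware structure is ever resized.
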
>
> >
> > > +               const struct slice_h264_dpb_entry *dpb;
> > > +               int vb2_index;
> > > +
> > > +               dpb = &slice_param->decode_params.dpb[index];
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > > +                       continue;
> > > +               }
> > > +
> > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > +               if (vb2_index < 0) {
> > > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > > +                               index, dpb->reference_ts);
> > > +                       continue;
> > > +               }
> > > +               /* 1 for short term reference, 2 for long term reference */
> > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > > +               else
> > > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > > +
> > > +               vb = vq->bufs[vb2_index];
> > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > +
> > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > +
> > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > +}
> > > +
> > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > +{
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > +
> > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > +}
> > > +
> > > +static void
> > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > > +{
> > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > +
> > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > +}
> > > +
> > > +static void get_h264_decode_parameters(
> > > +       struct slice_api_h264_decode_param *dst_params,
> > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > +
> > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > +               dst_entry->frame_num = src_entry->frame_num;
> > > +               dst_entry->pic_num = src_entry->pic_num;
> > > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > > +               dst_entry->bottom_field_order_cnt =
> > > +                       src_entry->bottom_field_order_cnt;
> > > +               dst_entry->flags = src_entry->flags;
> > > +       }
> > > +
> > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > +       // by the firmware.
> > > +       dst_params->num_slices = 0;
> > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > +       dst_params->flags = src_params->flags;
> > > +}
> > > +
> > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > +                           const struct v4l2_h264_dpb_entry *b)
> > > +{
> > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > +}
> > > +
> > > +/*
> > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > + *
> > > + * This function is an adaptation of the similarly-named function in
> > > + * hantro_h264.c.
> > > + */
> > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > +{
> > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > +       unsigned int i, j;
> > > +
> > > +       /* Disable all entries by default, and mark the ones in use. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > +                       set_bit(i, in_use);
> > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > +       }
> > > +
> > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +
> > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > +                       continue;
> > > +
> > > +               /*
> > > +                * To cut off some comparisons, iterate only on target DPB
> > > +                * entries that were already used.
> > > +                */
> > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +                       cdpb = &dpb[j];
> > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > +                               continue;
> > > +
> > > +                       *cdpb = *ndpb;
> > > +                       set_bit(j, used);
> > > +                       /* Don't reiterate on this one. */
> > > +                       clear_bit(j, in_use);
> > > +                       break;
> > > +               }
> > > +
> > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > +                       set_bit(i, new);
> > > +       }
> > > +
> > > +       /* For entries that could not be matched, use remaining free slots. */
> > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > +
> > > +               /*
> > > +                * Both arrays are of the same sizes, so there is no way
> > > +                * we can end up with no space in target array, unless
> > > +                * something is buggy.
> > > +                */
> > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > +                       return;
> > > +
> > > +               cdpb = &dpb[j];
> > > +               *cdpb = *ndpb;
> > > +               set_bit(j, used);
> > > +       }
> > > +}
> > > +
> > > +/*
> > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > + */
> > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > +{
> > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > +}
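
To make the padding behaviour concrete, here is a standalone sketch of
the same helper. REF_LIST_LEN and REF_UNUSED are names I made up for the
literals 32 and 0x20 used in the patch, and the bounds assert is an
addition of mine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REF_LIST_LEN	32	/* hypothetical name: size of ref_pic_list_* */
#define REF_UNUSED	0x20	/* hypothetical name: filler the firmware expects */

/* Mark every slot after the first num_valid entries as unused. */
static void fixup_ref_list(uint8_t *ref_list, size_t num_valid)
{
	/* Guard against an underflow of the length computed below. */
	assert(num_valid <= REF_LIST_LEN);
	memset(&ref_list[num_valid], REF_UNUSED, REF_LIST_LEN - num_valid);
}
```

With a list holding two valid entries, slots 0 and 1 are left untouched
and slots 2 to 31 are set to 0x20.
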
> > > +
> > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > +       int i;
> > > +
> > > +       update_dpb(dec_params, inst->dpb);
> > > +
> > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > +                                  inst->dpb);
> > > +       get_h264_dpb_list(inst, slice_param);
> > > +
> > > +       /* Prepare the fields for our reference lists */
> > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > +       /* Build the reference lists */
> > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > +                                      inst->dpb);
> > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > +       /* Adapt the built lists to the firmware's expectations */
> > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > +
> > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > +}
> > > +
> > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > +{
> > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > +
> > > +       return HW_MB_STORE_SZ * unit_size;
> > > +}
> > > +
> > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int err = 0;
> > > +
> > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > +               return err;
> > > +       }
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > +       mem = &inst->pred_buf;
> > > +       if (mem->va)
> > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > +}
> > > +
> > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > +       struct vdec_pic_info *pic)
> > > +{
> > > +       int i;
> > > +       int err;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > +
> > > +       mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +               mem->size = buf_sz;
> > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > +               if (err) {
> > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > +                       return err;
> > > +               }
> > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > +{
> > > +       int i;
> > > +       struct mtk_vcodec_mem *mem = NULL;
> > > +
> > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > +               mem = &inst->mv_buf[i];
> > > +               if (mem->va)
> > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > +       }
> > > +}
> > > +
> > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > +                        struct vdec_pic_info *pic)
> > > +{
> > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > +
> > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > +
> > > +       pic = &ctx->picinfo;
> > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > +               ctx->picinfo.fb_sz[1]);
> > > +
> > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > +                       ctx->last_decoded_picinfo.pic_w,
> > > +                       ctx->last_decoded_picinfo.pic_h,
> > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > +       }
> > > +}
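
As a worked example of the size arithmetic in get_pic_info() and
get_mv_buf_size(), here is a standalone sketch. "struct pic_sizes" and
compute_pic_sizes() are hypothetical helpers that only gather the
formulas from the patch in one place; they are not part of the driver.

```c
#include <assert.h>

#define MB_UNIT_LEN	16	/* macroblock edge, in pixels */
#define HW_MB_STORE_SZ	64	/* motion vector bytes per macroblock */

/* Hypothetical container for the derived sizes. */
struct pic_sizes {
	unsigned int buf_w;	/* width aligned to 16 */
	unsigned int buf_h;	/* height aligned to 32 */
	unsigned int y_sz;	/* luma plane size, in bytes */
	unsigned int c_sz;	/* chroma plane size, in bytes */
	unsigned int mv_sz;	/* size of one motion vector buffer */
};

static struct pic_sizes compute_pic_sizes(unsigned int pic_w,
					  unsigned int pic_h)
{
	struct pic_sizes s;

	assert(pic_w > 0 && pic_h > 0);

	/* Same alignment as the 0xFFFFFFF0 / 0xFFFFFFE0 masks. */
	s.buf_w = (pic_w + 15) & ~15u;
	s.buf_h = (pic_h + 31) & ~31u;
	s.y_sz = s.buf_w * s.buf_h;
	s.c_sz = s.y_sz >> 1;
	/* One 64-byte record per macroblock, plus 8 spare records. */
	s.mv_sz = HW_MB_STORE_SZ *
		  ((s.buf_w / MB_UNIT_LEN) * (s.buf_h / MB_UNIT_LEN) + 8);

	return s;
}
```

For a 1920x1080 stream this gives a 1920x1088 buffer, a luma plane of
2088960 bytes, a chroma plane of 1044480 bytes, and 522752 bytes for
each of the H264_MAX_MV_NUM motion vector buffers.
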
> > > +
> > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > +       struct v4l2_rect *cr)
> > > +{
> > > +       cr->left = inst->vsi_ctx.crop.left;
> > > +       cr->top = inst->vsi_ctx.crop.top;
> > > +       cr->width = inst->vsi_ctx.crop.width;
> > > +       cr->height = inst->vsi_ctx.crop.height;
> > > +
> > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > +                        cr->left, cr->top, cr->width, cr->height);
> > > +}
> > > +
> > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > +       unsigned int *dpb_sz)
> > > +{
> > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > +}
> > > +
> > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > +       int err;
> > > +
> > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > +       if (!inst)
> > > +               return -ENOMEM;
> > > +
> > > +       inst->ctx = ctx;
> > > +
> > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > +       inst->vpu.ctx = ctx;
> > > +
> > > +       err = vpu_dec_init(&inst->vpu);
> > > +       if (err) {
> > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > +               goto error_free_inst;
> > > +       }
> > > +
> > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > +
> > > +       err = allocate_predication_buf(inst);
> > > +       if (err)
> > > +               goto error_deinit;
> > > +
> > > +       mtk_vcodec_debug(inst, "struct size = %zu,%zu,%zu,%zu\n",
> > > +               sizeof(struct mtk_h264_sps_param),
> > > +               sizeof(struct mtk_h264_pps_param),
> > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > +               sizeof(struct mtk_h264_dpb_info));
> > > +
> > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > +
> > > +       ctx->drv_handle = inst;
> > > +       return 0;
> > > +
> > > +error_deinit:
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +
> > > +error_free_inst:
> > > +       kfree(inst);
> > > +       return err;
> > > +}
> > > +
> > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +
> > > +       mtk_vcodec_debug_enter(inst);
> > > +
> > > +       vpu_dec_deinit(&inst->vpu);
> > > +       free_predication_buf(inst);
> > > +       free_mv_buf(inst);
> > > +
> > > +       kfree(inst);
> > > +}
> > > +
> > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > +{
> > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > +               return 3;
> > > +
> > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > +           data[3] == 1)
> > > +               return 4;
> > > +
> > > +       return -1;
> > > +}
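
For readers following along, this is how the helper behaves on typical
Annex B input, together with the split of the NAL header byte it points
at. A standalone sketch: find_start_code() repeats the logic from the
patch (with a const pointer added), while nal_ref_idc() and
nal_unit_type() are hypothetical helpers named after the H.264 syntax
elements.

```c
#include <assert.h>

/* Return the index of the NAL header byte, or -1 without a start code. */
static int find_start_code(const unsigned char *data, unsigned int data_sz)
{
	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/* Bits 6..5 of the NAL header byte. */
static unsigned int nal_ref_idc(unsigned char hdr)
{
	/* forbidden_zero_bit (bit 7) must be 0 in a valid stream. */
	assert((hdr & 0x80) == 0);
	return (hdr >> 5) & 0x3;
}

/* Bits 4..0 of the NAL header byte, same mask as NAL_TYPE(). */
static unsigned int nal_unit_type(unsigned char hdr)
{
	assert((hdr & 0x80) == 0);
	return hdr & 0x1F;
}
```

A buffer starting with 00 00 01 65 yields index 3, and the byte 0x65
splits into nal_ref_idc 3 and nal_unit_type 5 (IDR slice). A buffer
starting with 00 00 00 01 41 yields index 4, and 0x41 splits into
nal_ref_idc 2 and nal_unit_type 1 (non-IDR slice).
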
> > > +
> > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > +{
> > > +       struct vdec_h264_slice_inst *inst =
> > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > +       struct mtk_video_dec_buf *src_buf_info;
> > > +       int nal_start_idx = 0, err = 0;
> > > +       uint32_t nal_type, data[2];
> > > +       unsigned char *buf;
> > > +       uint64_t y_fb_dma;
> > > +       uint64_t c_fb_dma;
> > > +
> > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > +
> > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > +
> > > +       /* bs NULL means flush decoder */
> > > +       if (bs == NULL)
> > > +               return vpu_dec_reset(vpu);
> > > +
> > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > +
> > > +       buf = (unsigned char *)bs->va;
> >
> > I can be completely wrong, but it would seem here
> > is where the CPU mapping is used.
>
> I think you're right. :)
>
> >
> > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > +       if (nal_start_idx < 0)
> > > +               goto err_free_fb_out;
> > > +
> > > +       data[0] = bs->size;
> > > +       data[1] = buf[nal_start_idx];
> > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> >
> > Which seems to be used to parse the NAL type. But shouldn't
> > you expect only VCL NALUs here?
> >
> > I.e. you only get IDR or non-IDR frames, marked with
> > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
>
> Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> the test using it below) and the driver is just as happy.
>
> >
> > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > +                        nal_type);
> > > +
> > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > +
> > > +       get_vdec_decode_parameters(inst);
> > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > +       if (*res_chg) {
> > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > +                       if (err)
> > > +                               goto err_free_fb_out;
> > > +               }
> > > +               *res_chg = false;
> > > +       }
> > > +
> > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > +       err = vpu_dec_start(vpu, data, 2);
> >
> > Then it seems these two values are passed to the firmware. Maybe you
> > could test if that can be derived without the CPU mapping.
> > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
>
> This one is a bit trickier. It seems the NAL type is passed as part of
> the decode request to the firmware, which should not be needed at all
> since the firmware can check this from the buffer itself. Just
> for fun I have tried setting this parameter unconditionally to 0x1
> (non-IDR picture) and all I get is green frames with seemingly random
> garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> a different kind of garbage, and once in a while a properly rendered
> frame (presumably when it is *really* an IDR frame).
>
> So, mmm, I'm afraid we cannot decode properly without this information
> and thus without the mapping, unless Yunfei can tell us of a way to
> achieve this. Yunfei, do you have any idea?
>

Sorry, I wasn't clear with my suggestion. I didn't mean that you should stop
passing the information to the firmware, only that you stop deriving it from
the buffers.

Along these lines:

        data[0] = bs->size;
-       data[1] = buf[nal_start_idx];
-       nal_type = NAL_TYPE(buf[nal_start_idx]);
+       data[1] = nal_type = (dec_param->flags &
+                             V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) ?
+               NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

(or slice type as Nicolas was suggesting)

This should allow you to not allocate your buffers coherently,
as you are doing now (i.e. not requiring a CPU mapping).
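
To make the idea concrete outside the driver, here is a minimal userspace
sketch of deriving the NAL type from the decode parameters instead of from
the bitstream. Everything prefixed with demo_ or DEMO_ is hypothetical and
only stands in for the driver's and the uAPI's definitions; the numeric
values (flag bit 0x01, NAL types 1 and 5) are assumptions chosen to match
the 0x1/0x5 values mentioned earlier in this thread.

```c
#include <stdint.h>

/* Assumed stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC. */
#define DEMO_DECODE_PARAM_FLAG_IDR_PIC	0x01u
/* Assumed stand-ins for the driver's NAL_NON_IDR_SLICE / NAL_IDR_SLICE. */
#define DEMO_NAL_NON_IDR_SLICE		1u
#define DEMO_NAL_IDR_SLICE		5u

/* Hypothetical, reduced version of the decode parameters control. */
struct demo_decode_params {
	uint32_t flags;
};

/*
 * Pick the NAL type from the control flags only. The bitstream buffer is
 * never dereferenced, so no CPU mapping of it is needed for this step.
 */
uint32_t demo_nal_type_from_params(const struct demo_decode_params *p)
{
	return (p->flags & DEMO_DECODE_PARAM_FLAG_IDR_PIC) ?
		DEMO_NAL_IDR_SLICE : DEMO_NAL_NON_IDR_SLICE;
}
```

Whether the firmware also needs the remaining bits of the NAL header byte
is something this sketch does not answer.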

I don't know what the performance gains would be on your platform,
but they might be worth it (note that this depends on the platform's
support for non-coherent DMA mappings).

In any case, this is just a suggestion, mostly because a CPU mapping
means the kernel is parsing the buffers, which is a bit unexpected.
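
For reference, the parsing in question amounts to roughly the following.
This is a userspace sketch, not the driver code: demo_find_start_code() is
modeled on the tail of find_start_code() quoted above and assumes its first
branch handles the 3-byte start code, and DEMO_NAL_TYPE() assumes the usual
5-bit nal_unit_type mask.

```c
#include <stddef.h>

/* Assumed equivalent of the driver's NAL_TYPE(): low 5 bits of the header. */
#define DEMO_NAL_TYPE(byte)	((byte) & 0x1f)

/*
 * Return the offset of the first byte following an Annex B start code
 * located at the very beginning of buf, or -1 if there is none.
 */
int demo_find_start_code(const unsigned char *buf, size_t size)
{
	if (size > 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
		return 3;

	if (size > 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 &&
	    buf[3] == 1)
		return 4;

	return -1;
}

/* Read the NAL type by looking into the buffer: this needs a CPU mapping. */
int demo_nal_type(const unsigned char *buf, size_t size)
{
	int idx = demo_find_start_code(buf, size);

	if (idx < 0)
		return -1;

	return DEMO_NAL_TYPE(buf[idx]);
}
```

Every access to buf[] here is a CPU read of the bitstream, which is the
only reason the mapping is required on this path.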

Regards,
Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-03-15 21:45         ` Ezequiel Garcia
@ 2021-03-17  3:13           ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-17  3:13 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 6:45 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Alexandre,
>
> On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > Hi Ezequiel, thanks for the feedback!
> >
> > On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > > Hello Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Support the stateless codec API that will be used by MT8183.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > >
> > > [..]
> > >
> > > > +
> > > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > >
> > > This "needed_in_request" is not really required, as controls
> > > are not volatile, and their value is stored per-context (per-fd).
> > >
> > > It's perfectly valid for an application to pass the SPS control
> > > at the beginning of the sequence, and then omit it
> > > in further requests.
> >
> > If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> > requests, this boolean only checks that the control has been provided
> > at least once, and not that it is provided with every request. Without
> > it we could send a frame to the firmware without e.g. setting an SPS,
> > which would be a problem.
> >
>
> As Nicolas points out, in V4L2 controls have an initial value,
> so no control can be unset.

I see. So I guess the expectation is that failure will occur later as
the firmware reports it cannot decode properly (or returns a corrupted
frame). Thanks for the clarification.
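
To check my understanding, here is a toy model of that behavior. It is not
the V4L2 control framework, only a simplified sketch with hypothetical
demo_ names: a control always holds a value (its default at first), and a
request replaces that value only if it actually carries the control.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-context control state. */
struct demo_ctrl {
	int32_t def;	/* initial value the control starts with */
	int32_t cur;	/* value currently stored in the context */
};

/* Hypothetical request that may or may not carry the control. */
struct demo_request {
	bool has_value;
	int32_t value;
};

void demo_ctrl_init(struct demo_ctrl *c, int32_t def)
{
	c->def = def;
	c->cur = def;
}

/*
 * Apply a request and return the value the decoder would see. A request
 * without the control leaves the stored value untouched, so the control
 * is never "unset".
 */
int32_t demo_ctrl_apply_request(struct demo_ctrl *c,
				const struct demo_request *r)
{
	if (r->has_value)
		c->cur = r->value;

	return c->cur;
}
```

In this model an application can send the SPS once and omit it afterwards;
a decode issued before any SPS was sent simply runs with the default.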

>
> > >
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > > +                       .menu_skip_mask =
> > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +       },
> > > > +};
> > >
> > > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > > the driver supports. From a next patch, this case seems to be
> > > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> >
> > Indeed - I've added the control, thanks for catching this!
> >
> > >
> > > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > > +
> > > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .type = MTK_FMT_DEC,
> > > > +               .num_planes = 1,
> > > > +       },
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > > +               .type = MTK_FMT_FRAME,
> > > > +               .num_planes = 2,
> > > > +       },
> > > > +};
> > > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > > +#define DEFAULT_OUT_FMT_IDX    0
> > > > +#define DEFAULT_CAP_FMT_IDX    1
> > > > +
> > > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .stepwise = {
> > > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > > +               },
> > > > +       },
> > > > +};
> > > > +
> > > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > > +
> > > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > > +                                              struct vdec_fb *fb)
> > > > +{
> > > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > +
> > > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +               unsigned int cap_c_size =
> > > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > +
> > > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > > +       }
> > > > +}
> > > > +
> > > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > > +{
> > > > +       struct mtk_video_dec_buf *framebuf =
> > > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > > +
> > > > +       pfb = &framebuf->frame_buffer;
> > > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > >
> > > Are you sure you need a CPU mapping? It seems strange.
> > > I'll comment some more on the next patch(es).
> >
> > I'll answer on the next patch since this is where that mapping is being used.
> >
> > >
> > > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > +
> > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > > +               pfb->base_c.dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > +       }
> > > > +       mtk_v4l2_debug(1,
> > > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > > +               dst_buf->index, pfb,
> > > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > > +               ctx->decoded_frame_cnt);
> > > > +
> > > > +       return pfb;
> > > > +}
> > > > +
> > > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +
> > > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > > +}
> > > > +
> > > > +static int fops_media_request_validate(struct media_request *mreq)
> > > > +{
> > > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > > +       struct media_request_object *req_obj;
> > > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > > +       struct v4l2_ctrl *ctrl;
> > > > +       unsigned int i;
> > > > +
> > > > +       switch (buffer_cnt) {
> > > > +       case 1:
> > > > +               /* We expect exactly one buffer with the request */
> > > > +               break;
> > > > +       case 0:
> > > > +               mtk_v4l2_err("No buffer provided with the request");
> > > > +               return -ENOENT;
> > > > +       default:
> > > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > > +                            buffer_cnt);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > > +               struct vb2_buffer *vb;
> > > > +
> > > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +                       break;
> > > > +               }
> > > > +       }
> > > > +
> > > > +       if (!ctx) {
> > > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > > +               return -ENOENT;
> > > > +       }
> > > > +
> > > > +       parent_hdl = &ctx->ctrl_hdl;
> > > > +
> > > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > > +       if (!hdl) {
> > > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > > +               return -ENOENT;
> > > > +       }
> > > > +
> > > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > > +                       continue;
> > > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > > +                       continue;
> > > > +
> > > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > > +                                         mtk_stateless_controls[i].cfg.id);
> > > > +               if (!ctrl) {
> > > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > > +                       return -ENOENT;
> > > > +               }
> > > > +       }
> > > > +
> > > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > > +
> > > > +       return vb2_request_validate(mreq);
> > > > +}
> > > > +
> > > > +static void mtk_vdec_worker(struct work_struct *work)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx =
> > > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > > +       struct vb2_buffer *vb2_src;
> > > > +       struct mtk_vcodec_mem *bs_src;
> > > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > > +       struct media_request *src_buf_req;
> > > > +       struct vdec_fb *dst_buf;
> > > > +       bool res_chg = false;
> > > > +       int ret;
> > > > +
> > > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > > +       if (vb2_v4l2_src == NULL) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > > +       if (vb2_v4l2_dst == NULL) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > > +                                  m2m_buf.vb);
> > > > +       bs_src = &dec_buf_src->bs_buffer;
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > > +                       ctx->id, src_buf->vb2_queue->type,
> > > > +                       src_buf->index, src_buf, src_buf_info);
> > > > +
> > > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > > +       if (!bs_src->va) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > > +                            vb2_src->index);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > > +       /* Apply request controls. */
> > > > +       src_buf_req = vb2_src->req_obj.req;
> > > > +       if (src_buf_req)
> > > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > > +       else
> > > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > > +
> > > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > > +       if (ret) {
> > > > +               mtk_v4l2_err(
> > > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > > +                       vb2_src->timestamp, ret, res_chg);
> > > > +               if (ret == -EIO) {
> > > > +                       mutex_lock(&ctx->lock);
> > > > +                       dec_buf_src->error = true;
> > > > +                       mutex_unlock(&ctx->lock);
> > > > +               }
> > > > +       }
> > > > +
> > > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > > +
> > > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > > +
> > > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > > +}
> > > > +
> > > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > > +                       ctx->id, vb->vb2_queue->type,
> > > > +                       vb->index, vb);
> > > > +
> > > > +       mutex_lock(&ctx->lock);
> > > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > > +       mutex_unlock(&ctx->lock);
> > > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > > +               return;
> > > > +
> > > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > > +
> > > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > > +       if (ctx->state == MTK_STATE_INIT) {
> > > > +               ctx->state = MTK_STATE_HEADER;
> > > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > >
> > > This state thing seems just something to make the rest
> > > of the stateful-based driver happy, right?
> >
> > Correct - if anything we should either use more of the state here
> > (i.e. set the error state when relevant) or move the state entirely in
> > the stateful part of the driver.
> >
> > >
> > > Makes me wonder a bit if just splitting the stateless part to its
> > > own driver, wouldn't make your maintenance easier.
> > >
> > > What's the motivation for sharing the driver?
> >
> > Technically you could do it both ways. Separating the driver would
> > result in some boilerplate code and buffer-management structs
> > duplication (unless we keep the shared part under another module - but
> > in this case we are basically in the same situation as now). Also
> > despite using different userspace-facing ABIs, MT8173 and MT8183
> > follow a similar architecture and a similar firmware interface.
> > Considering these similarities it seems simpler from an architectural
> > point of view to have all the Mediatek codec support under the same
> > driver. It also probably results in less code.
> >
> > That being said, the split can probably be improved as you pointed out
> > with this state variable. But the current split is not too bad IMHO,
> > at least not worse than how the code was originally.
> >
> > >
> > > > +       } else {
> > > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > > +                               ctx->id, ctx->state);
> > > > +       }
> > > > +}
> > > > +
> > > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       bool res_chg;
> > > > +
> > > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > > +}
> > > > +
> > > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > > +};
> > > > +
> > > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl;
> > > > +       unsigned int i;
> > > > +
> > > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > > +       if (ctx->ctrl_hdl.error) {
> > > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > > +               return ctx->ctrl_hdl.error;
> > > > +       }
> > > > +
> > > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > > +                               0, 32, 1, 1);
> > > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > >
> > > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > > to return the DPB size. However, isn't this something userspace already knows?
> >
> > True, but that's also a control the driver is supposed to provide per
> > the spec IIUC.
> >
>
> I don't see the specification requiring this control. TBH, I'd just drop it
> and if needed fix the application to support this as an optional
> control.
>
> In any case, stateless devices should just need 1 output and 1 capture buffer.

Mmm, you're correct indeed: I checked our user-space and it does not
rely on this control for stateless codecs. I'll move this control to
the stateful part of the driver.


>
> You might dislike this redundancy, but note that you can also get the minimum
> required buffers through VIDIOC_REQBUFS, where the
> v4l2_requestbuffers.count field is returned to userspace with the
> number of allocated buffers.
>
> If you request just 1 buffer, and your driver needed 3, you should
> get a 3 there (vb2_ops.queue_setup takes care of that).
>
> Thanks,
> Ezequiel
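
The REQBUFS behavior described above can be modeled as follows. This is a
simplified userspace sketch with hypothetical demo_ names, assuming the
only adjustment is raising the count to the driver's minimum; in the kernel
the adjustment happens in videobuf2 together with the driver's queue_setup
callback.

```c
/* Hypothetical queue that only knows its minimum buffer count. */
struct demo_queue {
	unsigned int min_buffers;
};

/*
 * Return the buffer count reported back to userspace for a request of
 * `requested` buffers. A count of 0 means "free all buffers" and is
 * passed through unchanged.
 */
unsigned int demo_reqbufs(const struct demo_queue *q, unsigned int requested)
{
	if (requested == 0)
		return 0;

	return requested < q->min_buffers ? q->min_buffers : requested;
}
```

An application that asks for 1 buffer on a queue needing 3 would read back
3 here, which gives it the minimum without a dedicated control.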

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
@ 2021-03-17  3:13           ` Alexandre Courbot
  0 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-17  3:13 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 6:45 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Alexandre,
>
> On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > Hi Ezequiel, thanks for the feedback!
> >
> > On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > > Hello Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Support the stateless codec API that will be used by MT8183.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > >
> > > [..]
> > >
> > > > +
> > > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > >
> > > This "needed_in_request" is not really required, as controls
> > > are not volatile, and their value is stored per-context (per-fd).
> > >
> > > It's perfectly valid for an application to pass the SPS control
> > > at the beginning of the sequence, and then omit it
> > > in further requests.
> >
> > If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> > requests, this boolean only checks that the control has been provided
> > at least once, and not that it is provided with every request. Without
> > it we could send a frame to the firmware without e.g. setting an SPS,
> > which would be a problem.
> >
>
> As Nicolas points out, in V4L2 controls have an initial value,
> so no control can be unset.

I see. So I guess the expectation is that failure will occur later as
the firmware reports it cannot decode properly (or returns a corrupted
frame). Thanks for the clarification.

>
> > >
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .needed_in_request = true,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > > +                       .menu_skip_mask =
> > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +       },
> > > > +       {
> > > > +               .cfg = {
> > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > +               },
> > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > +       },
> > > > +};
> > >
> > > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > > the driver supports. From a next patch, this case seems to be
> > > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> >
> > Indeed - I've added the control, thanks for catching this!
> >
> > >
> > > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > > +
> > > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .type = MTK_FMT_DEC,
> > > > +               .num_planes = 1,
> > > > +       },
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > > +               .type = MTK_FMT_FRAME,
> > > > +               .num_planes = 2,
> > > > +       },
> > > > +};
> > > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > > +#define DEFAULT_OUT_FMT_IDX    0
> > > > +#define DEFAULT_CAP_FMT_IDX    1
> > > > +
> > > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > > +       {
> > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > +               .stepwise = {
> > > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > > +               },
> > > > +       },
> > > > +};
> > > > +
> > > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > > +
> > > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > > +                                              struct vdec_fb *fb)
> > > > +{
> > > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > +
> > > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +               unsigned int cap_c_size =
> > > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > +
> > > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > > +       }
> > > > +}
> > > > +
> > > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > > +{
> > > > +       struct mtk_video_dec_buf *framebuf =
> > > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > > +
> > > > +       pfb = &framebuf->frame_buffer;
> > > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > >
> > > Are you sure you need a CPU mapping? It seems strange.
> > > I'll comment some more on the next patch(es).
> >
> > I'll answer on the next patch since this is where that mapping is being used.
> >
> > >
> > > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > +
> > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > > +               pfb->base_c.dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > +       }
> > > > +       mtk_v4l2_debug(1,
> > > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > > +               dst_buf->index, pfb,
> > > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > > +               ctx->decoded_frame_cnt);
> > > > +
> > > > +       return pfb;
> > > > +}
> > > > +
> > > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +
> > > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > > +}
> > > > +
> > > > +static int fops_media_request_validate(struct media_request *mreq)
> > > > +{
> > > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > > +       struct media_request_object *req_obj;
> > > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > > +       struct v4l2_ctrl *ctrl;
> > > > +       unsigned int i;
> > > > +
> > > > +       switch (buffer_cnt) {
> > > > +       case 1:
> > > > +               /* We expect exactly one buffer with the request */
> > > > +               break;
> > > > +       case 0:
> > > > +               mtk_v4l2_err("No buffer provided with the request");
> > > > +               return -ENOENT;
> > > > +       default:
> > > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > > +                            buffer_cnt);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +
> > > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > > +               struct vb2_buffer *vb;
> > > > +
> > > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +                       break;
> > > > +               }
> > > > +       }
> > > > +
> > > > +       if (!ctx) {
> > > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > > +               return -ENOENT;
> > > > +       }
> > > > +
> > > > +       parent_hdl = &ctx->ctrl_hdl;
> > > > +
> > > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > > +       if (!hdl) {
> > > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > > +               return -ENOENT;
> > > > +       }
> > > > +
> > > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > > +                       continue;
> > > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > > +                       continue;
> > > > +
> > > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > > +                                         mtk_stateless_controls[i].cfg.id);
> > > > +               if (!ctrl) {
> > > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > > +                       v4l2_ctrl_request_hdl_put(hdl);
> > > > +                       return -ENOENT;
> > > > +               }
> > > > +       }
> > > > +
> > > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > > +
> > > > +       return vb2_request_validate(mreq);
> > > > +}
> > > > +
> > > > +static void mtk_vdec_worker(struct work_struct *work)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx =
> > > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > > +       struct vb2_buffer *vb2_src;
> > > > +       struct mtk_vcodec_mem *bs_src;
> > > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > > +       struct media_request *src_buf_req;
> > > > +       struct vdec_fb *dst_buf;
> > > > +       bool res_chg = false;
> > > > +       int ret;
> > > > +
> > > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > > +       if (vb2_v4l2_src == NULL) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > > +       if (vb2_v4l2_dst == NULL) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > > +                                  m2m_buf.vb);
> > > > +       bs_src = &dec_buf_src->bs_buffer;
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > > +                       ctx->id, vb2_src->vb2_queue->type,
> > > > +                       vb2_src->index, vb2_src, dec_buf_src);
> > > > +
> > > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > > +       if (!bs_src->va) {
> > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > > +                            vb2_src->index);
> > > > +               return;
> > > > +       }
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > > +                       ctx->id, bs_src->va, &bs_src->dma_addr,
> > > > +                       bs_src->size, vb2_src);
> > > > +       /* Apply request controls. */
> > > > +       src_buf_req = vb2_src->req_obj.req;
> > > > +       if (src_buf_req)
> > > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > > +       else
> > > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > > +
> > > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > > +       if (ret) {
> > > > +               mtk_v4l2_err(
> > > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > > +                       vb2_src->timestamp, ret, res_chg);
> > > > +               if (ret == -EIO) {
> > > > +                       mutex_lock(&ctx->lock);
> > > > +                       dec_buf_src->error = true;
> > > > +                       mutex_unlock(&ctx->lock);
> > > > +               }
> > > > +       }
> > > > +
> > > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > > +
> > > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > > +
> > > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > > +}
> > > > +
> > > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > > +
> > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > > +                       ctx->id, vb->vb2_queue->type,
> > > > +                       vb->index, vb);
> > > > +
> > > > +       mutex_lock(&ctx->lock);
> > > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > > +       mutex_unlock(&ctx->lock);
> > > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > > +               return;
> > > > +
> > > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > > +               vb->vb2_queue->type, vb->index, vb);
> > > > +
> > > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > > +       if (ctx->state == MTK_STATE_INIT) {
> > > > +               ctx->state = MTK_STATE_HEADER;
> > > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > >
> > > This state thing seems just something to make the rest
> > > of the stateful-based driver happy, right?
> >
> > Correct - if anything we should either use more of the state here
> > (i.e. set the error state when relevant) or move the state entirely into
> > the stateful part of the driver.
> >
> > >
> > > Makes me wonder a bit if just splitting the stateless part to its
> > > own driver, wouldn't make your maintenance easier.
> > >
> > > What's the motivation for sharing the driver?
> >
> > Technically you could do it both ways. Separating the driver would
> > result in some boilerplate code and buffer-management structs
> > duplication (unless we keep the shared part under another module - but
> > in this case we are basically in the same situation as now). Also
> > despite using different userspace-facing ABIs, MT8173 and MT8183
> > follow a similar architecture and a similar firmware interface.
> > Considering these similarities it seems simpler from an architectural
> > point of view to have all the Mediatek codec support under the same
> > driver. It also probably results in less code.
> >
> > That being said, the split can probably be improved as you pointed out
> > with this state variable. But the current split is not too bad IMHO,
> > at least not worse than how the code was originally.
> >
> > >
> > > > +       } else {
> > > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > > +                               ctx->id, ctx->state);
> > > > +       }
> > > > +}
> > > > +
> > > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       bool res_chg;
> > > > +
> > > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > > +}
> > > > +
> > > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > > +};
> > > > +
> > > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl;
> > > > +       unsigned int i;
> > > > +
> > > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > > +       if (ctx->ctrl_hdl.error) {
> > > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > > +               return ctx->ctrl_hdl.error;
> > > > +       }
> > > > +
> > > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > > +                               0, 32, 1, 1);
> > > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > >
> > > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > > to return the DPB size. However, isn't this something userspace already knows?
> >
> > True, but that's also a control the driver is supposed to provide per
> > the spec IIUC.
> >
>
> I don't see the specification requiring this control. TBH, I'd just drop it
> and if needed fix the application to support this as an optional
> control.
>
> In any case, stateless devices should just need 1 output and 1 capture buffer.

Mmm, you're correct indeed, and checking with our user-space it does
not rely on this control for stateless codecs. Moving this control to
the stateful part of the driver.


>
> You might dislike this redundancy, but note that you can also get the
> minimum required buffers through VIDIOC_REQBUFS, where the
> v4l2_requestbuffers.count field is returned back to userspace with the
> number of allocated buffers.
>
> If you request just 1 buffer, and your driver needed 3, you should
> get a 3 there (vb2_ops.queue_setup takes care of that).
>
> Thanks,
> Ezequiel


* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-03-15 22:08         ` Ezequiel Garcia
@ 2021-03-17  3:13           ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-17  3:13 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Yunfei Dong, Tiffany Lin, Andrew-CT Chen, Rob Herring,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 7:08 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Alex,
>
> On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > Hi Ezequiel,
> >
> > On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > > Hi Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Add support for H.264 decoding using the stateless API, as supported by
> > > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > > builders.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/Kconfig                |   1 +
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > > >  5 files changed, 813 insertions(+)
> > > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > >
> > > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > > index fd1831e97b22..c27db5643712 100644
> > > > --- a/drivers/media/platform/Kconfig
> > > > +++ b/drivers/media/platform/Kconfig
> > > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > > >         select V4L2_MEM2MEM_DEV
> > > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > > +       select V4L2_H264
> > > >         help
> > > >           Mediatek video codec driver provides HW capability to
> > > >           encode and decode in a range of video formats on MT8173
> > > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > > >                 vdec/vdec_vp8_if.o \
> > > >                 vdec/vdec_vp9_if.o \
> > > > +               vdec/vdec_h264_req_if.o \
> > > >                 mtk_vcodec_dec_drv.o \
> > > >                 vdec_drv_if.o \
> > > >                 vdec_vpu_if.o \
> > > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > new file mode 100644
> > > > index 000000000000..2fbbfbbcfbec
> > > > --- /dev/null
> > > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > @@ -0,0 +1,807 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +#include <linux/module.h>
> > > > +#include <linux/slab.h>
> > > > +#include <media/v4l2-mem2mem.h>
> > > > +#include <media/v4l2-h264.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > +
> > > > +#include "../vdec_drv_if.h"
> > > > +#include "../mtk_vcodec_util.h"
> > > > +#include "../mtk_vcodec_dec.h"
> > > > +#include "../mtk_vcodec_intr.h"
> > > > +#include "../vdec_vpu_if.h"
> > > > +#include "../vdec_drv_base.h"
> > > > +
> > > > +#define NAL_NON_IDR_SLICE                      0x01
> > > > +#define NAL_IDR_SLICE                          0x05
> > > > +#define NAL_H264_PPS                           0x08
> > >
> > > Not used?
> > >
> > > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > > +
> > >
> > > I believe you may not need the NAL type.
> >
> > True, removed this block of defines.
> >
> > >
> > > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > > +#define MB_UNIT_LEN                            16
> > > > +
> > > > +/* get used parameters for sps/pps */
> > > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > > +#define GET_MTK_VDEC_PARAM(param) \
> > > > +       { dst_param->param = src_param->param; }
> > > > +/* motion vector size (bytes) for every macro block */
> > > > +#define HW_MB_STORE_SZ                         64
> > > > +
> > > > +#define H264_MAX_FB_NUM                                17
> > > > +#define H264_MAX_MV_NUM                                32
> > > > +#define HDR_PARSING_BUF_SZ                     1024
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > > + * @y_dma_addr: Y plane DMA address
> > > > + * @c_dma_addr: CbCr plane DMA address
> > > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > > + * @field: field picture flag
> > > > + */
> > > > +struct mtk_h264_dpb_info {
> > > > +       dma_addr_t y_dma_addr;
> > > > +       dma_addr_t c_dma_addr;
> > > > +       int reference_flag;
> > > > +       int field;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_sps_param  - parameters for sps
> > > > + */
> > > > +struct mtk_h264_sps_param {
> > > > +       unsigned char chroma_format_idc;
> > > > +       unsigned char bit_depth_luma_minus8;
> > > > +       unsigned char bit_depth_chroma_minus8;
> > > > +       unsigned char log2_max_frame_num_minus4;
> > > > +       unsigned char pic_order_cnt_type;
> > > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > > +       unsigned char max_num_ref_frames;
> > > > +       unsigned char separate_colour_plane_flag;
> > > > +       unsigned short pic_width_in_mbs_minus1;
> > > > +       unsigned short pic_height_in_map_units_minus1;
> > > > +       unsigned int max_frame_nums;
> > > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > > +       unsigned char delta_pic_order_always_zero_flag;
> > > > +       unsigned char frame_mbs_only_flag;
> > > > +       unsigned char mb_adaptive_frame_field_flag;
> > > > +       unsigned char direct_8x8_inference_flag;
> > > > +       unsigned char reserved[3];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_pps_param  - parameters for pps
> > > > + */
> > > > +struct mtk_h264_pps_param {
> > > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > > +       unsigned char weighted_bipred_idc;
> > > > +       char pic_init_qp_minus26;
> > > > +       char chroma_qp_index_offset;
> > > > +       char second_chroma_qp_index_offset;
> > > > +       unsigned char entropy_coding_mode_flag;
> > > > +       unsigned char pic_order_present_flag;
> > > > +       unsigned char deblocking_filter_control_present_flag;
> > > > +       unsigned char constrained_intra_pred_flag;
> > > > +       unsigned char weighted_pred_flag;
> > > > +       unsigned char redundant_pic_cnt_present_flag;
> > > > +       unsigned char transform_8x8_mode_flag;
> > > > +       unsigned char scaling_matrix_present_flag;
> > > > +       unsigned char reserved[2];
> > > > +};
> > > > +
> > > > +struct slice_api_h264_scaling_matrix {
> > >
> > > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > > Well I guess you don't want to mix a hardware-specific
> > > thing with the V4L2 API maybe.
> >
> > That's the idea. Although the layouts match and the ABI is now stable,
> > I think this communicates better the fact that this is a firmware
> > structure.
> >
> > >
> > > > +       unsigned char scaling_list_4x4[6][16];
> > > > +       unsigned char scaling_list_8x8[6][64];
> > > > +};
> > > > +
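
If keeping a separate firmware structure, the claim that both layouts
match could be checked at build time. The sketch below is a user-space
illustration only: both demo_* structures are stand-ins declared here,
not the real v4l2_ctrl_h264_scaling_matrix or the structure from the
patch, and in the kernel one would presumably use static_assert() or
BUILD_BUG_ON() against the real types instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the V4L2 control payload (same two arrays, same order). */
struct demo_v4l2_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Stand-in for the firmware-facing structure. */
struct demo_fw_scaling_matrix {
	unsigned char scaling_list_4x4[6][16];
	unsigned char scaling_list_8x8[6][64];
};

/* Fail the build if the two layouts ever drift apart. */
static_assert(sizeof(struct demo_fw_scaling_matrix) ==
	      sizeof(struct demo_v4l2_scaling_matrix),
	      "firmware and V4L2 scaling matrix sizes differ");
static_assert(offsetof(struct demo_fw_scaling_matrix, scaling_list_8x8) ==
	      offsetof(struct demo_v4l2_scaling_matrix, scaling_list_8x8),
	      "scaling_list_8x8 offset differs");
```

With such checks in place the per-field memcpy() stays safe even if one
side of the interface is changed later.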
> > > > +struct slice_h264_dpb_entry {
> > > > +       unsigned long long reference_ts;
> > > > +       unsigned short frame_num;
> > > > +       unsigned short pic_num;
> > > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > > + */
> > > > +struct slice_api_h264_decode_param {
> > > > +       struct slice_h264_dpb_entry dpb[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > For the same reason as above (this being a firmware structure), I
> > think it is clearer to not use the kernel definitions here.
> >
> > >
> > > > +       unsigned short num_slices;
> > > > +       unsigned short nal_ref_idc;
> > > > +       unsigned char ref_pic_list_p0[32];
> > > > +       unsigned char ref_pic_list_b0[32];
> > > > +       unsigned char ref_pic_list_b1[32];
> > >
> > > V4L2_H264_REF_LIST_LEN?
> >
> > Ditto.
> >
> > >
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > > + */
> > > > +struct mtk_h264_dec_slice_param {
> > > > +       struct mtk_h264_sps_param                       sps;
> > > > +       struct mtk_h264_pps_param                       pps;
> > > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > > +       struct slice_api_h264_decode_param              decode_params;
> > > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > Ditto.
> >
> > >
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct h264_fb - h264 decode frame buffer information
> > > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > > + * @poc         : picture order count of frame buffer
> > > > + * @reserved    : for 8 bytes alignment
> > > > + */
> > > > +struct h264_fb {
> > > > +       uint64_t vdec_fb_va;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       int32_t poc;
> > > > +       uint32_t reserved;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_dec_info - decode information
> > > > + * @dpb_sz             : decoding picture buffer size
> > > > + * @resolution_changed  : resolution change happened
> > > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > > + * @cap_num_planes     : number planes of capture buffer
> > > > + * @bs_dma             : Input bit-stream buffer dma address
> > > > + * @y_fb_dma           : Y frame buffer dma address
> > > > + * @c_fb_dma           : C frame buffer dma address
> > > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > > + */
> > > > +struct vdec_h264_dec_info {
> > > > +       uint32_t dpb_sz;
> > > > +       uint32_t resolution_changed;
> > > > +       uint32_t realloc_mv_buf;
> > > > +       uint32_t cap_num_planes;
> > > > +       uint64_t bs_dma;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       uint64_t vdec_fb_va;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > > + *                        between VPU and Host.
> > > > + *                        The memory is allocated by VPU then mapped to Host
> > > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > > + *                        by VPU.
> > > > + *                        AP-W/R : AP is writer/reader on this item
> > > > + *                        VPU-W/R: VPU is writer/reader on this item
> > > > + * @pred_buf_dma : HW working prediction buffer dma address (AP-W, VPU-R)
> > > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > > > + * @dec          : decode information (AP-R, VPU-W)
> > > > + * @pic          : picture information (AP-R, VPU-W)
> > > > + * @crop         : crop information (AP-R, VPU-W)
> > > > + */
> > > > +struct vdec_h264_vsi {
> > > > +       uint64_t pred_buf_dma;
> > > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > > +       struct vdec_h264_dec_info dec;
> > > > +       struct vdec_pic_info pic;
> > > > +       struct v4l2_rect crop;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > > + * @num_nalu : number of NALUs decoded
> > > > + * @ctx      : pointer to mtk_vcodec_ctx
> > > > + * @pred_buf : HW working prediction buffer
> > > > + * @mv_buf   : HW working motion vector buffer
> > > > + * @vpu      : VPU instance
> > > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > > + */
> > > > +struct vdec_h264_slice_inst {
> > > > +       unsigned int num_nalu;
> > > > +       struct mtk_vcodec_ctx *ctx;
> > > > +       struct mtk_vcodec_mem pred_buf;
> > > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > > +       struct vdec_vpu_inst vpu;
> > > > +       struct vdec_h264_vsi vsi_ctx;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > > +
> > > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > > +};
> > > > +
> > > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > > +                                int id)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > > +
> > > > +       return ctrl->p_cur.p;
> > > > +}
> > > > +
> > > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > > +                             struct mtk_h264_dec_slice_param *slice_param)
> > > > +{
> > > > +       struct vb2_queue *vq;
> > > > +       struct vb2_buffer *vb;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > > +       u64 index;
> > > > +
> > > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > > +
> > > > +       for (index = 0; index < 16; index++) {
> > >
> > > Ditto, some macro instead of 16.
> >
> > Changed this to use ARRAY_SIZE() which is appropriate here.
> >
> > >
> > > > +               const struct slice_h264_dpb_entry *dpb;
> > > > +               int vb2_index;
> > > > +
> > > > +               dpb = &slice_param->decode_params.dpb[index];
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > > +               if (vb2_index < 0) {
> > > > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > > > +                               index, dpb->reference_ts);
> > > > +                       continue;
> > > > +               }
> > > > +               /* 1 for short term reference, 2 for long term reference */
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > > > +               else
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > > > +
> > > > +               vb = vq->bufs[vb2_index];
> > > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > > +
> > > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > > +               }
> > > > +       }
> > > > +}
> > > > +
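
To make the ARRAY_SIZE() change and the reference_flag encoding in
get_h264_dpb_list() concrete, here is a stand-alone sketch of the same
loop. Everything prefixed with demo_/DEMO_ is a placeholder declared for
this example (the real bits are V4L2_H264_DPB_ENTRY_FLAG_*), and the
buffer lookup by timestamp is left out on purpose.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder bits; the real ones are V4L2_H264_DPB_ENTRY_FLAG_*. */
#define DEMO_DPB_FLAG_ACTIVE    0x02u
#define DEMO_DPB_FLAG_LONG_TERM 0x04u

struct demo_dpb_entry {
	unsigned int flags;
};

struct demo_dpb_info {
	int reference_flag;	/* 0: unused, 1: short term, 2: long term */
};

struct demo_slice_param {
	struct demo_dpb_entry dpb[16];
	struct demo_dpb_info info[16];
};

static void demo_fill_reference_flags(struct demo_slice_param *p)
{
	size_t i;

	/* The bound follows the array, so resizing dpb[] needs no other edit. */
	for (i = 0; i < DEMO_ARRAY_SIZE(p->dpb); i++) {
		unsigned int flags = p->dpb[i].flags;

		if (!(flags & DEMO_DPB_FLAG_ACTIVE))
			p->info[i].reference_flag = 0;
		else if (flags & DEMO_DPB_FLAG_LONG_TERM)
			p->info[i].reference_flag = 2;
		else
			p->info[i].reference_flag = 1;
	}
}
```

Note that an entry with only the long-term bit set but not the active
bit still ends up with reference_flag 0, as in the patch.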
> > > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > > +}
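
As a side note on GET_MTK_VDEC_FLAG() / GET_MTK_VDEC_PARAM(): the macros
rely on variables named dst_param and src_param being in scope, and the
bare { } body followed by ';' can misbehave after an unbraced if/else.
A possible alternative, shown only as a sketch, passes both pointers
explicitly and wraps the body in do { } while (0). All DEMO_/demo_ names
and flag values below are placeholders, not the real V4L2 definitions.

```c
/* Placeholder bits standing in for V4L2_H264_SPS_FLAG_* values. */
#define DEMO_SPS_FLAG_FRAME_MBS_ONLY		0x10u
#define DEMO_SPS_FLAG_DIRECT_8X8_INFERENCE	0x40u

struct demo_src_sps {
	unsigned int flags;
	unsigned char chroma_format_idc;
};

struct demo_dst_sps {
	unsigned char chroma_format_idc;
	unsigned char frame_mbs_only_flag;
	unsigned char direct_8x8_inference_flag;
};

/* do { } while (0) keeps each macro a single statement. */
#define DEMO_GET_FLAG(dst, src, field, flag) \
	do { (dst)->field = ((src)->flags & (flag)) ? 1 : 0; } while (0)
#define DEMO_GET_PARAM(dst, src, field) \
	do { (dst)->field = (src)->field; } while (0)

static void demo_get_sps(struct demo_dst_sps *dst,
			 const struct demo_src_sps *src)
{
	DEMO_GET_PARAM(dst, src, chroma_format_idc);
	DEMO_GET_FLAG(dst, src, frame_mbs_only_flag,
		      DEMO_SPS_FLAG_FRAME_MBS_ONLY);
	DEMO_GET_FLAG(dst, src, direct_8x8_inference_flag,
		      DEMO_SPS_FLAG_DIRECT_8X8_INFERENCE);
}
```

Each flag is reduced to 0 or 1, which is what the firmware-side
unsigned char fields expect in the patch.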
> > > > +
> > > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > > +}
> > > > +
> > > > +static void
> > > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > > > +{
> > > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > > +
> > > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > > +}
> > > > +
> > > > +static void get_h264_decode_parameters(
> > > > +       struct slice_api_h264_decode_param *dst_params,
> > > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > > +
> > > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > > +               dst_entry->frame_num = src_entry->frame_num;
> > > > +               dst_entry->pic_num = src_entry->pic_num;
> > > > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > > > +               dst_entry->bottom_field_order_cnt =
> > > > +                       src_entry->bottom_field_order_cnt;
> > > > +               dst_entry->flags = src_entry->flags;
> > > > +       }
> > > > +
> > > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > > +       // by the firmware.
> > > > +       dst_params->num_slices = 0;
> > > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > > +       dst_params->flags = src_params->flags;
> > > > +}
> > > > +
> > > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > > +                           const struct v4l2_h264_dpb_entry *b)
> > > > +{
> > > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > > + *
> > > > + * This function is an adaptation of the similarly-named function in
> > > > + * hantro_h264.c.
> > > > + */
> > > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > > +{
> > > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       unsigned int i, j;
> > > > +
> > > > +       /* Disable all entries by default, and mark the ones in use. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > > +                       set_bit(i, in_use);
> > > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > > +       }
> > > > +
> > > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +
> > > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > +                       continue;
> > > > +
> > > > +               /*
> > > > +                * To cut off some comparisons, iterate only on target DPB
> > > > +                * entries that were already used.
> > > > +                */
> > > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +                       cdpb = &dpb[j];
> > > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > > +                               continue;
> > > > +
> > > > +                       *cdpb = *ndpb;
> > > > +                       set_bit(j, used);
> > > > +                       /* Don't reiterate on this one. */
> > > > +                       clear_bit(j, in_use);
> > > > +                       break;
> > > > +               }
> > > > +
> > > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > > +                       set_bit(i, new);
> > > > +       }
> > > > +
> > > > +       /* For entries that could not be matched, use remaining free slots. */
> > > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +               /*
> > > > +                * Both arrays are of the same sizes, so there is no way
> > > > +                * we can end up with no space in target array, unless
> > > > +                * something is buggy.
> > > > +                */
> > > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > > +                       return;
> > > > +
> > > > +               cdpb = &dpb[j];
> > > > +               *cdpb = *ndpb;
> > > > +               set_bit(j, used);
> > > > +       }
> > > > +}
> > > > +
> > > > +/*
> > > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > > + */
> > > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > > +{
> > > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > > +}
> > > > +
> > > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > > +       int i;
> > > > +
> > > > +       update_dpb(dec_params, inst->dpb);
> > > > +
> > > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > > +                                  inst->dpb);
> > > > +       get_h264_dpb_list(inst, slice_param);
> > > > +
> > > > +       /* Prepare the fields for our reference lists */
> > > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > > +       /* Build the reference lists */
> > > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > > +                                      inst->dpb);
> > > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > > +       /* Adapt the built lists to the firmware's expectations */
> > > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > > +
> > > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > > +}
> > > > +
> > > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > > +{
> > > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > > +
> > > > +       return HW_MB_STORE_SZ * unit_size;
> > > > +}
> > > > +
> > > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int err = 0;
> > > > +
> > > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > > +               return err;
> > > > +       }
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > > +       mem = &inst->pred_buf;
> > > > +       if (mem->va)
> > > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +}
> > > > +
> > > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > > +       struct vdec_pic_info *pic)
> > > > +{
> > > > +       int i;
> > > > +       int err;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > > +
> > > > +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +               mem->size = buf_sz;
> > > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > > +               if (err) {
> > > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > > +                       return err;
> > > > +               }
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int i;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > > +                        struct vdec_pic_info *pic)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > > +
> > > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > > +
> > > > +       pic = &ctx->picinfo;
> > > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > > +               ctx->picinfo.fb_sz[1]);
> > > > +
> > > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > > +                       ctx->last_decoded_picinfo.pic_w,
> > > > +                       ctx->last_decoded_picinfo.pic_h,
> > > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > > +       struct v4l2_rect *cr)
> > > > +{
> > > > +       cr->left = inst->vsi_ctx.crop.left;
> > > > +       cr->top = inst->vsi_ctx.crop.top;
> > > > +       cr->width = inst->vsi_ctx.crop.width;
> > > > +       cr->height = inst->vsi_ctx.crop.height;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > > +                        cr->left, cr->top, cr->width, cr->height);
> > > > +}
> > > > +
> > > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > > +       unsigned int *dpb_sz)
> > > > +{
> > > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > > +       int err;
> > > > +
> > > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > > +       if (!inst)
> > > > +               return -ENOMEM;
> > > > +
> > > > +       inst->ctx = ctx;
> > > > +
> > > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > > +       inst->vpu.ctx = ctx;
> > > > +
> > > > +       err = vpu_dec_init(&inst->vpu);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > > +               goto error_free_inst;
> > > > +       }
> > > > +
> > > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +       err = allocate_predication_buf(inst);
> > > > +       if (err)
> > > > +               goto error_deinit;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > > > +               sizeof(struct mtk_h264_sps_param),
> > > > +               sizeof(struct mtk_h264_pps_param),
> > > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > > +               sizeof(struct mtk_h264_dpb_info));
> > > > +
> > > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > > +
> > > > +       ctx->drv_handle = inst;
> > > > +       return 0;
> > > > +
> > > > +error_deinit:
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +
> > > > +error_free_inst:
> > > > +       kfree(inst);
> > > > +       return err;
> > > > +}
> > > > +
> > > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +       free_predication_buf(inst);
> > > > +       free_mv_buf(inst);
> > > > +
> > > > +       kfree(inst);
> > > > +}
> > > > +
> > > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > > +{
> > > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > > +               return 3;
> > > > +
> > > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > > +           data[3] == 1)
> > > > +               return 4;
> > > > +
> > > > +       return -1;
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > > +       struct mtk_video_dec_buf *src_buf_info;
> > > > +       int nal_start_idx = 0, err = 0;
> > > > +       uint32_t nal_type, data[2];
> > > > +       unsigned char *buf;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > > +
> > > > +       /* bs NULL means flush decoder */
> > > > +       if (bs == NULL)
> > > > +               return vpu_dec_reset(vpu);
> > > > +
> > > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > > +
> > > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > > +
> > > > +       buf = (unsigned char *)bs->va;
> > >
> > > I can be completely wrong, but it would seem here
> > > is where the CPU mapping is used.
> >
> > I think you're right. :)
> >
> > >
> > > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > > +       if (nal_start_idx < 0)
> > > > +               goto err_free_fb_out;
> > > > +
> > > > +       data[0] = bs->size;
> > > > +       data[1] = buf[nal_start_idx];
> > > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > >
> > > Which seems to be used to parse the NAL type. But shouldn't
> > > you expect only VCL NALUs here?
> > >
> > > I.e. you only get IDR or non-IDR frames, marked with
> > > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> >
> > Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> > the test using it below) and the driver is just as happy.
> >
> > >
> > > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > > +                        nal_type);
> > > > +
> > > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > > +
> > > > +       get_vdec_decode_parameters(inst);
> > > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > > +       if (*res_chg) {
> > > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > > +                       if (err)
> > > > +                               goto err_free_fb_out;
> > > > +               }
> > > > +               *res_chg = false;
> > > > +       }
> > > > +
> > > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > > +       err = vpu_dec_start(vpu, data, 2);
> > >
> > > Then it seems these two values are passed to the firmware. Maybe you
> > > could test if that can be derived without the CPU mapping.
> > > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> >
> > This one is a bit trickier. It seems the NAL type is passed as part of
> > the decode request to the firmware. Which should be absolutely not
> > needed since the firmware can check this from the buffer itself. Just
> > for fun I have tried setting this parameter unconditionally to 0x1
> > (non-IDR picture) and all I get is green frames with seemingly random
> > garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> > a different kind of garbage, and every once in a while a properly rendered
> > frame (presumably when it is *really* an IDR frame).
> >
> > So, mmm, I'm afraid we cannot decode properly without this information
> > and thus without the mapping, unless Yunfei can tell us of a way to
> > achieve this. Yunfei, do you have any idea?
> >
>
> Sorry, I wasn't clear with my suggestion. I didn't mean to suggest dropping
> the information passed to the firmware, just to stop deriving it from the buffers.
>
> Along these lines:
>
>         data[0] = bs->size;
> -       data[1] = buf[nal_start_idx];
> -       nal_type = NAL_TYPE(buf[nal_start_idx]);
> +       data[1] = nal_type = (dec_param->flags &
> +                             V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) ?
> +               NAL_IDR_SLICE : NAL_NON_IDR_SLICE;
>
> (or slice type as Nicolas was suggesting)

Ahhh I see. Great idea!

When trying it, I realized that this data[1] byte also contains the
nal_ref_idc on top of the NAL type. Not including it in the value
would result in the decoder misbehaving. But fortunately that
parameter is also passed by user-space, so using that and the IDR_PIC
flag I can reconstruct the information accurately, and the decoder is
happy. Nice!
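
For illustration, the reconstruction described above can be sketched as
follows. This is a standalone approximation and not the driver code:
example_nal_header() and EXAMPLE_FLAG_IDR_PIC are hypothetical names, the
latter standing in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC. The bit layout
is the one of the H.264 NAL unit header (1-bit forbidden_zero_bit, 2-bit
nal_ref_idc, 5-bit nal_unit_type).

```c
#include <stdint.h>

/* NAL unit types for coded slices, as defined by the H.264 specification. */
#define NAL_NON_IDR_SLICE	0x01
#define NAL_IDR_SLICE		0x05

/* Hypothetical stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC. */
#define EXAMPLE_FLAG_IDR_PIC	0x01

/*
 * Rebuild the NAL unit header byte from userspace-provided parameters
 * instead of reading it from the bitstream buffer:
 *   bit 7     forbidden_zero_bit (always 0)
 *   bits 6-5  nal_ref_idc
 *   bits 4-0  nal_unit_type
 */
uint8_t example_nal_header(uint32_t flags, uint16_t nal_ref_idc)
{
	uint8_t nal_type = (flags & EXAMPLE_FLAG_IDR_PIC) ?
			   NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	/* nal_ref_idc is a 2-bit field, so mask it before shifting. */
	return (uint8_t)(((nal_ref_idc & 0x3) << 5) | nal_type);
}
```

With this sketch, an IDR slice with nal_ref_idc = 3 yields 0x65, and a
non-reference, non-IDR slice yields 0x01.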

This means I can also remove the CPU mappings done in the previous
patch. I suspect there is room for further improvement since I see the
stateful part of the driver does the same thing (and submits the same
information to the firmware), only there we cannot get that
information from userspace-provided parameters.

But at least for the stateless part we don't need the CPU mapping at
all, so I've removed it.



>
> This should allow you to not allocate your buffers coherently,
> as you are doing now (i.e. not requiring a CPU mapping).
>
> I don't know what the performance gains would be on your platform,
> but they might be worth it (note that this depends on the platform
> support for non-coherent DMA mappings).
>
> In any case, this is just a suggestion, mostly because a CPU mapping
> means the kernel is parsing the buffers, which is a bit unexpected.
>
> Regards,
> Ezequiel


* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-17  3:13           ` Alexandre Courbot
From: Alexandre Courbot @ 2021-03-17  3:13 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Yunfei Dong, Tiffany Lin, Andrew-CT Chen, Rob Herring,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 7:08 AM Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
>
> Hi Alex,
>
> On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org> wrote:
> >
> > Hi Ezequiel,
> >
> > On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > > Hi Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org> wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Add support for H.264 decoding using the stateless API, as supported by
> > > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > > builders.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/Kconfig                |   1 +
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > > >  5 files changed, 813 insertions(+)
> > > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > >
> > > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > > index fd1831e97b22..c27db5643712 100644
> > > > --- a/drivers/media/platform/Kconfig
> > > > +++ b/drivers/media/platform/Kconfig
> > > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > > >         select V4L2_MEM2MEM_DEV
> > > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > > +       select V4L2_H264
> > > >         help
> > > >           Mediatek video codec driver provides HW capability to
> > > >           encode and decode in a range of video formats on MT8173
> > > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > > >                 vdec/vdec_vp8_if.o \
> > > >                 vdec/vdec_vp9_if.o \
> > > > +               vdec/vdec_h264_req_if.o \
> > > >                 mtk_vcodec_dec_drv.o \
> > > >                 vdec_drv_if.o \
> > > >                 vdec_vpu_if.o \
> > > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > new file mode 100644
> > > > index 000000000000..2fbbfbbcfbec
> > > > --- /dev/null
> > > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > @@ -0,0 +1,807 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +#include <linux/module.h>
> > > > +#include <linux/slab.h>
> > > > +#include <media/v4l2-mem2mem.h>
> > > > +#include <media/v4l2-h264.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > +
> > > > +#include "../vdec_drv_if.h"
> > > > +#include "../mtk_vcodec_util.h"
> > > > +#include "../mtk_vcodec_dec.h"
> > > > +#include "../mtk_vcodec_intr.h"
> > > > +#include "../vdec_vpu_if.h"
> > > > +#include "../vdec_drv_base.h"
> > > > +
> > > > +#define NAL_NON_IDR_SLICE                      0x01
> > > > +#define NAL_IDR_SLICE                          0x05
> > > > +#define NAL_H264_PPS                           0x08
> > >
> > > Not used?
> > >
> > > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > > +
> > >
> > > I believe you may not need the NAL type.
> >
> > True, removed this block of defines.
> >
> > >
> > > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > > +#define MB_UNIT_LEN                            16
> > > > +
> > > > +/* get used parameters for sps/pps */
> > > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > > +#define GET_MTK_VDEC_PARAM(param) \
> > > > +       { dst_param->param = src_param->param; }
> > > > +/* motion vector size (bytes) for every macro block */
> > > > +#define HW_MB_STORE_SZ                         64
> > > > +
> > > > +#define H264_MAX_FB_NUM                                17
> > > > +#define H264_MAX_MV_NUM                                32
> > > > +#define HDR_PARSING_BUF_SZ                     1024
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > > + * @y_dma_addr: Y bitstream physical address
> > > > + * @c_dma_addr: CbCr bitstream physical address
> > > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > > + * @field: field picture flag
> > > > + */
> > > > +struct mtk_h264_dpb_info {
> > > > +       dma_addr_t y_dma_addr;
> > > > +       dma_addr_t c_dma_addr;
> > > > +       int reference_flag;
> > > > +       int field;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_sps_param  - parameters for sps
> > > > + */
> > > > +struct mtk_h264_sps_param {
> > > > +       unsigned char chroma_format_idc;
> > > > +       unsigned char bit_depth_luma_minus8;
> > > > +       unsigned char bit_depth_chroma_minus8;
> > > > +       unsigned char log2_max_frame_num_minus4;
> > > > +       unsigned char pic_order_cnt_type;
> > > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > > +       unsigned char max_num_ref_frames;
> > > > +       unsigned char separate_colour_plane_flag;
> > > > +       unsigned short pic_width_in_mbs_minus1;
> > > > +       unsigned short pic_height_in_map_units_minus1;
> > > > +       unsigned int max_frame_nums;
> > > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > > +       unsigned char delta_pic_order_always_zero_flag;
> > > > +       unsigned char frame_mbs_only_flag;
> > > > +       unsigned char mb_adaptive_frame_field_flag;
> > > > +       unsigned char direct_8x8_inference_flag;
> > > > +       unsigned char reserved[3];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_pps_param  - parameters for pps
> > > > + */
> > > > +struct mtk_h264_pps_param {
> > > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > > +       unsigned char weighted_bipred_idc;
> > > > +       char pic_init_qp_minus26;
> > > > +       char chroma_qp_index_offset;
> > > > +       char second_chroma_qp_index_offset;
> > > > +       unsigned char entropy_coding_mode_flag;
> > > > +       unsigned char pic_order_present_flag;
> > > > +       unsigned char deblocking_filter_control_present_flag;
> > > > +       unsigned char constrained_intra_pred_flag;
> > > > +       unsigned char weighted_pred_flag;
> > > > +       unsigned char redundant_pic_cnt_present_flag;
> > > > +       unsigned char transform_8x8_mode_flag;
> > > > +       unsigned char scaling_matrix_present_flag;
> > > > +       unsigned char reserved[2];
> > > > +};
> > > > +
> > > > +struct slice_api_h264_scaling_matrix {
> > >
> > > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > > Well I guess you don't want to mix a hardware-specific
> > > thing with the V4L2 API maybe.
> >
> > That's the idea. Although the layouts match and the ABI is now stable,
> > I think this communicates better the fact that this is a firmware
> > structure.
> >
> > >
> > > > +       unsigned char scaling_list_4x4[6][16];
> > > > +       unsigned char scaling_list_8x8[6][64];
> > > > +};
> > > > +
> > > > +struct slice_h264_dpb_entry {
> > > > +       unsigned long long reference_ts;
> > > > +       unsigned short frame_num;
> > > > +       unsigned short pic_num;
> > > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > > + */
> > > > +struct slice_api_h264_decode_param {
> > > > +       struct slice_h264_dpb_entry dpb[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > For the same reason as above (this being a firmware structure), I
> > think it is clearer to not use the kernel definitions here.
> >
> > >
> > > > +       unsigned short num_slices;
> > > > +       unsigned short nal_ref_idc;
> > > > +       unsigned char ref_pic_list_p0[32];
> > > > +       unsigned char ref_pic_list_b0[32];
> > > > +       unsigned char ref_pic_list_b1[32];
> > >
> > > V4L2_H264_REF_LIST_LEN?
> >
> > Ditto.
> >
> > >
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > > + */
> > > > +struct mtk_h264_dec_slice_param {
> > > > +       struct mtk_h264_sps_param                       sps;
> > > > +       struct mtk_h264_pps_param                       pps;
> > > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > > +       struct slice_api_h264_decode_param              decode_params;
> > > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > Ditto.
> >
> > >
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct h264_fb - h264 decode frame buffer information
> > > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > > + * @poc         : picture order count of frame buffer
> > > > + * @reserved    : for 8 bytes alignment
> > > > + */
> > > > +struct h264_fb {
> > > > +       uint64_t vdec_fb_va;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       int32_t poc;
> > > > +       uint32_t reserved;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_dec_info - decode information
> > > > + * @dpb_sz             : decoding picture buffer size
> > > > > + * @resolution_changed  : resolution change happened
> > > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > > + * @cap_num_planes     : number planes of capture buffer
> > > > + * @bs_dma             : Input bit-stream buffer dma address
> > > > + * @y_fb_dma           : Y frame buffer dma address
> > > > + * @c_fb_dma           : C frame buffer dma address
> > > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > > + */
> > > > +struct vdec_h264_dec_info {
> > > > +       uint32_t dpb_sz;
> > > > +       uint32_t resolution_changed;
> > > > +       uint32_t realloc_mv_buf;
> > > > +       uint32_t cap_num_planes;
> > > > +       uint64_t bs_dma;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       uint64_t vdec_fb_va;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > > + *                        between VPU and Host.
> > > > + *                        The memory is allocated by VPU then mapping to Host
> > > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > > + *                        by VPU.
> > > > + *                        AP-W/R : AP is writer/reader on this item
> > > > > + *                        VPU-W/R: VPU is writer/reader on this item
> > > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-R)
> > > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W, VPU-R)
> > > > + * @dec          : decode information (AP-R, VPU-W)
> > > > + * @pic          : picture information (AP-R, VPU-W)
> > > > + * @crop         : crop information (AP-R, VPU-W)
> > > > + */
> > > > +struct vdec_h264_vsi {
> > > > +       uint64_t pred_buf_dma;
> > > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > > +       struct vdec_h264_dec_info dec;
> > > > +       struct vdec_pic_info pic;
> > > > +       struct v4l2_rect crop;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > > + * @num_nalu : how many nalus be decoded
> > > > + * @ctx      : point to mtk_vcodec_ctx
> > > > + * @pred_buf : HW working predication buffer
> > > > + * @mv_buf   : HW working motion vector buffer
> > > > + * @vpu      : VPU instance
> > > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > > + */
> > > > +struct vdec_h264_slice_inst {
> > > > +       unsigned int num_nalu;
> > > > +       struct mtk_vcodec_ctx *ctx;
> > > > +       struct mtk_vcodec_mem pred_buf;
> > > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > > +       struct vdec_vpu_inst vpu;
> > > > +       struct vdec_h264_vsi vsi_ctx;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > > +
> > > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > > +};
> > > > +
> > > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > > +                                int id)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > > +
> > > > +       return ctrl->p_cur.p;
> > > > +}
> > > > +
> > > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > > +                             struct mtk_h264_dec_slice_param *slice_param)
> > > > +{
> > > > +       struct vb2_queue *vq;
> > > > +       struct vb2_buffer *vb;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > > +       u64 index;
> > > > +
> > > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > > +
> > > > +       for (index = 0; index < 16; index++) {
> > >
> > > Ditto, some macro instead of 16.
> >
> > Changed this to use ARRAY_SIZE() which is appropriate here.
> >
> > >
> > > > +               const struct slice_h264_dpb_entry *dpb;
> > > > +               int vb2_index;
> > > > +
> > > > +               dpb = &slice_param->decode_params.dpb[index];
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 0;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > > +               if (vb2_index < 0) {
> > > > +                       mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> > > > +                               index, dpb->reference_ts);
> > > > +                       continue;
> > > > +               }
> > > > +               /* 1 for short term reference, 2 for long term reference */
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 1;
> > > > +               else
> > > > +                       slice_param->h264_dpb_info[index].reference_flag = 2;
> > > > +
> > > > +               vb = vq->bufs[vb2_index];
> > > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> > > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > > +
> > > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > > +               }
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > > +}
> > > > +
> > > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > > +}
> > > > +
> > > > +static void
> > > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > > +                       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> > > > +{
> > > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > > +
> > > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > > +}
> > > > +
> > > > +static void get_h264_decode_parameters(
> > > > +       struct slice_api_h264_decode_param *dst_params,
> > > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> > > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > > +
> > > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > > +               dst_entry->frame_num = src_entry->frame_num;
> > > > +               dst_entry->pic_num = src_entry->pic_num;
> > > > +               dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> > > > +               dst_entry->bottom_field_order_cnt =
> > > > +                       src_entry->bottom_field_order_cnt;
> > > > +               dst_entry->flags = src_entry->flags;
> > > > +       }
> > > > +
> > > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > > +       // by the firmware.
> > > > +       dst_params->num_slices = 0;
> > > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > > +       dst_params->flags = src_params->flags;
> > > > +}
> > > > +
> > > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > > +                           const struct v4l2_h264_dpb_entry *b)
> > > > +{
> > > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > > + *
> > > > + * This function is an adaptation of the similarly-named function in
> > > > + * hantro_h264.c.
> > > > + */
> > > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > > +{
> > > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       unsigned int i, j;
> > > > +
> > > > +       /* Disable all entries by default, and mark the ones in use. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > > +                       set_bit(i, in_use);
> > > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > > +       }
> > > > +
> > > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +
> > > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > +                       continue;
> > > > +
> > > > +               /*
> > > > > +                * To cut off some comparisons, iterate only on target DPB
> > > > > +                * entries which were already used.
> > > > +                */
> > > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +                       cdpb = &dpb[j];
> > > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > > +                               continue;
> > > > +
> > > > +                       *cdpb = *ndpb;
> > > > +                       set_bit(j, used);
> > > > +                       /* Don't reiterate on this one. */
> > > > +                       clear_bit(j, in_use);
> > > > +                       break;
> > > > +               }
> > > > +
> > > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > > +                       set_bit(i, new);
> > > > +       }
> > > > +
> > > > +       /* For entries that could not be matched, use remaining free slots. */
> > > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +               /*
> > > > +                * Both arrays are of the same sizes, so there is no way
> > > > +                * we can end up with no space in target array, unless
> > > > +                * something is buggy.
> > > > +                */
> > > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > > +                       return;
> > > > +
> > > > +               cdpb = &dpb[j];
> > > > +               *cdpb = *ndpb;
> > > > +               set_bit(j, used);
> > > > +       }
> > > > +}
> > > > +
> > > > +/*
> > > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > > + */
> > > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > > +{
> > > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > > +}
> > > > +
> > > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > > +       int i;
> > > > +
> > > > +       update_dpb(dec_params, inst->dpb);
> > > > +
> > > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > > +                                  inst->dpb);
> > > > +       get_h264_dpb_list(inst, slice_param);
> > > > +
> > > > +       /* Prepare the fields for our reference lists */
> > > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > > +       /* Build the reference lists */
> > > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > > +                                      inst->dpb);
> > > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > > +       /* Adapt the built lists to the firmware's expectations */
> > > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > > +
> > > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > > +}
> > > > +
> > > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > > +{
> > > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > > +
> > > > +       return HW_MB_STORE_SZ * unit_size;
> > > > +}
> > > > +
> > > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int err = 0;
> > > > +
> > > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > > +               return err;
> > > > +       }
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > > +       mem = &inst->pred_buf;
> > > > +       if (mem->va)
> > > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +}
> > > > +
> > > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > > +       struct vdec_pic_info *pic)
> > > > +{
> > > > +       int i;
> > > > +       int err;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > > +
> > > > +       mtk_v4l2_debug(3, "size = 0x%lx", buf_sz);
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +               mem->size = buf_sz;
> > > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > > +               if (err) {
> > > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > > +                       return err;
> > > > +               }
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int i;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > > +                        struct vdec_pic_info *pic)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > > +
> > > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > > +
> > > > +       pic = &ctx->picinfo;
> > > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > > +               ctx->picinfo.fb_sz[1]);
> > > > +
> > > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > > +                       ctx->last_decoded_picinfo.pic_w,
> > > > +                       ctx->last_decoded_picinfo.pic_h,
> > > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > > +       struct v4l2_rect *cr)
> > > > +{
> > > > +       cr->left = inst->vsi_ctx.crop.left;
> > > > +       cr->top = inst->vsi_ctx.crop.top;
> > > > +       cr->width = inst->vsi_ctx.crop.width;
> > > > +       cr->height = inst->vsi_ctx.crop.height;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > > +                        cr->left, cr->top, cr->width, cr->height);
> > > > +}
> > > > +
> > > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > > +       unsigned int *dpb_sz)
> > > > +{
> > > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > > +       int err;
> > > > +
> > > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > > +       if (!inst)
> > > > +               return -ENOMEM;
> > > > +
> > > > +       inst->ctx = ctx;
> > > > +
> > > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > > +       inst->vpu.ctx = ctx;
> > > > +
> > > > +       err = vpu_dec_init(&inst->vpu);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > > +               goto error_free_inst;
> > > > +       }
> > > > +
> > > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +       err = allocate_predication_buf(inst);
> > > > +       if (err)
> > > > +               goto error_deinit;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "struct size = %d,%d,%d,%d\n",
> > > > +               sizeof(struct mtk_h264_sps_param),
> > > > +               sizeof(struct mtk_h264_pps_param),
> > > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > > +               sizeof(struct mtk_h264_dpb_info));
> > > > +
> > > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > > +
> > > > +       ctx->drv_handle = inst;
> > > > +       return 0;
> > > > +
> > > > +error_deinit:
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +
> > > > +error_free_inst:
> > > > +       kfree(inst);
> > > > +       return err;
> > > > +}
> > > > +
> > > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +       free_predication_buf(inst);
> > > > +       free_mv_buf(inst);
> > > > +
> > > > +       kfree(inst);
> > > > +}
> > > > +
> > > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > > +{
> > > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > > +               return 3;
> > > > +
> > > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > > +           data[3] == 1)
> > > > +               return 4;
> > > > +
> > > > +       return -1;
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > > +       struct mtk_video_dec_buf *src_buf_info;
> > > > +       int nal_start_idx = 0, err = 0;
> > > > +       uint32_t nal_type, data[2];
> > > > +       unsigned char *buf;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > > +
> > > > +       /* bs NULL means flush decoder */
> > > > +       if (bs == NULL)
> > > > +               return vpu_dec_reset(vpu);
> > > > +
> > > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > > +
> > > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > > +
> > > > +       buf = (unsigned char *)bs->va;
> > >
> > > I can be completely wrong, but it would seem here
> > > is where the CPU mapping is used.
> >
> > I think you're right. :)
> >
> > >
> > > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > > +       if (nal_start_idx < 0)
> > > > +               goto err_free_fb_out;
> > > > +
> > > > +       data[0] = bs->size;
> > > > +       data[1] = buf[nal_start_idx];
> > > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > >
> > > > Which seems to be used to parse the NAL type. But shouldn't
> > > > you expect only VCL NALUs here?
> > >
> > > I.e. you only get IDR or non-IDR frames, marked with
> > > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> >
> > Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> > the test using it below) and the driver is just as happy.
> >
> > >
> > > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > > +                        nal_type);
> > > > +
> > > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > > +
> > > > +       get_vdec_decode_parameters(inst);
> > > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > > +       if (*res_chg) {
> > > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > > +                       if (err)
> > > > +                               goto err_free_fb_out;
> > > > +               }
> > > > +               *res_chg = false;
> > > > +       }
> > > > +
> > > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > > +       err = vpu_dec_start(vpu, data, 2);
> > >
> > > > Then it seems these 2 bytes are passed to the firmware. Maybe you
> > > could test if that can be derived without the CPU mapping.
> > > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> >
> > This one is a bit trickier. It seems the NAL type is passed as part of
> > the decode request to the firmware. Which should be absolutely not
> > needed since the firmware can check this from the buffer itself. Just
> > for fun I have tried setting this parameter unconditionally to 0x1
> > (non-IDR picture) and all I get is green frames with seemingly random
> > garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> > > a different kind of garbage, and once in a while a properly rendered
> > frame (presumably when it is *really* an IDR frame).
> >
> > So, mmm, I'm afraid we cannot decode properly without this information
> > and thus without the mapping, unless Yunfei can tell us of a way to
> > achieve this. Yunfei, do you have any idea?
> >
>
> Sorry, I wasn't clear with my suggestion. I didn't mean that you should stop
> passing the information to the firmware, just that you should stop deriving
> it from the buffers.
>
> Along these lines:
>
>         data[0] = bs->size;
> -       data[1] = buf[nal_start_idx];
> -       nal_type = NAL_TYPE(buf[nal_start_idx]);
> +       data[1] = nal_type = (dec_param->flags &
> +                             V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC) ?
> +               NAL_IDR_SLICE : NAL_NON_IDR_SLICE;
>
> (or slice type as Nicolas was suggesting)

Ahhh I see. Great idea!

When trying it, I realized that this data[1] byte also contains the
nal_ref_idc on top of the NAL type. Not including it in the value
would result in the decoder misbehaving. But fortunately that
parameter is also passed by user-space, so using that and the IDR_PIC
flag I can reconstruct the information accurately, and the decoder is
happy. Nice!
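
For illustration, here is a minimal sketch of that reconstruction. It
assumes data[1] follows the standard H.264 NAL unit header layout
(1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type).
The helper name build_nal_header() is made up for this example and is
not necessarily what the driver ends up using:

```c
#include <stdbool.h>
#include <stdint.h>

#define NAL_NON_IDR_SLICE 0x01
#define NAL_IDR_SLICE     0x05

/*
 * Hypothetical helper (not from the patch): rebuild the first byte of a
 * slice NAL unit from values that userspace already provides through the
 * stateless decode parameters.
 *
 * Bit layout of the byte, MSB first:
 *   forbidden_zero_bit (1) | nal_ref_idc (2) | nal_unit_type (5)
 */
static uint8_t build_nal_header(unsigned int nal_ref_idc, bool is_idr)
{
	uint8_t nal_type = is_idr ? NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	/* nal_ref_idc is a 2-bit field, so mask it before shifting. */
	return (uint8_t)(((nal_ref_idc & 0x3) << 5) | nal_type);
}
```

With nal_ref_idc = 3 and the IDR_PIC flag set this gives 0x65, which is
the usual first byte of an IDR slice NAL unit.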

This means I can also remove the CPU mappings done in the previous
patch. I suspect there is room for further improvement, since the
stateful part of the driver does the same thing (and submits the same
information to the firmware). The difference is that in the stateful
case we cannot get that information from userspace-provided parameters.

But at least for the stateless part we don't need the CPU mapping at
all, so I've removed it.



>
> This should allow you to not allocate your buffers coherently,
> as you are doing now (i.e. not requiring a CPU mapping).
>
> I don't know what the performance gains would be on your platform,
> but they might be worth it (note that this depends on the platform
> support for non-coherent DMA mappings).
>
> In any case, this is just a suggestion, mostly because a CPU mapping
> means the kernel is parsing the buffers, which is a bit unexpected.
>
> Regards,
> Ezequiel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
  2021-03-15 15:21         ` Nicolas Dufresne
@ 2021-03-17  3:14           ` Alexandre Courbot
  -1 siblings, 0 replies; 56+ messages in thread
From: Alexandre Courbot @ 2021-03-17  3:14 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Ezequiel Garcia, Yunfei Dong, Tiffany Lin, Andrew-CT Chen,
	Rob Herring, Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 12:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le lundi 15 mars 2021 à 20:28 +0900, Alexandre Courbot a écrit :
> > Hi Ezequiel,
> >
> > On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > >  Hi Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org>
> > > wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Add support for H.264 decoding using the stateless API, as supported by
> > > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > > builders.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/Kconfig                |   1 +
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > > >  5 files changed, 813 insertions(+)
> > > > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > >
> > > > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > > index fd1831e97b22..c27db5643712 100644
> > > > --- a/drivers/media/platform/Kconfig
> > > > +++ b/drivers/media/platform/Kconfig
> > > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > > >         select V4L2_MEM2MEM_DEV
> > > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > > +       select V4L2_H264
> > > >         help
> > > >           Mediatek video codec driver provides HW capability to
> > > >           encode and decode in a range of video formats on MT8173
> > > > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > > >                 vdec/vdec_vp8_if.o \
> > > >                 vdec/vdec_vp9_if.o \
> > > > +               vdec/vdec_h264_req_if.o \
> > > >                 mtk_vcodec_dec_drv.o \
> > > >                 vdec_drv_if.o \
> > > >                 vdec_vpu_if.o \
> > > > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > new file mode 100644
> > > > index 000000000000..2fbbfbbcfbec
> > > > --- /dev/null
> > > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > @@ -0,0 +1,807 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +#include <linux/module.h>
> > > > +#include <linux/slab.h>
> > > > +#include <media/v4l2-mem2mem.h>
> > > > +#include <media/v4l2-h264.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > +
> > > > +#include "../vdec_drv_if.h"
> > > > +#include "../mtk_vcodec_util.h"
> > > > +#include "../mtk_vcodec_dec.h"
> > > > +#include "../mtk_vcodec_intr.h"
> > > > +#include "../vdec_vpu_if.h"
> > > > +#include "../vdec_drv_base.h"
> > > > +
> > > > +#define NAL_NON_IDR_SLICE                      0x01
> > > > +#define NAL_IDR_SLICE                          0x05
> > > > +#define NAL_H264_PPS                           0x08
> > >
> > > Not used?
> > >
> > > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > > +
> > >
> > > I believe you may not need the NAL type.
> >
> > True, removed this block of defines.
> >
> > >
> > > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > > +#define MB_UNIT_LEN                            16
> > > > +
> > > > +/* get used parameters for sps/pps */
> > > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > > +#define GET_MTK_VDEC_PARAM(param) \
> > > > +       { dst_param->param = src_param->param; }
> > > > +/* motion vector size (bytes) for every macro block */
> > > > +#define HW_MB_STORE_SZ                         64
> > > > +
> > > > +#define H264_MAX_FB_NUM                                17
> > > > +#define H264_MAX_MV_NUM                                32
> > > > +#define HDR_PARSING_BUF_SZ                     1024
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > > + * @y_dma_addr: Y bitstream physical address
> > > > + * @c_dma_addr: CbCr bitstream physical address
> > > > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > > + * @field: field picture flag
> > > > + */
> > > > +struct mtk_h264_dpb_info {
> > > > +       dma_addr_t y_dma_addr;
> > > > +       dma_addr_t c_dma_addr;
> > > > +       int reference_flag;
> > > > +       int field;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_sps_param  - parameters for sps
> > > > + */
> > > > +struct mtk_h264_sps_param {
> > > > +       unsigned char chroma_format_idc;
> > > > +       unsigned char bit_depth_luma_minus8;
> > > > +       unsigned char bit_depth_chroma_minus8;
> > > > +       unsigned char log2_max_frame_num_minus4;
> > > > +       unsigned char pic_order_cnt_type;
> > > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > > +       unsigned char max_num_ref_frames;
> > > > +       unsigned char separate_colour_plane_flag;
> > > > +       unsigned short pic_width_in_mbs_minus1;
> > > > +       unsigned short pic_height_in_map_units_minus1;
> > > > +       unsigned int max_frame_nums;
> > > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > > +       unsigned char delta_pic_order_always_zero_flag;
> > > > +       unsigned char frame_mbs_only_flag;
> > > > +       unsigned char mb_adaptive_frame_field_flag;
> > > > +       unsigned char direct_8x8_inference_flag;
> > > > +       unsigned char reserved[3];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_pps_param  - parameters for pps
> > > > + */
> > > > +struct mtk_h264_pps_param {
> > > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > > +       unsigned char weighted_bipred_idc;
> > > > +       char pic_init_qp_minus26;
> > > > +       char chroma_qp_index_offset;
> > > > +       char second_chroma_qp_index_offset;
> > > > +       unsigned char entropy_coding_mode_flag;
> > > > +       unsigned char pic_order_present_flag;
> > > > +       unsigned char deblocking_filter_control_present_flag;
> > > > +       unsigned char constrained_intra_pred_flag;
> > > > +       unsigned char weighted_pred_flag;
> > > > +       unsigned char redundant_pic_cnt_present_flag;
> > > > +       unsigned char transform_8x8_mode_flag;
> > > > +       unsigned char scaling_matrix_present_flag;
> > > > +       unsigned char reserved[2];
> > > > +};
> > > > +
> > > > +struct slice_api_h264_scaling_matrix {
> > >
> > > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > > Well I guess you don't want to mix a hardware-specific
> > > thing with the V4L2 API maybe.
> >
> > > That's the idea. Although the layouts match and the ABI is now stable,
> > > I think this better communicates the fact that this is a firmware
> > > structure.
> >
> > >
> > > > +       unsigned char scaling_list_4x4[6][16];
> > > > +       unsigned char scaling_list_8x8[6][64];
> > > > +};
> > > > +
> > > > +struct slice_h264_dpb_entry {
> > > > +       unsigned long long reference_ts;
> > > > +       unsigned short frame_num;
> > > > +       unsigned short pic_num;
> > > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > > + */
> > > > +struct slice_api_h264_decode_param {
> > > > +       struct slice_h264_dpb_entry dpb[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > For the same reason as above (this being a firmware structure), I
> > think it is clearer to not use the kernel definitions here.
> >
> > >
> > > > +       unsigned short num_slices;
> > > > +       unsigned short nal_ref_idc;
> > > > +       unsigned char ref_pic_list_p0[32];
> > > > +       unsigned char ref_pic_list_b0[32];
> > > > +       unsigned char ref_pic_list_b1[32];
> > >
> > > V4L2_H264_REF_LIST_LEN?
> >
> > Ditto.
> >
> > >
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > > + */
> > > > +struct mtk_h264_dec_slice_param {
> > > > +       struct mtk_h264_sps_param                       sps;
> > > > +       struct mtk_h264_pps_param                       pps;
> > > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > > +       struct slice_api_h264_decode_param              decode_params;
> > > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > Ditto.
> >
> > >
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct h264_fb - h264 decode frame buffer information
> > > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > > + * @poc         : picture order count of frame buffer
> > > > + * @reserved    : for 8 bytes alignment
> > > > + */
> > > > +struct h264_fb {
> > > > +       uint64_t vdec_fb_va;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       int32_t poc;
> > > > +       uint32_t reserved;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_dec_info - decode information
> > > > + * @dpb_sz             : decoding picture buffer size
> > > > > + * @resolution_changed  : a resolution change happened
> > > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > > + * @cap_num_planes     : number planes of capture buffer
> > > > + * @bs_dma             : Input bit-stream buffer dma address
> > > > + * @y_fb_dma           : Y frame buffer dma address
> > > > + * @c_fb_dma           : C frame buffer dma address
> > > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > > + */
> > > > +struct vdec_h264_dec_info {
> > > > +       uint32_t dpb_sz;
> > > > +       uint32_t resolution_changed;
> > > > +       uint32_t realloc_mv_buf;
> > > > +       uint32_t cap_num_planes;
> > > > +       uint64_t bs_dma;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       uint64_t vdec_fb_va;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > > + *                        between VPU and Host.
> > > > + *                        The memory is allocated by VPU then mapping to
> > > > Host
> > > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > > + *                        by VPU.
> > > > + *                        AP-W/R : AP is writer/reader on this item
> > > > + *                        VPU-W/R: VPU is write/reader on this item
> > > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-
> > > > R)
> > > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W,
> > > > VPU-R)
> > > > + * @dec          : decode information (AP-R, VPU-W)
> > > > + * @pic          : picture information (AP-R, VPU-W)
> > > > + * @crop         : crop information (AP-R, VPU-W)
> > > > + */
> > > > +struct vdec_h264_vsi {
> > > > +       uint64_t pred_buf_dma;
> > > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > > +       struct vdec_h264_dec_info dec;
> > > > +       struct vdec_pic_info pic;
> > > > +       struct v4l2_rect crop;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > > > + * @num_nalu : number of NALUs decoded so far
> > > > > + * @ctx      : pointer to mtk_vcodec_ctx
> > > > > + * @pred_buf : HW working prediction buffer
> > > > + * @mv_buf   : HW working motion vector buffer
> > > > + * @vpu      : VPU instance
> > > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > > + */
> > > > +struct vdec_h264_slice_inst {
> > > > +       unsigned int num_nalu;
> > > > +       struct mtk_vcodec_ctx *ctx;
> > > > +       struct mtk_vcodec_mem pred_buf;
> > > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > > +       struct vdec_vpu_inst vpu;
> > > > +       struct vdec_h264_vsi vsi_ctx;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > > +
> > > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > > +};
> > > > +
> > > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > > +                                int id)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > > +
> > > > +       return ctrl->p_cur.p;
> > > > +}
> > > > +
> > > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > > +                             struct mtk_h264_dec_slice_param
> > > > *slice_param)
> > > > +{
> > > > +       struct vb2_queue *vq;
> > > > +       struct vb2_buffer *vb;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > > +       u64 index;
> > > > +
> > > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > > +
> > > > +       for (index = 0; index < 16; index++) {
> > >
> > > Ditto, some macro instead of 16.
> >
> > Changed this to use ARRAY_SIZE() which is appropriate here.
> >
> > >
> > > > +               const struct slice_h264_dpb_entry *dpb;
> > > > +               int vb2_index;
> > > > +
> > > > +               dpb = &slice_param->decode_params.dpb[index];
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 0;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > > +               if (vb2_index < 0) {
> > > > +                       mtk_vcodec_err(inst, "Reference invalid:
> > > > dpb_index(%lld) reference_ts(%lld)",
> > > > +                               index, dpb->reference_ts);
> > > > +                       continue;
> > > > +               }
> > > > +               /* 1 for short term reference, 2 for long term reference
> > > > */
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 1;
> > > > +               else
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 2;
> > > > +
> > > > +               vb = vq->bufs[vb2_index];
> > > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer,
> > > > vb2_buf);
> > > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > > +
> > > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes ==
> > > > 2) {
> > > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > > +               }
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > > +}
> > > > +
> > > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > > +
> > > > V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > > +}
> > > > +
> > > > +static void
> > > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > > +                       const struct v4l2_ctrl_h264_scaling_matrix
> > > > *src_matrix)
> > > > +{
> > > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > > +
> > > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > > +}
> > > > +
> > > > +static void get_h264_decode_parameters(
> > > > +       struct slice_api_h264_decode_param *dst_params,
> > > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params-
> > > > >dpb[i];
> > > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > > +
> > > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > > +               dst_entry->frame_num = src_entry->frame_num;
> > > > +               dst_entry->pic_num = src_entry->pic_num;
> > > > +               dst_entry->top_field_order_cnt = src_entry-
> > > > >top_field_order_cnt;
> > > > +               dst_entry->bottom_field_order_cnt =
> > > > +                       src_entry->bottom_field_order_cnt;
> > > > +               dst_entry->flags = src_entry->flags;
> > > > +       }
> > > > +
> > > > +       // num_slices is a leftover from the old H.264 support and is
> > > > ignored
> > > > +       // by the firmware.
> > > > +       dst_params->num_slices = 0;
> > > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > > +       dst_params->bottom_field_order_cnt = src_params-
> > > > >bottom_field_order_cnt;
> > > > +       dst_params->flags = src_params->flags;
> > > > +}
> > > > +
> > > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > > +                           const struct v4l2_h264_dpb_entry *b)
> > > > +{
> > > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > > +}
> > > > +
> > > > +/*
> > > > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > > + *
> > > > + * This function is an adaptation of the similarly-named function in
> > > > + * hantro_h264.c.
> > > > + */
> > > > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > > +{
> > > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       unsigned int i, j;
> > > > +
> > > > +       /* Disable all entries by default, and mark the ones in use. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > > +                       set_bit(i, in_use);
> > > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > > +       }
> > > > +
> > > > +       /* Try to match new DPB entries with existing ones by their POCs.
> > > > */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param-
> > > > >dpb[i];
> > > > +
> > > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > +                       continue;
> > > > +
> > > > +               /*
> > > > +                * To cut off some comparisons, iterate only on target DPB
> > > > +                * entries were already used.
> > > > +                */
> > > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +                       cdpb = &dpb[j];
> > > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > > +                               continue;
> > > > +
> > > > +                       *cdpb = *ndpb;
> > > > +                       set_bit(j, used);
> > > > +                       /* Don't reiterate on this one. */
> > > > +                       clear_bit(j, in_use);
> > > > +                       break;
> > > > +               }
> > > > +
> > > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > > +                       set_bit(i, new);
> > > > +       }
> > > > +
> > > > +       /* For entries that could not be matched, use remaining free
> > > > slots. */
> > > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param-
> > > > >dpb[i];
> > > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +               /*
> > > > +                * Both arrays are of the same sizes, so there is no way
> > > > +                * we can end up with no space in target array, unless
> > > > +                * something is buggy.
> > > > +                */
> > > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > > +                       return;
> > > > +
> > > > +               cdpb = &dpb[j];
> > > > +               *cdpb = *ndpb;
> > > > +               set_bit(j, used);
> > > > +       }
> > > > +}
> > > > +
> > > > +/*
> > > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > > + */
> > > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > > +{
> > > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > > +}
> > > > +
> > > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > > +               get_ctrl_ptr(inst->ctx,
> > > > V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > > +               get_ctrl_ptr(inst->ctx,
> > > > V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > > +       struct mtk_h264_dec_slice_param *slice_param = &inst-
> > > > >h264_slice_param;
> > > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > > +       int i;
> > > > +
> > > > +       update_dpb(dec_params, inst->dpb);
> > > > +
> > > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix,
> > > > scaling_matrix);
> > > > +       get_h264_decode_parameters(&slice_param->decode_params,
> > > > dec_params,
> > > > +                                  inst->dpb);
> > > > +       get_h264_dpb_list(inst, slice_param);
> > > > +
> > > > +       /* Prepare the fields for our reference lists */
> > > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > > +       /* Build the reference lists */
> > > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > > +                                      inst->dpb);
> > > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist,
> > > > b1_reflist);
> > > > +       /* Adapt the built lists to the firmware's expectations */
> > > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > > +
> > > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > > +}
> > > > +
> > > > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > > +{
> > > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) +
> > > > 8;
> > > > +
> > > > +       return HW_MB_STORE_SZ * unit_size;
> > > > +}
> > > > +
> > > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int err = 0;
> > > > +
> > > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > > +               return err;
> > > > +       }
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > > +       mem = &inst->pred_buf;
> > > > +       if (mem->va)
> > > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +}
> > > > +
> > > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > > +       struct vdec_pic_info *pic)
> > > > +{
> > > > +       int i;
> > > > +       int err;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > > +
> > > > +       mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +               mem->size = buf_sz;
> > > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > > +               if (err) {
> > > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > > +                       return err;
> > > > +               }
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int i;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > > +                        struct vdec_pic_info *pic)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > > +
> > > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > > +
> > > > +       pic = &ctx->picinfo;
> > > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > > +               ctx->picinfo.fb_sz[1]);
> > > > +
> > > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > > +                       ctx->last_decoded_picinfo.pic_w,
> > > > +                       ctx->last_decoded_picinfo.pic_h,
> > > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > > +       }
> > > > +}
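As a worked example of the alignment arithmetic in get_pic_info() quoted
above, here is a small userspace sketch. struct pic_dims and
compute_buf_dims() are hypothetical names introduced only for illustration;
the masks and shifts are the ones from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant fields of struct vdec_pic_info. */
struct pic_dims {
	unsigned int pic_w, pic_h;   /* coded picture size */
	unsigned int buf_w, buf_h;   /* aligned buffer size */
	unsigned int fb_sz[2];       /* luma and chroma plane sizes in bytes */
};

static void compute_buf_dims(struct pic_dims *p)
{
	/* Round the width up to a multiple of 16 and the height to 32. */
	p->buf_w = (p->pic_w + 15) & 0xFFFFFFF0;
	p->buf_h = (p->pic_h + 31) & 0xFFFFFFE0;

	/* Luma plane, then a chroma plane of half that size. */
	p->fb_sz[0] = p->buf_w * p->buf_h;
	p->fb_sz[1] = p->fb_sz[0] >> 1;

	assert(p->buf_w % 16 == 0 && p->buf_h % 32 == 0);
}
```

For a 1920x1080 stream this yields a 1920x1088 buffer, with plane sizes of
2088960 and 1044480 bytes.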
> > > > +
> > > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > > +       struct v4l2_rect *cr)
> > > > +{
> > > > +       cr->left = inst->vsi_ctx.crop.left;
> > > > +       cr->top = inst->vsi_ctx.crop.top;
> > > > +       cr->width = inst->vsi_ctx.crop.width;
> > > > +       cr->height = inst->vsi_ctx.crop.height;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > > +                        cr->left, cr->top, cr->width, cr->height);
> > > > +}
> > > > +
> > > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > > +       unsigned int *dpb_sz)
> > > > +{
> > > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > > +       int err;
> > > > +
> > > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > > +       if (!inst)
> > > > +               return -ENOMEM;
> > > > +
> > > > +       inst->ctx = ctx;
> > > > +
> > > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > > +       inst->vpu.ctx = ctx;
> > > > +
> > > > +       err = vpu_dec_init(&inst->vpu);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > > +               goto error_free_inst;
> > > > +       }
> > > > +
> > > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +       err = allocate_predication_buf(inst);
> > > > +       if (err)
> > > > +               goto error_deinit;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "struct size = %zu,%zu,%zu,%zu\n",
> > > > +               sizeof(struct mtk_h264_sps_param),
> > > > +               sizeof(struct mtk_h264_pps_param),
> > > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > > +               sizeof(struct mtk_h264_dpb_info));
> > > > +
> > > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > > +
> > > > +       ctx->drv_handle = inst;
> > > > +       return 0;
> > > > +
> > > > +error_deinit:
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +
> > > > +error_free_inst:
> > > > +       kfree(inst);
> > > > +       return err;
> > > > +}
> > > > +
> > > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +       free_predication_buf(inst);
> > > > +       free_mv_buf(inst);
> > > > +
> > > > +       kfree(inst);
> > > > +}
> > > > +
> > > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > > +{
> > > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > > +               return 3;
> > > > +
> > > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > > +           data[3] == 1)
> > > > +               return 4;
> > > > +
> > > > +       return -1;
> > > > +}
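To make the start code handling concrete, the check in find_start_code()
quoted above can be exercised in userspace as follows. This is a sketch only:
start_code_len() is a hypothetical name, and NAL_TYPE() is copied from the
defines in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the quoted patch: the NAL unit type is the low 5 bits. */
#define NAL_TYPE(value) ((value) & 0x1F)

/*
 * Hypothetical userspace copy of find_start_code(): returns the length of
 * the Annex B start code at the beginning of data, or -1 if there is none.
 * The size checks require one byte after the start code, so that the NAL
 * header byte can be read at the returned index.
 */
static int start_code_len(const unsigned char *data, size_t data_sz)
{
	assert(data != NULL);

	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}
```

The returned length doubles as the index of the NAL header byte, which is how
the decode function quoted below uses it.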
> > > > +
> > > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > > +       struct mtk_video_dec_buf *src_buf_info;
> > > > +       int nal_start_idx = 0, err = 0;
> > > > +       uint32_t nal_type, data[2];
> > > > +       unsigned char *buf;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > > +
> > > > +       /* bs NULL means flush decoder */
> > > > +       if (bs == NULL)
> > > > +               return vpu_dec_reset(vpu);
> > > > +
> > > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > > +
> > > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > > +
> > > > +       buf = (unsigned char *)bs->va;
> > >
> > > I can be completely wrong, but it would seem here
> > > is where the CPU mapping is used.
> >
> > I think you're right. :)
> >
> > >
> > > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > > +       if (nal_start_idx < 0)
> > > > +               goto err_free_fb_out;
> > > > +
> > > > +       data[0] = bs->size;
> > > > +       data[1] = buf[nal_start_idx];
> > > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > >
> > > Which seems to be used to parse the NAL type. But shouldn't
> > > you expect only VCL NALUs here?
> > >
> > > I.e. you only get IDR or non-IDR frames, marked with
> > > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> >
> > Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> > the test using it below) and the driver is just as happy.
> >
> > >
> > > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > > +                        nal_type);
> > > > +
> > > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > > +
> > > > +       get_vdec_decode_parameters(inst);
> > > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > > +       if (*res_chg) {
> > > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > > +                       if (err)
> > > > +                               goto err_free_fb_out;
> > > > +               }
> > > > +               *res_chg = false;
> > > > +       }
> > > > +
> > > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > > +       err = vpu_dec_start(vpu, data, 2);
> > >
> > > Then it seems these 2 values are passed to the firmware. Maybe you
> > > could test if that can be derived without the CPU mapping.
> > > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> >
> > This one is a bit trickier. It seems the NAL type is passed as part of
> > the decode request to the firmware. Which should be absolutely not
> > needed since the firmware can check this from the buffer itself. Just
> > for fun I have tried setting this parameter unconditionally to 0x1
> > (non-IDR picture) and all I get is green frames with seemingly random
> > garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> > a different kind of garbage, and once in a while a properly rendered
> > frame (presumably when it is *really* an IDR frame).
>
> Can't you deduce this from the v4l2_ctrl_h264_slice_params.slice_type ?

This decoder is frame-based, so it doesn't receive a
v4l2_ctrl_h264_slice_params from user-space unfortunately. But
thankfully we can deduce this from the decode params.
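
A minimal sketch of that deduction, in userspace C. Everything here is an
assumption for illustration: DECODE_PARAM_FLAG_IDR_PIC is a local stand-in
meant to mirror V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC,
nal_header_from_decode_params() is a hypothetical helper, and whether the
firmware wants the full NAL header byte (as the current code passes) or only
the type has not been confirmed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in, assumed to mirror V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC. */
#define DECODE_PARAM_FLAG_IDR_PIC  0x01

/* NAL unit types, copied from the defines in the patch. */
#define NAL_NON_IDR_SLICE          0x01
#define NAL_IDR_SLICE              0x05

/*
 * Hypothetical helper: rebuild the NAL header byte from the decode params
 * instead of reading it from the bitstream through a CPU mapping.
 * H.264 NAL header layout: forbidden_zero_bit (1 bit), nal_ref_idc (2 bits),
 * nal_unit_type (5 bits).
 */
static uint32_t nal_header_from_decode_params(uint32_t flags,
					      uint16_t nal_ref_idc)
{
	uint32_t nal_type = (flags & DECODE_PARAM_FLAG_IDR_PIC) ?
			    NAL_IDR_SLICE : NAL_NON_IDR_SLICE;

	assert(nal_ref_idc <= 3);

	return ((uint32_t)nal_ref_idc << 5) | nal_type;
}
```

If this holds up in testing, the value could replace buf[nal_start_idx] in the
data passed to vpu_dec_start(), which would remove the last read of the
bitstream buffer from the CPU side.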


* Re: [PATCH v3 06/15] media: mtk-vcodec: vdec: support stateless H.264 decoding
@ 2021-03-17  3:14           ` Alexandre Courbot
From: Alexandre Courbot @ 2021-03-17  3:14 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Ezequiel Garcia, Yunfei Dong, Tiffany Lin, Andrew-CT Chen,
	Rob Herring, Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Tue, Mar 16, 2021 at 12:21 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> On Monday, 15 March 2021 at 20:28 +0900, Alexandre Courbot wrote:
> > Hi Ezequiel,
> >
> > On Thu, Mar 4, 2021 at 6:47 AM Ezequiel Garcia
> > <ezequiel@vanguardiasur.com.ar> wrote:
> > >
> > >  Hi Alex,
> > >
> > > Thanks for the patch.
> > >
> > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org>
> > > wrote:
> > > >
> > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > >
> > > > Add support for H.264 decoding using the stateless API, as supported by
> > > > MT8183. This support takes advantage of the V4L2 H.264 reference list
> > > > builders.
> > > >
> > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > [acourbot: refactor, cleanup and split]
> > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > ---
> > > >  drivers/media/platform/Kconfig                |   1 +
> > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > >  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 807 ++++++++++++++++++
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   3 +
> > > >  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
> > > >  5 files changed, 813 insertions(+)
> > > > >  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > >
> > > > > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > > > index fd1831e97b22..c27db5643712 100644
> > > > --- a/drivers/media/platform/Kconfig
> > > > +++ b/drivers/media/platform/Kconfig
> > > > @@ -295,6 +295,7 @@ config VIDEO_MEDIATEK_VCODEC
> > > >         select V4L2_MEM2MEM_DEV
> > > >         select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
> > > >         select VIDEO_MEDIATEK_VCODEC_SCP if MTK_SCP
> > > > +       select V4L2_H264
> > > >         help
> > > >           Mediatek video codec driver provides HW capability to
> > > >           encode and decode in a range of video formats on MT8173
> > > > > diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> > > > index 4ba93d838ab6..ca8e9e7a9c4e 100644
> > > > --- a/drivers/media/platform/mtk-vcodec/Makefile
> > > > +++ b/drivers/media/platform/mtk-vcodec/Makefile
> > > > @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
> > > >  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
> > > >                 vdec/vdec_vp8_if.o \
> > > >                 vdec/vdec_vp9_if.o \
> > > > +               vdec/vdec_h264_req_if.o \
> > > >                 mtk_vcodec_dec_drv.o \
> > > >                 vdec_drv_if.o \
> > > >                 vdec_vpu_if.o \
> > > > > diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > new file mode 100644
> > > > index 000000000000..2fbbfbbcfbec
> > > > --- /dev/null
> > > > +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> > > > @@ -0,0 +1,807 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +#include <linux/module.h>
> > > > +#include <linux/slab.h>
> > > > +#include <media/v4l2-mem2mem.h>
> > > > +#include <media/v4l2-h264.h>
> > > > +#include <media/videobuf2-dma-contig.h>
> > > > +
> > > > +#include "../vdec_drv_if.h"
> > > > +#include "../mtk_vcodec_util.h"
> > > > +#include "../mtk_vcodec_dec.h"
> > > > +#include "../mtk_vcodec_intr.h"
> > > > +#include "../vdec_vpu_if.h"
> > > > +#include "../vdec_drv_base.h"
> > > > +
> > > > +#define NAL_NON_IDR_SLICE                      0x01
> > > > +#define NAL_IDR_SLICE                          0x05
> > > > +#define NAL_H264_PPS                           0x08
> > >
> > > Not used?
> > >
> > > > +#define NAL_TYPE(value)                                ((value) & 0x1F)
> > > > +
> > >
> > > I believe you may not need the NAL type.
> >
> > True, removed this block of defines.
> >
> > >
> > > > +#define BUF_PREDICTION_SZ                      (64 * 4096)
> > > > +#define MB_UNIT_LEN                            16
> > > > +
> > > > +/* get used parameters for sps/pps */
> > > > +#define GET_MTK_VDEC_FLAG(cond, flag) \
> > > > +       { dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> > > > +#define GET_MTK_VDEC_PARAM(param) \
> > > > +       { dst_param->param = src_param->param; }
> > > > +/* motion vector size (bytes) for every macro block */
> > > > +#define HW_MB_STORE_SZ                         64
> > > > +
> > > > +#define H264_MAX_FB_NUM                                17
> > > > +#define H264_MAX_MV_NUM                                32
> > > > +#define HDR_PARSING_BUF_SZ                     1024
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dpb_info  - h264 dpb information
> > > > + * @y_dma_addr: Y bitstream physical address
> > > > + * @c_dma_addr: CbCr bitstream physical address
> > > > > + * @reference_flag: reference picture flag (short/long term reference picture)
> > > > + * @field: field picture flag
> > > > + */
> > > > +struct mtk_h264_dpb_info {
> > > > +       dma_addr_t y_dma_addr;
> > > > +       dma_addr_t c_dma_addr;
> > > > +       int reference_flag;
> > > > +       int field;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_sps_param  - parameters for sps
> > > > + */
> > > > +struct mtk_h264_sps_param {
> > > > +       unsigned char chroma_format_idc;
> > > > +       unsigned char bit_depth_luma_minus8;
> > > > +       unsigned char bit_depth_chroma_minus8;
> > > > +       unsigned char log2_max_frame_num_minus4;
> > > > +       unsigned char pic_order_cnt_type;
> > > > +       unsigned char log2_max_pic_order_cnt_lsb_minus4;
> > > > +       unsigned char max_num_ref_frames;
> > > > +       unsigned char separate_colour_plane_flag;
> > > > +       unsigned short pic_width_in_mbs_minus1;
> > > > +       unsigned short pic_height_in_map_units_minus1;
> > > > +       unsigned int max_frame_nums;
> > > > +       unsigned char qpprime_y_zero_transform_bypass_flag;
> > > > +       unsigned char delta_pic_order_always_zero_flag;
> > > > +       unsigned char frame_mbs_only_flag;
> > > > +       unsigned char mb_adaptive_frame_field_flag;
> > > > +       unsigned char direct_8x8_inference_flag;
> > > > +       unsigned char reserved[3];
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_pps_param  - parameters for pps
> > > > + */
> > > > +struct mtk_h264_pps_param {
> > > > +       unsigned char num_ref_idx_l0_default_active_minus1;
> > > > +       unsigned char num_ref_idx_l1_default_active_minus1;
> > > > +       unsigned char weighted_bipred_idc;
> > > > +       char pic_init_qp_minus26;
> > > > +       char chroma_qp_index_offset;
> > > > +       char second_chroma_qp_index_offset;
> > > > +       unsigned char entropy_coding_mode_flag;
> > > > +       unsigned char pic_order_present_flag;
> > > > +       unsigned char deblocking_filter_control_present_flag;
> > > > +       unsigned char constrained_intra_pred_flag;
> > > > +       unsigned char weighted_pred_flag;
> > > > +       unsigned char redundant_pic_cnt_present_flag;
> > > > +       unsigned char transform_8x8_mode_flag;
> > > > +       unsigned char scaling_matrix_present_flag;
> > > > +       unsigned char reserved[2];
> > > > +};
> > > > +
> > > > +struct slice_api_h264_scaling_matrix {
> > >
> > > Equal to v4l2_ctrl_h264_scaling_matrix ?
> > > Well I guess you don't want to mix a hardware-specific
> > > thing with the V4L2 API maybe.
> >
> > That's the idea. Although the layouts match and the ABI is now stable,
> > I think this communicates better the fact that this is a firmware
> > structure.
> >
> > >
> > > > +       unsigned char scaling_list_4x4[6][16];
> > > > +       unsigned char scaling_list_8x8[6][64];
> > > > +};
> > > > +
> > > > +struct slice_h264_dpb_entry {
> > > > +       unsigned long long reference_ts;
> > > > +       unsigned short frame_num;
> > > > +       unsigned short pic_num;
> > > > +       /* Note that field is indicated by v4l2_buffer.field */
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct slice_api_h264_decode_param - parameters for decode.
> > > > + */
> > > > +struct slice_api_h264_decode_param {
> > > > +       struct slice_h264_dpb_entry dpb[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > For the same reason as above (this being a firmware structure), I
> > think it is clearer to not use the kernel definitions here.
> >
> > >
> > > > +       unsigned short num_slices;
> > > > +       unsigned short nal_ref_idc;
> > > > +       unsigned char ref_pic_list_p0[32];
> > > > +       unsigned char ref_pic_list_b0[32];
> > > > +       unsigned char ref_pic_list_b1[32];
> > >
> > > V4L2_H264_REF_LIST_LEN?
> >
> > Ditto.
> >
> > >
> > > > +       int top_field_order_cnt;
> > > > +       int bottom_field_order_cnt;
> > > > +       unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> > > > + */
> > > > +struct mtk_h264_dec_slice_param {
> > > > +       struct mtk_h264_sps_param                       sps;
> > > > +       struct mtk_h264_pps_param                       pps;
> > > > +       struct slice_api_h264_scaling_matrix            scaling_matrix;
> > > > +       struct slice_api_h264_decode_param              decode_params;
> > > > +       struct mtk_h264_dpb_info h264_dpb_info[16];
> > >
> > > V4L2_H264_NUM_DPB_ENTRIES?
> >
> > Ditto.
> >
> > >
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct h264_fb - h264 decode frame buffer information
> > > > + * @vdec_fb_va  : virtual address of struct vdec_fb
> > > > + * @y_fb_dma    : dma address of Y frame buffer (luma)
> > > > + * @c_fb_dma    : dma address of C frame buffer (chroma)
> > > > + * @poc         : picture order count of frame buffer
> > > > + * @reserved    : for 8 bytes alignment
> > > > + */
> > > > +struct h264_fb {
> > > > +       uint64_t vdec_fb_va;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       int32_t poc;
> > > > +       uint32_t reserved;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_dec_info - decode information
> > > > + * @dpb_sz             : decoding picture buffer size
> > > > > + * @resolution_changed  : resolution change happened
> > > > + * @realloc_mv_buf     : flag to notify driver to re-allocate mv buffer
> > > > + * @cap_num_planes     : number planes of capture buffer
> > > > + * @bs_dma             : Input bit-stream buffer dma address
> > > > + * @y_fb_dma           : Y frame buffer dma address
> > > > + * @c_fb_dma           : C frame buffer dma address
> > > > + * @vdec_fb_va         : VDEC frame buffer struct virtual address
> > > > + */
> > > > +struct vdec_h264_dec_info {
> > > > +       uint32_t dpb_sz;
> > > > +       uint32_t resolution_changed;
> > > > +       uint32_t realloc_mv_buf;
> > > > +       uint32_t cap_num_planes;
> > > > +       uint64_t bs_dma;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +       uint64_t vdec_fb_va;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_vsi - shared memory for decode information exchange
> > > > + *                        between VPU and Host.
> > > > + *                        The memory is allocated by VPU then mapping to
> > > > Host
> > > > + *                        in vpu_dec_init() and freed in vpu_dec_deinit()
> > > > + *                        by VPU.
> > > > + *                        AP-W/R : AP is writer/reader on this item
> > > > + *                        VPU-W/R: VPU is write/reader on this item
> > > > + * @pred_buf_dma : HW working predication buffer dma address (AP-W, VPU-
> > > > R)
> > > > + * @mv_buf_dma   : HW working motion vector buffer dma address (AP-W,
> > > > VPU-R)
> > > > + * @dec          : decode information (AP-R, VPU-W)
> > > > + * @pic          : picture information (AP-R, VPU-W)
> > > > + * @crop         : crop information (AP-R, VPU-W)
> > > > + */
> > > > +struct vdec_h264_vsi {
> > > > +       uint64_t pred_buf_dma;
> > > > +       uint64_t mv_buf_dma[H264_MAX_MV_NUM];
> > > > +       struct vdec_h264_dec_info dec;
> > > > +       struct vdec_pic_info pic;
> > > > +       struct v4l2_rect crop;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_params;
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct vdec_h264_slice_inst - h264 decoder instance
> > > > > + * @num_nalu : number of NALUs decoded so far
> > > > + * @ctx      : point to mtk_vcodec_ctx
> > > > + * @pred_buf : HW working predication buffer
> > > > + * @mv_buf   : HW working motion vector buffer
> > > > + * @vpu      : VPU instance
> > > > + * @vsi_ctx  : Local VSI data for this decoding context
> > > > + */
> > > > +struct vdec_h264_slice_inst {
> > > > +       unsigned int num_nalu;
> > > > +       struct mtk_vcodec_ctx *ctx;
> > > > +       struct mtk_vcodec_mem pred_buf;
> > > > +       struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> > > > +       struct vdec_vpu_inst vpu;
> > > > +       struct vdec_h264_vsi vsi_ctx;
> > > > +       struct mtk_h264_dec_slice_param h264_slice_param;
> > > > +
> > > > +       struct v4l2_h264_dpb_entry dpb[16];
> > > > +};
> > > > +
> > > > +static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx,
> > > > +                                int id)
> > > > +{
> > > > +       struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> > > > +
> > > > +       return ctrl->p_cur.p;
> > > > +}
> > > > +
> > > > +static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> > > > +                             struct mtk_h264_dec_slice_param
> > > > *slice_param)
> > > > +{
> > > > +       struct vb2_queue *vq;
> > > > +       struct vb2_buffer *vb;
> > > > +       struct vb2_v4l2_buffer *vb2_v4l2;
> > > > +       u64 index;
> > > > +
> > > > +       vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx,
> > > > +               V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> > > > +
> > > > +       for (index = 0; index < 16; index++) {
> > >
> > > Ditto, some macro instead of 16.
> >
> > Changed this to use ARRAY_SIZE() which is appropriate here.
> >
> > >
> > > > +               const struct slice_h264_dpb_entry *dpb;
> > > > +               int vb2_index;
> > > > +
> > > > +               dpb = &slice_param->decode_params.dpb[index];
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 0;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> > > > +               if (vb2_index < 0) {
> > > > +                       mtk_vcodec_err(inst, "Reference invalid:
> > > > dpb_index(%lld) reference_ts(%lld)",
> > > > +                               index, dpb->reference_ts);
> > > > +                       continue;
> > > > +               }
> > > > +               /* 1 for short term reference, 2 for long term reference
> > > > */
> > > > +               if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 1;
> > > > +               else
> > > > +                       slice_param->h264_dpb_info[index].reference_flag =
> > > > 2;
> > > > +
> > > > +               vb = vq->bufs[vb2_index];
> > > > +               vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer,
> > > > vb2_buf);
> > > > +               slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> > > > +
> > > > +               slice_param->h264_dpb_info[index].y_dma_addr =
> > > > +                       vb2_dma_contig_plane_dma_addr(vb, 0);
> > > > +               if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes ==
> > > > 2) {
> > > > +                       slice_param->h264_dpb_info[index].c_dma_addr =
> > > > +                               vb2_dma_contig_plane_dma_addr(vb, 1);
> > > > +               }
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_sps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(chroma_format_idc);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> > > > +       GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> > > > +       GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> > > > +       GET_MTK_VDEC_PARAM(max_num_ref_frames);
> > > > +       GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> > > > +       GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> > > > +               V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> > > > +       GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> > > > +               V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> > > > +       GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> > > > +               V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> > > > +       GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> > > > +               V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> > > > +       GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> > > > +               V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> > > > +       GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> > > > +               V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> > > > +}
> > > > +
> > > > +static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> > > > +       const struct v4l2_ctrl_h264_pps *src_param)
> > > > +{
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> > > > +       GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> > > > +       GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> > > > +       GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> > > > +       GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> > > > +
> > > > +       GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> > > > +       GET_MTK_VDEC_FLAG(pic_order_present_flag,
> > > > +
> > > > V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(weighted_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> > > > +       GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> > > > +               V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> > > > +       GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> > > > +       GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> > > > +               V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> > > > +       GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> > > > +               V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> > > > +}
> > > > +
> > > > +static void
> > > > +get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> > > > +                       const struct v4l2_ctrl_h264_scaling_matrix
> > > > *src_matrix)
> > > > +{
> > > > +       memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> > > > +              sizeof(dst_matrix->scaling_list_4x4));
> > > > +
> > > > +       memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> > > > +              sizeof(dst_matrix->scaling_list_8x8));
> > > > +}
> > > > +
> > > > +static void get_h264_decode_parameters(
> > > > +       struct slice_api_h264_decode_param *dst_params,
> > > > +       const struct v4l2_ctrl_h264_decode_params *src_params,
> > > > +       const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> > > > +               struct slice_h264_dpb_entry *dst_entry = &dst_params-
> > > > >dpb[i];
> > > > +               const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> > > > +
> > > > +               dst_entry->reference_ts = src_entry->reference_ts;
> > > > +               dst_entry->frame_num = src_entry->frame_num;
> > > > +               dst_entry->pic_num = src_entry->pic_num;
> > > > +               dst_entry->top_field_order_cnt = src_entry-
> > > > >top_field_order_cnt;
> > > > +               dst_entry->bottom_field_order_cnt =
> > > > +                       src_entry->bottom_field_order_cnt;
> > > > +               dst_entry->flags = src_entry->flags;
> > > > +       }
> > > > +
> > > > +       // num_slices is a leftover from the old H.264 support and is ignored
> > > > +       // by the firmware.
> > > > +       dst_params->num_slices = 0;
> > > > +       dst_params->nal_ref_idc = src_params->nal_ref_idc;
> > > > +       dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> > > > +       dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> > > > +       dst_params->flags = src_params->flags;
> > > > +}
> > > > +
> > > > +static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> > > > +                           const struct v4l2_h264_dpb_entry *b)
> > > > +{
> > > > +       return a->top_field_order_cnt == b->top_field_order_cnt &&
> > > > +              a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> > > > + * into the already existing slot in dpb, and move other entries into new slots.
> > > > + *
> > > > + * This function is an adaptation of the similarly-named function in
> > > > + * hantro_h264.c.
> > > > + */
> > > > +static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> > > > +                      struct v4l2_h264_dpb_entry *dpb)
> > > > +{
> > > > +       DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> > > > +       unsigned int i, j;
> > > > +
> > > > +       /* Disable all entries by default, and mark the ones in use. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> > > > +                       set_bit(i, in_use);
> > > > +               dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> > > > +       }
> > > > +
> > > > +       /* Try to match new DPB entries with existing ones by their POCs. */
> > > > +       for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +
> > > > +               if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > +                       continue;
> > > > +
> > > > +               /*
> > > > +                * To cut off some comparisons, iterate only on target DPB
> > > > +                * entries that were already used.
> > > > +                */
> > > > +               for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> > > > +                       struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +                       cdpb = &dpb[j];
> > > > +                       if (!dpb_entry_match(cdpb, ndpb))
> > > > +                               continue;
> > > > +
> > > > +                       *cdpb = *ndpb;
> > > > +                       set_bit(j, used);
> > > > +                       /* Don't reiterate on this one. */
> > > > +                       clear_bit(j, in_use);
> > > > +                       break;
> > > > +               }
> > > > +
> > > > +               if (j == ARRAY_SIZE(dec_param->dpb))
> > > > +                       set_bit(i, new);
> > > > +       }
> > > > +
> > > > +       /* For entries that could not be matched, use remaining free slots. */
> > > > +       for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> > > > +               const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> > > > +               struct v4l2_h264_dpb_entry *cdpb;
> > > > +
> > > > +               /*
> > > > +                * Both arrays are of the same sizes, so there is no way
> > > > +                * we can end up with no space in target array, unless
> > > > +                * something is buggy.
> > > > +                */
> > > > +               j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> > > > +               if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> > > > +                       return;
> > > > +
> > > > +               cdpb = &dpb[j];
> > > > +               *cdpb = *ndpb;
> > > > +               set_bit(j, used);
> > > > +       }
> > > > +}
> > > > +
> > > > +/*
> > > > + * The firmware expects unused reflist entries to have the value 0x20.
> > > > + */
> > > > +static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> > > > +{
> > > > +       memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> > > > +}
> > > > +
> > > > +static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       const struct v4l2_ctrl_h264_decode_params *dec_params =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> > > > +       const struct v4l2_ctrl_h264_sps *sps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> > > > +       const struct v4l2_ctrl_h264_pps *pps =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> > > > +       const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> > > > +               get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> > > > +       struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
> > > > +       struct v4l2_h264_reflist_builder reflist_builder;
> > > > +       enum v4l2_field dpb_fields[V4L2_H264_NUM_DPB_ENTRIES];
> > > > +       u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> > > > +       u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> > > > +       u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> > > > +       int i;
> > > > +
> > > > +       update_dpb(dec_params, inst->dpb);
> > > > +
> > > > +       get_h264_sps_parameters(&slice_param->sps, sps);
> > > > +       get_h264_pps_parameters(&slice_param->pps, pps);
> > > > +       get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> > > > +       get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> > > > +                                  inst->dpb);
> > > > +       get_h264_dpb_list(inst, slice_param);
> > > > +
> > > > +       /* Prepare the fields for our reference lists */
> > > > +       for (i = 0; i < V4L2_H264_NUM_DPB_ENTRIES; i++)
> > > > +               dpb_fields[i] = slice_param->h264_dpb_info[i].field;
> > > > +       /* Build the reference lists */
> > > > +       v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> > > > +                                      inst->dpb);
> > > > +       v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> > > > +       v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> > > > +       /* Adapt the built lists to the firmware's expectations */
> > > > +       fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> > > > +       fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> > > > +
> > > > +       memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
> > > > +              sizeof(inst->vsi_ctx.h264_slice_params));
> > > > +}
> > > > +
> > > > +static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> > > > +{
> > > > +       int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> > > > +
> > > > +       return HW_MB_STORE_SZ * unit_size;
> > > > +}
> > > > +
> > > > +static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int err = 0;
> > > > +
> > > > +       inst->pred_buf.size = BUF_PREDICTION_SZ;
> > > > +       err = mtk_vcodec_mem_alloc(inst->ctx, &inst->pred_buf);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "failed to allocate ppl buf");
> > > > +               return err;
> > > > +       }
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = inst->pred_buf.dma_addr;
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_predication_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       inst->vsi_ctx.pred_buf_dma = 0;
> > > > +       mem = &inst->pred_buf;
> > > > +       if (mem->va)
> > > > +               mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +}
> > > > +
> > > > +static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> > > > +       struct vdec_pic_info *pic)
> > > > +{
> > > > +       int i;
> > > > +       int err;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +       unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> > > > +
> > > > +       mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +               mem->size = buf_sz;
> > > > +               err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> > > > +               if (err) {
> > > > +                       mtk_vcodec_err(inst, "failed to allocate mv buf");
> > > > +                       return err;
> > > > +               }
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = mem->dma_addr;
> > > > +       }
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static void free_mv_buf(struct vdec_h264_slice_inst *inst)
> > > > +{
> > > > +       int i;
> > > > +       struct mtk_vcodec_mem *mem = NULL;
> > > > +
> > > > +       for (i = 0; i < H264_MAX_MV_NUM; i++) {
> > > > +               inst->vsi_ctx.mv_buf_dma[i] = 0;
> > > > +               mem = &inst->mv_buf[i];
> > > > +               if (mem->va)
> > > > +                       mtk_vcodec_mem_free(inst->ctx, mem);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_pic_info(struct vdec_h264_slice_inst *inst,
> > > > +                        struct vdec_pic_info *pic)
> > > > +{
> > > > +       struct mtk_vcodec_ctx *ctx = inst->ctx;
> > > > +
> > > > +       ctx->picinfo.buf_w = (ctx->picinfo.pic_w + 15) & 0xFFFFFFF0;
> > > > +       ctx->picinfo.buf_h = (ctx->picinfo.pic_h + 31) & 0xFFFFFFE0;
> > > > +       ctx->picinfo.fb_sz[0] = ctx->picinfo.buf_w * ctx->picinfo.buf_h;
> > > > +       ctx->picinfo.fb_sz[1] = ctx->picinfo.fb_sz[0] >> 1;
> > > > +       inst->vsi_ctx.dec.cap_num_planes =
> > > > +               ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> > > > +
> > > > +       pic = &ctx->picinfo;
> > > > +       mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> > > > +                        ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> > > > +                        ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > > > +       mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> > > > +               ctx->picinfo.fb_sz[1]);
> > > > +
> > > > +       if ((ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w) ||
> > > > +               (ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h)) {
> > > > +               inst->vsi_ctx.dec.resolution_changed = true;
> > > > +               if ((ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w) ||
> > > > +                       (ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h))
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +               mtk_v4l2_debug(1, "ResChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> > > > +                       inst->vsi_ctx.dec.resolution_changed,
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf,
> > > > +                       ctx->last_decoded_picinfo.pic_w,
> > > > +                       ctx->last_decoded_picinfo.pic_h,
> > > > +                       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> > > > +       }
> > > > +}
> > > > +
> > > > +static void get_crop_info(struct vdec_h264_slice_inst *inst,
> > > > +       struct v4l2_rect *cr)
> > > > +{
> > > > +       cr->left = inst->vsi_ctx.crop.left;
> > > > +       cr->top = inst->vsi_ctx.crop.top;
> > > > +       cr->width = inst->vsi_ctx.crop.width;
> > > > +       cr->height = inst->vsi_ctx.crop.height;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> > > > +                        cr->left, cr->top, cr->width, cr->height);
> > > > +}
> > > > +
> > > > +static void get_dpb_size(struct vdec_h264_slice_inst *inst,
> > > > +       unsigned int *dpb_sz)
> > > > +{
> > > > +       *dpb_sz = inst->vsi_ctx.dec.dpb_sz;
> > > > +       mtk_vcodec_debug(inst, "sz=%d", *dpb_sz);
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst = NULL;
> > > > +       int err;
> > > > +
> > > > +       inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> > > > +       if (!inst)
> > > > +               return -ENOMEM;
> > > > +
> > > > +       inst->ctx = ctx;
> > > > +
> > > > +       inst->vpu.id = SCP_IPI_VDEC_H264;
> > > > +       inst->vpu.ctx = ctx;
> > > > +
> > > > +       err = vpu_dec_init(&inst->vpu);
> > > > +       if (err) {
> > > > +               mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> > > > +               goto error_free_inst;
> > > > +       }
> > > > +
> > > > +       memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
> > > > +       inst->vsi_ctx.dec.resolution_changed = true;
> > > > +       inst->vsi_ctx.dec.realloc_mv_buf = true;
> > > > +
> > > > +       err = allocate_predication_buf(inst);
> > > > +       if (err)
> > > > +               goto error_deinit;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "struct size = %zu,%zu,%zu,%zu\n",
> > > > +               sizeof(struct mtk_h264_sps_param),
> > > > +               sizeof(struct mtk_h264_pps_param),
> > > > +               sizeof(struct mtk_h264_dec_slice_param),
> > > > +               sizeof(struct mtk_h264_dpb_info));
> > > > +
> > > > +       mtk_vcodec_debug(inst, "H264 Instance >> %p", inst);
> > > > +
> > > > +       ctx->drv_handle = inst;
> > > > +       return 0;
> > > > +
> > > > +error_deinit:
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +
> > > > +error_free_inst:
> > > > +       kfree(inst);
> > > > +       return err;
> > > > +}
> > > > +
> > > > +static void vdec_h264_slice_deinit(void *h_vdec)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +
> > > > +       mtk_vcodec_debug_enter(inst);
> > > > +
> > > > +       vpu_dec_deinit(&inst->vpu);
> > > > +       free_predication_buf(inst);
> > > > +       free_mv_buf(inst);
> > > > +
> > > > +       kfree(inst);
> > > > +}
> > > > +
> > > > +static int find_start_code(unsigned char *data, unsigned int data_sz)
> > > > +{
> > > > +       if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> > > > +               return 3;
> > > > +
> > > > +       if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> > > > +           data[3] == 1)
> > > > +               return 4;
> > > > +
> > > > +       return -1;
> > > > +}
> > > > +
> > > > +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> > > > +                                 struct vdec_fb *fb, bool *res_chg)
> > > > +{
> > > > +       struct vdec_h264_slice_inst *inst =
> > > > +               (struct vdec_h264_slice_inst *)h_vdec;
> > > > +       struct vdec_vpu_inst *vpu = &inst->vpu;
> > > > +       struct mtk_video_dec_buf *src_buf_info;
> > > > +       int nal_start_idx = 0, err = 0;
> > > > +       uint32_t nal_type, data[2];
> > > > +       unsigned char *buf;
> > > > +       uint64_t y_fb_dma;
> > > > +       uint64_t c_fb_dma;
> > > > +
> > > > +       mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> > > > +                        ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> > > > +
> > > > +       /* bs NULL means flush decoder */
> > > > +       if (bs == NULL)
> > > > +               return vpu_dec_reset(vpu);
> > > > +
> > > > +       src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> > > > +
> > > > +       y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> > > > +       c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> > > > +
> > > > +       buf = (unsigned char *)bs->va;
> > >
> > > I can be completely wrong, but it would seem here
> > > is where the CPU mapping is used.
> >
> > I think you're right. :)
> >
> > >
> > > > +       nal_start_idx = find_start_code(buf, bs->size);
> > > > +       if (nal_start_idx < 0)
> > > > +               goto err_free_fb_out;
> > > > +
> > > > +       data[0] = bs->size;
> > > > +       data[1] = buf[nal_start_idx];
> > > > +       nal_type = NAL_TYPE(buf[nal_start_idx]);
> > >
> > > Which seems to be used to parse the NAL type. But shouldn't
> > > you expect only VCL NALUs here?
> > >
> > > I.e. you only get IDR or non-IDR frames, marked with
> > > V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC.
> >
> > Yep, that's true. And as a matter of fact I can remove `nal_type` (and
> > the test using it below) and the driver is just as happy.
> >
> > >
> > > > +       mtk_vcodec_debug(inst, "\n + NALU[%d] type %d +\n", inst->num_nalu,
> > > > +                        nal_type);
> > > > +
> > > > +       inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
> > > > +       inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
> > > > +       inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
> > > > +       inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
> > > > +
> > > > +       get_vdec_decode_parameters(inst);
> > > > +       *res_chg = inst->vsi_ctx.dec.resolution_changed;
> > > > +       if (*res_chg) {
> > > > +               mtk_vcodec_debug(inst, "- resolution changed -");
> > > > +               if (inst->vsi_ctx.dec.realloc_mv_buf) {
> > > > +                       err = alloc_mv_buf(inst, &(inst->ctx->picinfo));
> > > > +                       inst->vsi_ctx.dec.realloc_mv_buf = false;
> > > > +                       if (err)
> > > > +                               goto err_free_fb_out;
> > > > +               }
> > > > +               *res_chg = false;
> > > > +       }
> > > > +
> > > > +       memcpy(inst->vpu.vsi, &inst->vsi_ctx, sizeof(inst->vsi_ctx));
> > > > +       err = vpu_dec_start(vpu, data, 2);
> > >
> > > Then it seems these two values are passed to the firmware. Maybe you
> > > could test if that can be derived without the CPU mapping.
> > > That would allow you to set DMA_ATTR_NO_KERNEL_MAPPING.
> >
> > This one is a bit trickier. It seems the NAL type is passed as part of
> > the decode request to the firmware, which should absolutely not be
> > needed since the firmware can check this from the buffer itself. Just
> > for fun I have tried setting this parameter unconditionally to 0x1
> > (non-IDR picture) and all I get is green frames with seemingly random
> > garbage. If I set it to 0x5 (IDR picture) I also get green frames with
> > a different kind of garbage, and once in a while a properly rendered
> > frame (presumably when it is *really* an IDR frame).
>
> Can't you deduce this from the v4l2_ctrl_h264_slice_params.slice_type ?

This decoder is frame-based, so it doesn't receive a
v4l2_ctrl_h264_slice_params from user-space unfortunately. But
thankfully we can deduce this from the decode params.
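
A minimal sketch of that deduction, not the driver's actual code: in
frame-based mode the IDR flag and nal_ref_idc from the decode params are
enough to rebuild the one-byte NAL unit header that is currently read from
the CPU mapping and passed in data[1]. The struct, the constants and the
helper name below are local stand-ins for illustration only; real code
would use struct v4l2_ctrl_h264_decode_params and
V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC from the uAPI headers. Whether the
firmware accepts a synthesized header is an assumption that would need
testing.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC (illustration only). */
#define SKETCH_DECODE_PARAM_FLAG_IDR_PIC 0x01u

/* H.264 nal_unit_type values for coded slices. */
#define SKETCH_NAL_NON_IDR_SLICE 1u
#define SKETCH_NAL_IDR_SLICE     5u

/* Stand-in carrying only the two decode-params fields the sketch needs. */
struct sketch_decode_params {
	uint16_t nal_ref_idc;
	uint32_t flags;
};

/*
 * Rebuild the NAL unit header byte:
 *   forbidden_zero_bit (1 bit) | nal_ref_idc (2 bits) | nal_unit_type (5 bits)
 */
static inline uint8_t sketch_nal_header(const struct sketch_decode_params *p)
{
	uint8_t type = (p->flags & SKETCH_DECODE_PARAM_FLAG_IDR_PIC) ?
		       SKETCH_NAL_IDR_SLICE : SKETCH_NAL_NON_IDR_SLICE;

	assert(p->nal_ref_idc <= 3);	/* nal_ref_idc is a 2-bit field */
	return (uint8_t)((p->nal_ref_idc << 5) | type);
}
```

With something along these lines, the bitstream buffer would no longer
need to be read by the CPU just to obtain the NAL type, provided nothing
else in the decode path depends on the mapping.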

* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
  2021-03-17  3:13           ` Alexandre Courbot
@ 2021-03-17 15:09             ` Nicolas Dufresne
  -1 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-17 15:09 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Wednesday, 17 March 2021 at 12:13 +0900, Alexandre Courbot wrote:
> On Tue, Mar 16, 2021 at 6:45 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> > Hi Alexandre,
> > 
> > On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org>
> > wrote:
> > > 
> > > Hi Ezequiel, thanks for the feedback!
> > > 
> > > On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> > > <ezequiel@vanguardiasur.com.ar> wrote:
> > > > 
> > > > Hello Alex,
> > > > 
> > > > Thanks for the patch.
> > > > 
> > > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org>
> > > > wrote:
> > > > > 
> > > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > > 
> > > > > Support the stateless codec API that will be used by MT8183.
> > > > > 
> > > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > > [acourbot: refactor, cleanup and split]
> > > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > ---
> > > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > > > > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > > > > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > > > > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > > > > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > > > 
> > > > [..]
> > > > 
> > > > > +
> > > > > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > 
> > > > This "needed_in_request" is not really required, as controls
> > > > are not volatile, and their value is stored per-context (per-fd).
> > > > 
> > > > It's perfectly valid for an application to pass the SPS control
> > > > at the beginning of the sequence, and then omit it
> > > > in further requests.
> > > 
> > > If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> > > requests, this boolean only checks that the control has been provided
> > > at least once, and not that it is provided with every request. Without
> > > it we could send a frame to the firmware without e.g. setting an SPS,
> > > which would be a problem.
> > > 
> > 
> > As Nicolas points out, in V4L2 controls have an initial value,
> > so no control can be unset.
> 
> I see. So I guess the expectation is that failure will occur later as
> the firmware reports it cannot decode properly (or returns a corrupted
> frame). Thanks for the clarification.

That is identical to userspace passing bad values. We just don't want to force
userspace to pass controls that haven't changed. The control framework isn't
exactly free in CPU time.
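
To illustrate the point, here is a toy model (not the V4L2 control
framework itself) of per-context control storage: every control starts
from an initial value, and a request only overrides the controls it
actually carries. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_ctrl_id {
	TOY_CTRL_SPS,
	TOY_CTRL_PPS,
	TOY_CTRL_DECODE_PARAMS,
	TOY_CTRL_COUNT
};

/* Per-context (per-fd) store: every control always holds some value. */
struct toy_ctx {
	int32_t value[TOY_CTRL_COUNT];
};

/* A request carries only the controls userspace chose to set. */
struct toy_request {
	bool present[TOY_CTRL_COUNT];
	int32_t value[TOY_CTRL_COUNT];
};

/* Controls start from an initial value, so none can be "unset". */
static inline void toy_ctx_init(struct toy_ctx *ctx)
{
	assert(ctx);
	for (size_t i = 0; i < TOY_CTRL_COUNT; i++)
		ctx->value[i] = 0;
}

/* Apply a request: carried controls are updated, the rest keep their value. */
static inline void toy_apply_request(struct toy_ctx *ctx,
				     const struct toy_request *req)
{
	assert(ctx && req);
	for (size_t i = 0; i < TOY_CTRL_COUNT; i++)
		if (req->present[i])
			ctx->value[i] = req->value[i];
}
```

In this model a second request that carries only the decode params still
decodes against the SPS and PPS values set by the first one, which is why
requiring every control in every request is unnecessary.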

> 
> > 
> > > > 
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > > > > +                       .menu_skip_mask =
> > > > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > > > > +                               BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > > > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +       },
> > > > > +};
> > > > 
> > > > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > > > the driver supports. From a next patch, this case seems to be
> > > > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> > > 
> > > Indeed - I've added the control, thanks for catching this!
> > > 
> > > > 
> > > > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > > > +
> > > > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .type = MTK_FMT_DEC,
> > > > > +               .num_planes = 1,
> > > > > +       },
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > > > +               .type = MTK_FMT_FRAME,
> > > > > +               .num_planes = 2,
> > > > > +       },
> > > > > +};
> > > > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > > > +#define DEFAULT_OUT_FMT_IDX    0
> > > > > +#define DEFAULT_CAP_FMT_IDX    1
> > > > > +
> > > > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .stepwise = {
> > > > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > > > +               },
> > > > > +       },
> > > > > +};
> > > > > +
> > > > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > > > +
> > > > > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > > > > +                                              struct vdec_fb *fb)
> > > > > > +{
> > > > > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > > > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > > > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > > > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > > +
> > > > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > > +               unsigned int cap_c_size =
> > > > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > > +
> > > > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > > > +       }
> > > > > +}
> > > > > +
> > > > > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > > > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > > > > +{
> > > > > > +       struct mtk_video_dec_buf *framebuf =
> > > > > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > > > +
> > > > > +       pfb = &framebuf->frame_buffer;
> > > > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > > > 
> > > > Are you sure you need a CPU mapping? It seems strange.
> > > > I'll comment some more on the next patch(es).
> > > 
> > > I'll answer on the next patch since this is where that mapping is being
> > > used.
> > > 
> > > > 
> > > > > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > > +
> > > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > > > +               pfb->base_c.dma_addr =
> > > > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > > > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > > > +       }
> > > > > > +       mtk_v4l2_debug(1,
> > > > > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > > > +               dst_buf->index, pfb,
> > > > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > > > +               ctx->decoded_frame_cnt);
> > > > > +
> > > > > +       return pfb;
> > > > > +}
> > > > > +
> > > > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +
> > > > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > > > +}
> > > > > +
> > > > > +static int fops_media_request_validate(struct media_request *mreq)
> > > > > +{
> > > > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > > > +       struct media_request_object *req_obj;
> > > > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > > > +       struct v4l2_ctrl *ctrl;
> > > > > +       unsigned int i;
> > > > > +
> > > > > +       switch (buffer_cnt) {
> > > > > +       case 1:
> > > > > +               /* We expect exactly one buffer with the request */
> > > > > +               break;
> > > > > +       case 0:
> > > > > +               mtk_v4l2_err("No buffer provided with the request");
> > > > > +               return -ENOENT;
> > > > > +       default:
> > > > > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > > > +                            buffer_cnt);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > > > +               struct vb2_buffer *vb;
> > > > > +
> > > > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > > > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +                       break;
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       if (!ctx) {
> > > > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > > > +               return -ENOENT;
> > > > > +       }
> > > > > +
> > > > > +       parent_hdl = &ctx->ctrl_hdl;
> > > > > +
> > > > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > > > +       if (!hdl) {
> > > > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > > > +               return -ENOENT;
> > > > > +       }
> > > > > +
> > > > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > > > +                       continue;
> > > > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > > > +                       continue;
> > > > > +
> > > > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > > > +                                        mtk_stateless_controls[i].cfg.id);
> > > > > +               if (!ctrl) {
> > > > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > > > +                       return -ENOENT;
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > > > +
> > > > > +       return vb2_request_validate(mreq);
> > > > > +}
> > > > > +
> > > > > +static void mtk_vdec_worker(struct work_struct *work)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx =
> > > > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > > > +       struct vb2_buffer *vb2_src;
> > > > > +       struct mtk_vcodec_mem *bs_src;
> > > > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > > > +       struct media_request *src_buf_req;
> > > > > +       struct vdec_fb *dst_buf;
> > > > > +       bool res_chg = false;
> > > > > +       int ret;
> > > > > +
> > > > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > > > +       if (vb2_v4l2_src == NULL) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > > > +       if (vb2_v4l2_dst == NULL) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > > > +                                  m2m_buf.vb);
> > > > > +       bs_src = &dec_buf_src->bs_buffer;
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > > > +                       ctx->id, src_buf->vb2_queue->type,
> > > > > +                       src_buf->index, src_buf, src_buf_info);
> > > > > +
> > > > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > > > +       if (!bs_src->va) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > > > +                            vb2_src->index);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > > > +       /* Apply request controls. */
> > > > > +       src_buf_req = vb2_src->req_obj.req;
> > > > > +       if (src_buf_req)
> > > > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > > > +       else
> > > > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > > > +
> > > > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > > > +       if (ret) {
> > > > > +               mtk_v4l2_err(
> > > > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > > > +                       vb2_src->timestamp, ret, res_chg);
> > > > > +               if (ret == -EIO) {
> > > > > +                       mutex_lock(&ctx->lock);
> > > > > +                       dec_buf_src->error = true;
> > > > > +                       mutex_unlock(&ctx->lock);
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > > > +
> > > > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > > > +
> > > > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > > > +}
> > > > > +
> > > > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > > > +                       ctx->id, vb->vb2_queue->type,
> > > > > +                       vb->index, vb);
> > > > > +
> > > > > +       mutex_lock(&ctx->lock);
> > > > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > > > +       mutex_unlock(&ctx->lock);
> > > > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > > > +               return;
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > > > +
> > > > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > > > +       if (ctx->state == MTK_STATE_INIT) {
> > > > > +               ctx->state = MTK_STATE_HEADER;
> > > > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > > > 
> > > > This state thing seems just something to make the rest
> > > > of the stateful-based driver happy, right?
> > > 
> > > Correct - if anything we should either use more of the state here
> > > (i.e. set the error state when relevant) or move the state entirely in
> > > the stateful part of the driver.
> > > 
> > > > 
> > > > Makes me wonder a bit whether just splitting the stateless part into
> > > > its own driver wouldn't make your maintenance easier.
> > > > 
> > > > What's the motivation for sharing the driver?
> > > 
> > > Technically you could do it both ways. Separating the driver would
> > > result in some boilerplate code and buffer-management structs
> > > duplication (unless we keep the shared part under another module - but
> > > in this case we are basically in the same situation as now). Also
> > > despite using different userspace-facing ABIs, MT8173 and MT8183
> > > follow a similar architecture and a similar firmware interface.
> > > Considering these similarities it seems simpler from an architectural
> > > point of view to have all the Mediatek codec support under the same
> > > driver. It also probably results in less code.
> > > 
> > > That being said, the split can probably be improved as you pointed out
> > > with this state variable. But the current split is not too bad IMHO,
> > > at least not worse than how the code was originally.
> > > 
> > > > 
> > > > > +       } else {
> > > > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > > > +                               ctx->id, ctx->state);
> > > > > +       }
> > > > > +}
> > > > > +
> > > > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > > > +{
> > > > > +       bool res_chg;
> > > > > +
> > > > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > > > +}
> > > > > +
> > > > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > > > +};
> > > > > +
> > > > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > > > +{
> > > > > +       struct v4l2_ctrl *ctrl;
> > > > > +       unsigned int i;
> > > > > +
> > > > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > > > +       if (ctx->ctrl_hdl.error) {
> > > > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > > > +               return ctx->ctrl_hdl.error;
> > > > > +       }
> > > > > +
> > > > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > > > +                               0, 32, 1, 1);
> > > > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > > > 
> > > > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > > > to return the DPB size. However, isn't this something userspace already
> > > > knows?
> > > 
> > > True, but that's also a control the driver is supposed to provide per
> > > the spec IIUC.
> > > 
> > 
> > I don't see the specification requiring this control. TBH, I'd just drop it
> > and if needed fix the application to support this as an optional
> > control.
> > 
> > In any case, stateless devices should just need 1 output and 1 capture
> > buffer.
> 
> Mmm, you're correct indeed, and checking with our user-space it does
> not rely on this control for stateless codecs. Moving this control to
> the stateful part of the driver.
> 
> 
> > 
> > You might dislike this redundancy, but note that you can also get the
> > minimum number of required buffers through VIDIOC_REQBUFS, where the
> > v4l2_requestbuffers.count field is returned to userspace with the
> > number of allocated buffers.
> > 
> > If you request just 1 buffer, and your driver needed 3, you should
> > get a 3 there (vb2_ops.queue_setup takes care of that).
> > 
> > Thanks,
> > Ezequiel




* Re: [PATCH v3 05/15] media: mtk-vcodec: vdec: support stateless API
@ 2021-03-17 15:09             ` Nicolas Dufresne
  0 siblings, 0 replies; 56+ messages in thread
From: Nicolas Dufresne @ 2021-03-17 15:09 UTC (permalink / raw)
  To: Alexandre Courbot, Ezequiel Garcia
  Cc: Tiffany Lin, Andrew-CT Chen, Rob Herring, Yunfei Dong,
	Mauro Carvalho Chehab, Hans Verkuil, linux-media,
	Linux Kernel Mailing List,
	moderated list:ARM/Mediatek SoC support

On Wednesday, 17 March 2021 at 12:13 +0900, Alexandre Courbot wrote:
> On Tue, Mar 16, 2021 at 6:45 AM Ezequiel Garcia
> <ezequiel@vanguardiasur.com.ar> wrote:
> > 
> > Hi Alexandre,
> > 
> > On Mon, 15 Mar 2021 at 08:28, Alexandre Courbot <acourbot@chromium.org>
> > wrote:
> > > 
> > > Hi Ezequiel, thanks for the feedback!
> > > 
> > > On Thu, Mar 4, 2021 at 6:30 AM Ezequiel Garcia
> > > <ezequiel@vanguardiasur.com.ar> wrote:
> > > > 
> > > > Hello Alex,
> > > > 
> > > > Thanks for the patch.
> > > > 
> > > > On Fri, 26 Feb 2021 at 07:06, Alexandre Courbot <acourbot@chromium.org>
> > > > wrote:
> > > > > 
> > > > > From: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > > 
> > > > > Support the stateless codec API that will be used by MT8183.
> > > > > 
> > > > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > > > [acourbot: refactor, cleanup and split]
> > > > > Co-developed-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > Signed-off-by: Alexandre Courbot <acourbot@chromium.org>
> > > > > ---
> > > > >  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
> > > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |  66 ++-
> > > > >  .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   9 +-
> > > > >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 427 ++++++++++++++++++
> > > > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   3 +
> > > > >  5 files changed, 503 insertions(+), 3 deletions(-)
> > > > >  create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > > > > 
> > > > [..]
> > > > 
> > > > > +
> > > > > +static const struct mtk_stateless_control mtk_stateless_controls[] = {
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_SPS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > 
> > > > This "needed_in_request" is not really required, as controls
> > > > are not volatile, and their value is stored per-context (per-fd).
> > > > 
> > > > It's perfectly valid for an application to pass the SPS control
> > > > at the beginning of the sequence, and then omit it
> > > > in further requests.
> > > 
> > > If I understand how v4l2_ctrl_request_hdl_ctrl_find() works with
> > > requests, this boolean only checks that the control has been provided
> > > at least once, and not that it is provided with every request. Without
> > > it we could send a frame to the firmware without e.g. setting an SPS,
> > > which would be a problem.
> > > 
> > 
> > As Nicolas points out, in V4L2 controls have an initial value,
> > so no control can be unset.
> 
> I see. So I guess the expectation is that failure will occur later as
> the firmware reports it cannot decode properly (or returns a corrupted
> frame). Thanks for the clarification.

That is identical to userspace passing bad values. We just don't want to force
userspace to pass controls that haven't changed. The control framework isn't
exactly free in CPU time.
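
As an illustration only (a minimal sketch with made-up names such as
ctx_controls and apply_request, not the control framework's actual data
structures), the behaviour can be modelled as a per-context store where every
control starts with an initial value and a request only overwrites the controls
it actually carries:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not V4L2 types. */
enum ctrl_id { CTRL_SPS, CTRL_PPS, CTRL_DECODE_PARAMS, CTRL_COUNT };

struct ctx_controls {
	int value[CTRL_COUNT];		/* last value stored per context */
};

struct request {
	bool present[CTRL_COUNT];	/* which controls this request carries */
	int value[CTRL_COUNT];
};

/* Give every control an initial value, so none is ever "unset". */
void ctx_controls_init(struct ctx_controls *ctx)
{
	assert(ctx != NULL);
	for (size_t i = 0; i < CTRL_COUNT; i++)
		ctx->value[i] = 0;
}

/*
 * Apply a request: only the controls present in it are updated, the
 * others keep the value already stored in the context.
 * Returns the number of controls that were updated.
 */
int apply_request(struct ctx_controls *ctx, const struct request *req)
{
	int updated = 0;

	assert(ctx != NULL && req != NULL);
	for (size_t i = 0; i < CTRL_COUNT; i++) {
		if (!req->present[i])
			continue;
		ctx->value[i] = req->value[i];
		updated++;
	}
	return updated;
}
```

With this model, a first request carrying SPS, PPS and decode params followed
by a second one carrying only decode params leaves the stored SPS and PPS
untouched, which is why userspace does not have to resend them.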

> 
> > 
> > > > 
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_PPS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_SCALING_MATRIX,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_PARAMS,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .needed_in_request = true,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_MPEG_VIDEO_H264_PROFILE,
> > > > > +                       .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
> > > > > +                       .max = V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
> > > > > +                       .menu_skip_mask =
> > > > > +                              BIT(V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
> > > > > +                              BIT(V4L2_MPEG_VIDEO_H264_PROFILE_EXTENDED),
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +       },
> > > > > +       {
> > > > > +               .cfg = {
> > > > > +                       .id = V4L2_CID_STATELESS_H264_DECODE_MODE,
> > > > > +                       .min = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > +                       .def = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > +                       .max = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
> > > > > +               },
> > > > > +               .codec_type = V4L2_PIX_FMT_H264_SLICE,
> > > > > +       },
> > > > > +};
> > > > 
> > > > Applications also need to know which V4L2_CID_STATELESS_H264_START_CODE
> > > > value the driver supports. Judging from a later patch, this seems to be
> > > > V4L2_STATELESS_H264_START_CODE_ANNEX_B.
> > > 
> > > Indeed - I've added the control, thanks for catching this!
> > > 
> > > > 
> > > > > +#define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> > > > > +
> > > > > +static const struct mtk_video_fmt mtk_video_formats[] = {
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .type = MTK_FMT_DEC,
> > > > > +               .num_planes = 1,
> > > > > +       },
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_MM21,
> > > > > +               .type = MTK_FMT_FRAME,
> > > > > +               .num_planes = 2,
> > > > > +       },
> > > > > +};
> > > > > +#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > > > > +#define DEFAULT_OUT_FMT_IDX    0
> > > > > +#define DEFAULT_CAP_FMT_IDX    1
> > > > > +
> > > > > +static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > > > > +       {
> > > > > +               .fourcc = V4L2_PIX_FMT_H264_SLICE,
> > > > > +               .stepwise = {
> > > > > +                       MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > > > > +                       MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16,
> > > > > +               },
> > > > > +       },
> > > > > +};
> > > > > +
> > > > > +#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > > > > +
> > > > > +static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> > > > > +                                              struct vdec_fb *fb)
> > > > > +{
> > > > > +       struct mtk_video_dec_buf *vdec_frame_buf =
> > > > > +               container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> > > > > +       struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> > > > > +       unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > > +
> > > > > +       vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> > > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > > +               unsigned int cap_c_size =
> > > > > +                       ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > > +
> > > > > +               vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> > > > > +       }
> > > > > +}
> > > > > +
> > > > > +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> > > > > +                                          struct vb2_v4l2_buffer *vb2_v4l2)
> > > > > +{
> > > > > +       struct mtk_video_dec_buf *framebuf =
> > > > > +               container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> > > > > +       struct vdec_fb *pfb = &framebuf->frame_buffer;
> > > > > +       struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> > > > > +
> > > > > +       pfb = &framebuf->frame_buffer;
> > > > > +       pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
> > > > 
> > > > Are you sure you need a CPU mapping? It seems strange.
> > > > I'll comment some more on the next patch(es).
> > > 
> > > I'll answer on the next patch since this is where that mapping is being
> > > used.
> > > 
> > > > 
> > > > > +       pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
> > > > > +       pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> > > > > +
> > > > > +       if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> > > > > +               pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
> > > > > +               pfb->base_c.dma_addr =
> > > > > +                       vb2_dma_contig_plane_dma_addr(dst_buf, 1);
> > > > > +               pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> > > > > +       }
> > > > > +       mtk_v4l2_debug(1,
> > > > > +               "id=%d Framebuf  pfb=%p VA=%p Y_DMA=%pad C_DMA=%pad Size=%zx frame_count = %d",
> > > > > +               dst_buf->index, pfb,
> > > > > +               pfb->base_y.va, &pfb->base_y.dma_addr,
> > > > > +               &pfb->base_c.dma_addr, pfb->base_y.size,
> > > > > +               ctx->decoded_frame_cnt);
> > > > > +
> > > > > +       return pfb;
> > > > > +}
> > > > > +
> > > > > +static void vb2ops_vdec_buf_request_complete(struct vb2_buffer *vb)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +
> > > > > +       v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->ctrl_hdl);
> > > > > +}
> > > > > +
> > > > > +static int fops_media_request_validate(struct media_request *mreq)
> > > > > +{
> > > > > +       const unsigned int buffer_cnt = vb2_request_buffer_cnt(mreq);
> > > > > +       struct mtk_vcodec_ctx *ctx = NULL;
> > > > > +       struct media_request_object *req_obj;
> > > > > +       struct v4l2_ctrl_handler *parent_hdl, *hdl;
> > > > > +       struct v4l2_ctrl *ctrl;
> > > > > +       unsigned int i;
> > > > > +
> > > > > +       switch (buffer_cnt) {
> > > > > +       case 1:
> > > > > +               /* We expect exactly one buffer with the request */
> > > > > +               break;
> > > > > +       case 0:
> > > > > +               mtk_v4l2_err("No buffer provided with the request");
> > > > > +               return -ENOENT;
> > > > > +       default:
> > > > > +               mtk_v4l2_err("Too many buffers (%d) provided with the request",
> > > > > +                            buffer_cnt);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +
> > > > > +       list_for_each_entry(req_obj, &mreq->objects, list) {
> > > > > +               struct vb2_buffer *vb;
> > > > > +
> > > > > +               if (vb2_request_object_is_buffer(req_obj)) {
> > > > > +                       vb = container_of(req_obj, struct vb2_buffer, req_obj);
> > > > > +                       ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +                       break;
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       if (!ctx) {
> > > > > +               mtk_v4l2_err("Cannot find buffer for request");
> > > > > +               return -ENOENT;
> > > > > +       }
> > > > > +
> > > > > +       parent_hdl = &ctx->ctrl_hdl;
> > > > > +
> > > > > +       hdl = v4l2_ctrl_request_hdl_find(mreq, parent_hdl);
> > > > > +       if (!hdl) {
> > > > > +               mtk_v4l2_err("Cannot find control handler for request\n");
> > > > > +               return -ENOENT;
> > > > > +       }
> > > > > +
> > > > > +       for (i = 0; i < NUM_CTRLS; i++) {
> > > > > +               if (mtk_stateless_controls[i].codec_type != ctx->current_codec)
> > > > > +                       continue;
> > > > > +               if (!mtk_stateless_controls[i].needed_in_request)
> > > > > +                       continue;
> > > > > +
> > > > > +               ctrl = v4l2_ctrl_request_hdl_ctrl_find(hdl,
> > > > > +                                        mtk_stateless_controls[i].cfg.id);
> > > > > +               if (!ctrl) {
> > > > > +                       mtk_v4l2_err("Missing required codec control\n");
> > > > > +                       return -ENOENT;
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       v4l2_ctrl_request_hdl_put(hdl);
> > > > > +
> > > > > +       return vb2_request_validate(mreq);
> > > > > +}
> > > > > +
> > > > > +static void mtk_vdec_worker(struct work_struct *work)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx =
> > > > > +               container_of(work, struct mtk_vcodec_ctx, decode_work);
> > > > > +       struct mtk_vcodec_dev *dev = ctx->dev;
> > > > > +       struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> > > > > +       struct vb2_buffer *vb2_src;
> > > > > +       struct mtk_vcodec_mem *bs_src;
> > > > > +       struct mtk_video_dec_buf *dec_buf_src;
> > > > > +       struct media_request *src_buf_req;
> > > > > +       struct vdec_fb *dst_buf;
> > > > > +       bool res_chg = false;
> > > > > +       int ret;
> > > > > +
> > > > > +       vb2_v4l2_src = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
> > > > > +       if (vb2_v4l2_src == NULL) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_debug(1, "[%d] no available source buffer", ctx->id);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> > > > > +       if (vb2_v4l2_dst == NULL) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       vb2_src = &vb2_v4l2_src->vb2_buf;
> > > > > +       dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
> > > > > +                                  m2m_buf.vb);
> > > > > +       bs_src = &dec_buf_src->bs_buffer;
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p buf_info = %p",
> > > > > +                       ctx->id, src_buf->vb2_queue->type,
> > > > > +                       src_buf->index, src_buf, src_buf_info);
> > > > > +
> > > > > +       bs_src->va = vb2_plane_vaddr(vb2_src, 0);
> > > > > +       bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
> > > > > +       bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> > > > > +       if (!bs_src->va) {
> > > > > +               v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> > > > > +               mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> > > > > +                            vb2_src->index);
> > > > > +               return;
> > > > > +       }
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
> > > > > +                       ctx->id, buf->va, &buf->dma_addr, buf->size, src_buf);
> > > > > +       /* Apply request controls. */
> > > > > +       src_buf_req = vb2_src->req_obj.req;
> > > > > +       if (src_buf_req)
> > > > > +               v4l2_ctrl_request_setup(src_buf_req, &ctx->ctrl_hdl);
> > > > > +       else
> > > > > +               mtk_v4l2_err("vb2 buffer media request is NULL");
> > > > > +
> > > > > +       dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> > > > > +       v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> > > > > +       ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> > > > > +       if (ret) {
> > > > > +               mtk_v4l2_err(
> > > > > +                       " <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
> > > > > +                       ctx->id, vb2_src->index, bs_src->size,
> > > > > +                       vb2_src->timestamp, ret, res_chg);
> > > > > +               if (ret == -EIO) {
> > > > > +                       mutex_lock(&ctx->lock);
> > > > > +                       dec_buf_src->error = true;
> > > > > +                       mutex_unlock(&ctx->lock);
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> > > > > +
> > > > > +       v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> > > > > +               ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> > > > > +
> > > > > +       v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
> > > > > +}
> > > > > +
> > > > > +static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> > > > > +{
> > > > > +       struct mtk_vcodec_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
> > > > > +       struct vb2_v4l2_buffer *vb2_v4l2 = to_vb2_v4l2_buffer(vb);
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p",
> > > > > +                       ctx->id, vb->vb2_queue->type,
> > > > > +                       vb->index, vb);
> > > > > +
> > > > > +       mutex_lock(&ctx->lock);
> > > > > +       v4l2_m2m_buf_queue(ctx->m2m_ctx, vb2_v4l2);
> > > > > +       mutex_unlock(&ctx->lock);
> > > > > +       if (vb->vb2_queue->type != V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
> > > > > +               return;
> > > > > +
> > > > > +       mtk_v4l2_debug(3, "(%d) id=%d, bs=%p",
> > > > > +               vb->vb2_queue->type, vb->index, src_buf);
> > > > > +
> > > > > +       /* If an OUTPUT buffer, we may need to update the state */
> > > > > +       if (ctx->state == MTK_STATE_INIT) {
> > > > > +               ctx->state = MTK_STATE_HEADER;
> > > > > +               mtk_v4l2_debug(1, "Init driver from init to header.");
> > > > 
> > > > This state thing seems just something to make the rest
> > > > of the stateful-based driver happy, right?
> > > 
> > > Correct - if anything we should either use more of the state here
> > > (i.e. set the error state when relevant) or move the state entirely in
> > > the stateful part of the driver.
> > > 
> > > > 
> > > > Makes me wonder a bit whether just splitting the stateless part into
> > > > its own driver wouldn't make your maintenance easier.
> > > > 
> > > > What's the motivation for sharing the driver?
> > > 
> > > Technically you could do it both ways. Separating the driver would
> > > result in some boilerplate code and buffer-management structs
> > > duplication (unless we keep the shared part under another module - but
> > > in this case we are basically in the same situation as now). Also
> > > despite using different userspace-facing ABIs, MT8173 and MT8183
> > > follow a similar architecture and a similar firmware interface.
> > > Considering these similarities it seems simpler from an architectural
> > > point of view to have all the Mediatek codec support under the same
> > > driver. It also probably results in less code.
> > > 
> > > That being said, the split can probably be improved as you pointed out
> > > with this state variable. But the current split is not too bad IMHO,
> > > at least not worse than how the code was originally.
> > > 
> > > > 
> > > > > +       } else {
> > > > > +               mtk_v4l2_debug(3, "[%d] already init driver %d",
> > > > > +                               ctx->id, ctx->state);
> > > > > +       }
> > > > > +}
> > > > > +
> > > > > +static int mtk_vdec_flush_decoder(struct mtk_vcodec_ctx *ctx)
> > > > > +{
> > > > > +       bool res_chg;
> > > > > +
> > > > > +       return vdec_if_decode(ctx, NULL, NULL, &res_chg);
> > > > > +}
> > > > > +
> > > > > +static const struct v4l2_ctrl_ops mtk_vcodec_dec_ctrl_ops = {
> > > > > +       .g_volatile_ctrl = mtk_vdec_g_v_ctrl,
> > > > > +};
> > > > > +
> > > > > +static int mtk_vcodec_dec_ctrls_setup(struct mtk_vcodec_ctx *ctx)
> > > > > +{
> > > > > +       struct v4l2_ctrl *ctrl;
> > > > > +       unsigned int i;
> > > > > +
> > > > > +       v4l2_ctrl_handler_init(&ctx->ctrl_hdl, NUM_CTRLS);
> > > > > +       if (ctx->ctrl_hdl.error) {
> > > > > +               mtk_v4l2_err("v4l2_ctrl_handler_init failed\n");
> > > > > +               return ctx->ctrl_hdl.error;
> > > > > +       }
> > > > > +
> > > > > +       ctrl = v4l2_ctrl_new_std(&ctx->ctrl_hdl,
> > > > > +                               &mtk_vcodec_dec_ctrl_ops,
> > > > > +                               V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
> > > > > +                               0, 32, 1, 1);
> > > > > +       ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
> > > > 
> > > > Hm, this volatile control for MIN_BUFFERS_FOR_CAPTURE seems
> > > > to return the DPB size. However, isn't this something userspace already
> > > > knows?
> > > 
> > > True, but that's also a control the driver is supposed to provide per
> > > the spec IIUC.
> > > 
> > 
> > I don't see the specification requiring this control. TBH, I'd just drop it
> > and if needed fix the application to support this as an optional
> > control.
> > 
> > In any case, stateless devices should just need 1 output and 1 capture
> > buffer.
> 
> Mmm, you're correct indeed, and checking with our user-space it does
> not rely on this control for stateless codecs. Moving this control to
> the stateful part of the driver.
> 
> 
> > 
> > You might dislike this redundancy, but note that you can also get the
> > minimum number of required buffers through VIDIOC_REQBUFS, where the
> > v4l2_requestbuffers.count field is returned to userspace with the
> > number of allocated buffers.
> > 
> > If you request just 1 buffer, and your driver needed 3, you should
> > get a 3 there (vb2_ops.queue_setup takes care of that).
> > 
> > Thanks,
> > Ezequiel


