linux-kernel.vger.kernel.org archive mirror
* [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder
@ 2022-02-23  3:39 Yunfei Dong
  2022-02-23  3:39 ` [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers Yunfei Dong
                   ` (14 more replies)
  0 siblings, 15 replies; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

This series adds h264/vp8/vp9 decoder drivers for mt8192. Firstly, refactor the
power/clock/interrupt interfaces, because mt8192 uses a lat and core architecture.

Secondly, add new functions to get the frame buffer size and resolution according
to the decoder capability reported by the scp side. Then add callback functions to
get/put capture buffers so that the lat and core decoders can run in parallel.

Then add support for the MT21C compressed mode and fix v4l2-compliance failures.

Next, extract the common H264 request API code so that mt8183 and mt8192 share the
same code, and add the mt8192 frame-based h264 driver for the stateless decoder.

Lastly, add vp8 and vp9 stateless decoder drivers.

Patch 1 refactors the power/clock/interrupt interface.
Patches 2~4 get the frame buffer size and resolution according to decoder capability.
Patches 5~6 enable lat and core decode in parallel.
Patches 7~10 add support for the MT21C compressed mode and fix v4l2-compliance failures.
Patch 11 records the capture queue format type.
Patches 12~13 extract the h264 driver and add the mt8192 frame-based driver for the h264 decoder.
Patches 14~15 add vp8 and vp9 stateless decoder drivers.
---
changes compared with v6:
- rebase to the latest media stage and fix conflicts
- replace memcpy with memcpy_fromio or memcpy_toio
- fix h264 crash when testing field bitstreams
changes compared with v5:
- fix vp9 comments for patch 15
- fix vp8 comments for patch 14.
- fix comments for patch 12.
- fix build errors.
changes compared with v4:
- fix checkpatch.pl failures.
- fix kernel-doc failures.
- rebase to the latest media codec driver.
changes compared with v3:
- remove enum mtk_chip for patch 2.
- add vp8 stateless decoder drivers for patch 14.
- add vp9 stateless decoder drivers for patch 15.
changes compared with v2:
- add new patch 11 to record capture queue format type.
- split patch 4 according to Tzung-Bi's suggestion.
- rewrite commit message for patch 5 according to Tzung-Bi's suggestion.
changes compared with v1:
- rewrite commit message for patch 12.
- rewrite cover-letter message.
---
Yunfei Dong (15):
  media: mtk-vcodec: Add vdec enable/disable hardware helpers
  media: mtk-vcodec: Using firmware type to separate different firmware
    architecture
  media: mtk-vcodec: get capture queue buffer size from scp
  media: mtk-vcodec: Read max resolution from dec_capability
  media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer
    buffered
  media: mtk-vcodec: Refactor get and put capture buffer flow
  media: mtk-vcodec: Refactor supported vdec formats and framesizes
  media: mtk-vcodec: Add format to support MT21C
  media: mtk-vcodec: disable vp8 4K capability
  media: mtk-vcodec: Fix v4l2-compliance fail
  media: mtk-vcodec: record capture queue format type
  media: mtk-vcodec: Extract H264 common code
  media: mtk-vcodec: support stateless H.264 decoding for mt8192
  media: mtk-vcodec: support stateless VP8 decoding
  media: mtk-vcodec: support stateless VP9 decoding

 drivers/media/platform/mtk-vcodec/Makefile    |    4 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      |   47 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |    5 -
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   |  166 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.h   |    6 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |   14 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |  282 ++-
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   40 +-
 .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |    5 -
 .../media/platform/mtk-vcodec/mtk_vcodec_fw.c |    6 +
 .../media/platform/mtk-vcodec/mtk_vcodec_fw.h |    1 +
 .../mtk-vcodec/vdec/vdec_h264_req_common.c    |  310 +++
 .../mtk-vcodec/vdec/vdec_h264_req_common.h    |  253 +++
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        |  440 +---
 .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  |  621 ++++++
 .../mtk-vcodec/vdec/vdec_vp8_req_if.c         |  445 ++++
 .../mtk-vcodec/vdec/vdec_vp9_req_lat_if.c     | 1971 +++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   36 +-
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |    3 +
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |   36 +
 .../platform/mtk-vcodec/vdec_msg_queue.c      |    2 +
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   |   53 +-
 .../media/platform/mtk-vcodec/vdec_vpu_if.h   |   15 +
 .../media/platform/mtk-vcodec/venc_vpu_if.c   |    2 +-
 include/linux/remoteproc/mtk_scp.h            |    2 +
 25 files changed, 4180 insertions(+), 585 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c

-- 
2.25.1



* [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-02-25  9:23   ` AngeloGioacchino Del Regno
  2022-02-23  3:39 ` [PATCH v7, 02/15] media: mtk-vcodec: Using firmware type to separate different firmware architecture Yunfei Dong
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC

Lock, power and clock are highly coupled operations. Add vdec
enable/disable hardware helpers and use them.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Reviewed-by: Tzung-Bi Shih <tzungbi@google.com>
---
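The ordering that the new helpers enforce can be illustrated with a small
userspace sketch. This is not driver code: struct mock_hw and the mock_*
functions are hypothetical stand-ins, and the extra LAT0 power-up that the
patch performs when the core is enabled is left out for brevity. It only
shows that enable runs lock, power, clock, irq and that disable undoes
them in reverse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one decoder hardware block (not a kernel struct).
 * Must be zero-initialised so that log starts out as an empty string. */
struct mock_hw {
	bool locked;
	bool powered;
	bool clocked;
	bool irq_enabled;
	char log[128];	/* comma-separated record of the steps taken */
};

/* Append one step name to the log. */
static void mock_step(struct mock_hw *hw, const char *step)
{
	size_t used = strlen(hw->log);

	snprintf(hw->log + used, sizeof(hw->log) - used, "%s%s",
		 used ? "," : "", step);
}

/* Same order as mtk_vcodec_dec_enable_hardware(): lock, power, clock, irq. */
static void mock_enable_hardware(struct mock_hw *hw)
{
	assert(!hw->locked);	/* models the non-recursive mutex */
	hw->locked = true;
	mock_step(hw, "lock");
	hw->powered = true;
	mock_step(hw, "pw_on");
	hw->clocked = true;
	mock_step(hw, "clk_on");
	hw->irq_enabled = true;
	mock_step(hw, "irq_on");
}

/* Reverse order, as in mtk_vcodec_dec_disable_hardware(). */
static void mock_disable_hardware(struct mock_hw *hw)
{
	assert(hw->locked);
	hw->irq_enabled = false;
	mock_step(hw, "irq_off");
	hw->clocked = false;
	mock_step(hw, "clk_off");
	hw->powered = false;
	mock_step(hw, "pw_off");
	hw->locked = false;
	mock_step(hw, "unlock");
}
```

Callers then wrap each hardware access in one enable/disable pair, which is
the pattern the vdec_drv_if.c hunks below adopt.
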
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |   5 -
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 166 +++++++++++-------
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.h   |   6 +-
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |  20 +--
 .../platform/mtk-vcodec/vdec_msg_queue.c      |   2 +
 5 files changed, 116 insertions(+), 83 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 8d11510e441e..82796369f101 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -195,9 +195,6 @@ static int fops_vcodec_open(struct file *file)
 	mtk_vcodec_dec_set_default_params(ctx);
 
 	if (v4l2_fh_is_singular(&ctx->fh)) {
-		ret = mtk_vcodec_dec_pw_on(dev, MTK_VDEC_LAT0);
-		if (ret < 0)
-			goto err_load_fw;
 		/*
 		 * Does nothing if firmware was already loaded.
 		 */
@@ -254,8 +251,6 @@ static int fops_vcodec_release(struct file *file)
 	v4l2_m2m_ctx_release(ctx->m2m_ctx);
 	mtk_vcodec_dec_release(ctx);
 
-	if (v4l2_fh_is_singular(&ctx->fh))
-		mtk_vcodec_dec_pw_off(dev, MTK_VDEC_LAT0);
 	v4l2_fh_del(&ctx->fh);
 	v4l2_fh_exit(&ctx->fh);
 	v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 7e0c2644bf7b..0fb7e5ba635b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -57,74 +57,31 @@ int mtk_vcodec_init_dec_clk(struct platform_device *pdev, struct mtk_vcodec_pm *
 }
 EXPORT_SYMBOL_GPL(mtk_vcodec_init_dec_clk);
 
-int mtk_vcodec_dec_pw_on(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+static int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm)
 {
-	struct mtk_vdec_hw_dev *subdev_dev;
-	struct mtk_vcodec_pm *pm;
 	int ret;
 
-	if (vdec_dev->vdec_pdata->is_subdev_supported) {
-		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
-		if (!subdev_dev) {
-			mtk_v4l2_err("Failed to get hw dev\n");
-			return -EINVAL;
-		}
-		pm = &subdev_dev->pm;
-	} else {
-		pm = &vdec_dev->pm;
-	}
-
 	ret = pm_runtime_resume_and_get(pm->dev);
 	if (ret)
 		mtk_v4l2_err("pm_runtime_resume_and_get fail %d", ret);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(mtk_vcodec_dec_pw_on);
 
-void mtk_vcodec_dec_pw_off(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+static void mtk_vcodec_dec_pw_off(struct mtk_vcodec_pm *pm)
 {
-	struct mtk_vdec_hw_dev *subdev_dev;
-	struct mtk_vcodec_pm *pm;
 	int ret;
 
-	if (vdec_dev->vdec_pdata->is_subdev_supported) {
-		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
-		if (!subdev_dev) {
-			mtk_v4l2_err("Failed to get hw dev\n");
-			return;
-		}
-		pm = &subdev_dev->pm;
-	} else {
-		pm = &vdec_dev->pm;
-	}
-
 	ret = pm_runtime_put_sync(pm->dev);
 	if (ret)
 		mtk_v4l2_err("pm_runtime_put_sync fail %d", ret);
 }
-EXPORT_SYMBOL_GPL(mtk_vcodec_dec_pw_off);
 
-void mtk_vcodec_dec_clock_on(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+static void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
 {
-	struct mtk_vdec_hw_dev *subdev_dev;
-	struct mtk_vcodec_pm *pm;
 	struct mtk_vcodec_clk *dec_clk;
 	int ret, i;
 
-	if (vdec_dev->vdec_pdata->is_subdev_supported) {
-		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
-		if (!subdev_dev) {
-			mtk_v4l2_err("Failed to get hw dev\n");
-			return;
-		}
-		pm = &subdev_dev->pm;
-		enable_irq(subdev_dev->dec_irq);
-	} else {
-		pm = &vdec_dev->pm;
-		enable_irq(vdec_dev->dec_irq);
-	}
-
 	dec_clk = &pm->vdec_clk;
 	for (i = 0; i < dec_clk->clk_num; i++) {
 		ret = clk_prepare_enable(dec_clk->clk_info[i].vcodec_clk);
@@ -140,30 +97,119 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
 	for (i -= 1; i >= 0; i--)
 		clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
 }
-EXPORT_SYMBOL_GPL(mtk_vcodec_dec_clock_on);
 
-void mtk_vcodec_dec_clock_off(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+static void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
 {
-	struct mtk_vdec_hw_dev *subdev_dev;
-	struct mtk_vcodec_pm *pm;
 	struct mtk_vcodec_clk *dec_clk;
 	int i;
 
+	dec_clk = &pm->vdec_clk;
+	for (i = dec_clk->clk_num - 1; i >= 0; i--)
+		clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
+}
+
+static void mtk_vcodec_dec_enable_irq(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+{
+	struct mtk_vdec_hw_dev *subdev_dev;
+
+	if (!test_bit(hw_idx, vdec_dev->subdev_bitmap))
+		return;
+
 	if (vdec_dev->vdec_pdata->is_subdev_supported) {
 		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
-		if (!subdev_dev) {
+		if (subdev_dev)
+			enable_irq(subdev_dev->dec_irq);
+		else
+			mtk_v4l2_err("Failed to get hw dev\n");
+	} else {
+		enable_irq(vdec_dev->dec_irq);
+	}
+}
+
+static void mtk_vcodec_dec_disable_irq(struct mtk_vcodec_dev *vdec_dev, int hw_idx)
+{
+	struct mtk_vdec_hw_dev *subdev_dev;
+
+	if (!test_bit(hw_idx, vdec_dev->subdev_bitmap))
+		return;
+
+	if (vdec_dev->vdec_pdata->is_subdev_supported) {
+		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
+		if (subdev_dev)
+			disable_irq(subdev_dev->dec_irq);
+		else
 			mtk_v4l2_err("Failed to get hw dev\n");
-			return;
-		}
-		pm = &subdev_dev->pm;
-		disable_irq(subdev_dev->dec_irq);
 	} else {
-		pm = &vdec_dev->pm;
 		disable_irq(vdec_dev->dec_irq);
 	}
+}
 
-	dec_clk = &pm->vdec_clk;
-	for (i = dec_clk->clk_num - 1; i >= 0; i--)
-		clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
+static struct mtk_vcodec_pm *mtk_vcodec_dec_get_pm(struct mtk_vcodec_dev *vdec_dev,
+						   int hw_idx)
+{
+	struct mtk_vdec_hw_dev *subdev_dev;
+
+	if (!test_bit(hw_idx, vdec_dev->subdev_bitmap))
+		return NULL;
+
+	if (vdec_dev->vdec_pdata->is_subdev_supported) {
+		subdev_dev = mtk_vcodec_get_hw_dev(vdec_dev, hw_idx);
+		if (subdev_dev)
+			return &subdev_dev->pm;
+
+		mtk_v4l2_err("Failed to get hw dev\n");
+		return NULL;
+	}
+
+	return &vdec_dev->pm;
+}
+
+static void mtk_vcodec_dec_child_dev_on(struct mtk_vcodec_dev *vdec_dev,
+					int hw_idx)
+{
+	struct mtk_vcodec_pm *pm;
+
+	pm = mtk_vcodec_dec_get_pm(vdec_dev, hw_idx);
+	if (pm) {
+		mtk_vcodec_dec_pw_on(pm);
+		mtk_vcodec_dec_clock_on(pm);
+	}
+}
+
+static void mtk_vcodec_dec_child_dev_off(struct mtk_vcodec_dev *vdec_dev,
+					 int hw_idx)
+{
+	struct mtk_vcodec_pm *pm;
+
+	pm = mtk_vcodec_dec_get_pm(vdec_dev, hw_idx);
+	if (pm) {
+		mtk_vcodec_dec_clock_off(pm);
+		mtk_vcodec_dec_pw_off(pm);
+	}
+}
+
+void mtk_vcodec_dec_enable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx)
+{
+	mutex_lock(&ctx->dev->dec_mutex[hw_idx]);
+
+	if (IS_VDEC_LAT_ARCH(ctx->dev->vdec_pdata->hw_arch) &&
+	    hw_idx == MTK_VDEC_CORE)
+		mtk_vcodec_dec_child_dev_on(ctx->dev, MTK_VDEC_LAT0);
+	mtk_vcodec_dec_child_dev_on(ctx->dev, hw_idx);
+
+	mtk_vcodec_dec_enable_irq(ctx->dev, hw_idx);
+}
+EXPORT_SYMBOL_GPL(mtk_vcodec_dec_enable_hardware);
+
+void mtk_vcodec_dec_disable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx)
+{
+	mtk_vcodec_dec_disable_irq(ctx->dev, hw_idx);
+
+	mtk_vcodec_dec_child_dev_off(ctx->dev, hw_idx);
+	if (IS_VDEC_LAT_ARCH(ctx->dev->vdec_pdata->hw_arch) &&
+	    hw_idx == MTK_VDEC_CORE)
+		mtk_vcodec_dec_child_dev_off(ctx->dev, MTK_VDEC_LAT0);
+
+	mutex_unlock(&ctx->dev->dec_mutex[hw_idx]);
 }
-EXPORT_SYMBOL_GPL(mtk_vcodec_dec_clock_off);
+EXPORT_SYMBOL_GPL(mtk_vcodec_dec_disable_hardware);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.h
index 3cc721bbfaf6..dbcf3cabe6f3 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.h
@@ -11,9 +11,7 @@
 
 int mtk_vcodec_init_dec_clk(struct platform_device *pdev, struct mtk_vcodec_pm *pm);
 
-int mtk_vcodec_dec_pw_on(struct mtk_vcodec_dev *vdec_dev, int hw_idx);
-void mtk_vcodec_dec_pw_off(struct mtk_vcodec_dev *vdec_dev, int hw_idx);
-void mtk_vcodec_dec_clock_on(struct mtk_vcodec_dev *vdec_dev, int hw_idx);
-void mtk_vcodec_dec_clock_off(struct mtk_vcodec_dev *vdec_dev, int hw_idx);
+void mtk_vcodec_dec_enable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx);
+void mtk_vcodec_dec_disable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx);
 
 #endif /* _MTK_VCODEC_DEC_PM_H_ */
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index 05a5b240e906..c93dd0ea3537 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -38,11 +38,9 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 		return -EINVAL;
 	}
 
-	mtk_vdec_lock(ctx);
-	mtk_vcodec_dec_clock_on(ctx->dev, ctx->hw_id);
+	mtk_vcodec_dec_enable_hardware(ctx, ctx->hw_id);
 	ret = ctx->dec_if->init(ctx);
-	mtk_vcodec_dec_clock_off(ctx->dev, ctx->hw_id);
-	mtk_vdec_unlock(ctx);
+	mtk_vcodec_dec_disable_hardware(ctx, ctx->hw_id);
 
 	return ret;
 }
@@ -70,15 +68,11 @@ int vdec_if_decode(struct mtk_vcodec_ctx *ctx, struct mtk_vcodec_mem *bs,
 	if (!ctx->drv_handle)
 		return -EIO;
 
-	mtk_vdec_lock(ctx);
-
+	mtk_vcodec_dec_enable_hardware(ctx, ctx->hw_id);
 	mtk_vcodec_set_curr_ctx(ctx->dev, ctx, ctx->hw_id);
-	mtk_vcodec_dec_clock_on(ctx->dev, ctx->hw_id);
 	ret = ctx->dec_if->decode(ctx->drv_handle, bs, fb, res_chg);
-	mtk_vcodec_dec_clock_off(ctx->dev, ctx->hw_id);
 	mtk_vcodec_set_curr_ctx(ctx->dev, NULL, ctx->hw_id);
-
-	mtk_vdec_unlock(ctx);
+	mtk_vcodec_dec_disable_hardware(ctx, ctx->hw_id);
 
 	return ret;
 }
@@ -103,11 +97,9 @@ void vdec_if_deinit(struct mtk_vcodec_ctx *ctx)
 	if (!ctx->drv_handle)
 		return;
 
-	mtk_vdec_lock(ctx);
-	mtk_vcodec_dec_clock_on(ctx->dev, ctx->hw_id);
+	mtk_vcodec_dec_enable_hardware(ctx, ctx->hw_id);
 	ctx->dec_if->deinit(ctx->drv_handle);
-	mtk_vcodec_dec_clock_off(ctx->dev, ctx->hw_id);
-	mtk_vdec_unlock(ctx);
+	mtk_vcodec_dec_disable_hardware(ctx, ctx->hw_id);
 
 	ctx->drv_handle = NULL;
 }
diff --git a/drivers/media/platform/mtk-vcodec/vdec_msg_queue.c b/drivers/media/platform/mtk-vcodec/vdec_msg_queue.c
index 4b062a8128b4..ae500980ad45 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_msg_queue.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_msg_queue.c
@@ -212,11 +212,13 @@ static void vdec_msg_queue_core_work(struct work_struct *work)
 		return;
 
 	ctx = lat_buf->ctx;
+	mtk_vcodec_dec_enable_hardware(ctx, MTK_VDEC_CORE);
 	mtk_vcodec_set_curr_ctx(dev, ctx, MTK_VDEC_CORE);
 
 	lat_buf->core_decode(lat_buf);
 
 	mtk_vcodec_set_curr_ctx(dev, NULL, MTK_VDEC_CORE);
+	mtk_vcodec_dec_disable_hardware(ctx, MTK_VDEC_CORE);
 	vdec_msg_queue_qbuf(&ctx->msg_queue.lat_ctx, lat_buf);
 
 	if (!list_empty(&ctx->msg_queue.lat_ctx.ready_queue)) {
-- 
2.25.1



* [PATCH v7, 02/15] media: mtk-vcodec: Using firmware type to separate different firmware architecture
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
  2022-02-23  3:39 ` [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-02-23  3:39 ` [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp Yunfei Dong
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC

The MT8173 platform uses vpu firmware, while mt8183/mt8192 use scp
firmware instead. The chip name is not a reasonable way to distinguish
firmware architectures; the firmware type is a better fit.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Reviewed-by: Tzung-Bi Shih <tzungbi@google.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
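As a rough userspace illustration of the idea (not driver code), the check
below keys off the firmware type rather than a chip identifier. The mock_*
names are hypothetical; only the VPU/SCP distinction and the "VPU firmware
has no version field" rule come from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two firmware types discussed above. */
enum mock_fw_type {
	MOCK_FW_VPU,	/* used by MT8173 */
	MOCK_FW_SCP,	/* used by mt8183/mt8192 */
};

struct mock_fw {
	enum mock_fw_type type;
};

/* Plays the role of mtk_vcodec_fw_get_type(). */
static enum mock_fw_type mock_fw_get_type(const struct mock_fw *fw)
{
	assert(fw != NULL);
	return fw->type;
}

/* The init ack carries a usable version field only on non-VPU firmware. */
static bool mock_fw_has_version_field(const struct mock_fw *fw)
{
	return mock_fw_get_type(fw) != MOCK_FW_VPU;
}
```

A new SoC that reuses scp firmware then needs no change to this check,
whereas a chip-based test would need one more case per chip.
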
 .../platform/mtk-vcodec/mtk_vcodec_dec_stateful.c   |  1 -
 .../platform/mtk-vcodec/mtk_vcodec_dec_stateless.c  |  2 --
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h  | 13 -------------
 .../media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c  |  5 -----
 drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c   |  6 ++++++
 drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h   |  1 +
 drivers/media/platform/mtk-vcodec/vdec_vpu_if.c     |  4 ++--
 drivers/media/platform/mtk-vcodec/venc_vpu_if.c     |  2 +-
 8 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 04ca43c77e5f..7966c132be8f 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -613,7 +613,6 @@ static struct vb2_ops mtk_vdec_frame_vb2_ops = {
 };
 
 const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
-	.chip = MTK_MT8173,
 	.init_vdec_params = mtk_init_vdec_params,
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 23d997ac114d..5aebf88f997b 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -343,7 +343,6 @@ static struct vb2_ops mtk_vdec_request_vb2_ops = {
 };
 
 const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
-	.chip = MTK_MT8183,
 	.init_vdec_params = mtk_init_vdec_params,
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
@@ -362,7 +361,6 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
 
 /* This platform data is used for one lat and one core architecture. */
 const struct mtk_vcodec_dec_pdata mtk_lat_sig_core_pdata = {
-	.chip = MTK_MT8192,
 	.init_vdec_params = mtk_init_vdec_params,
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 813901c4be5e..bb7b8e914d24 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -332,13 +332,6 @@ struct mtk_vcodec_ctx {
 	struct vdec_msg_queue msg_queue;
 };
 
-enum mtk_chip {
-	MTK_MT8173,
-	MTK_MT8183,
-	MTK_MT8192,
-	MTK_MT8195,
-};
-
 /*
  * enum mtk_vdec_hw_arch - Used to separate different hardware architecture
  */
@@ -364,7 +357,6 @@ enum mtk_vdec_hw_arch {
  * @vdec_framesizes: supported video decoder frame sizes
  * @num_framesizes: count of video decoder frame sizes
  *
- * @chip: chip this decoder is compatible with
  * @hw_arch: hardware arch is used to separate pure_sin_core and lat_sin_core
  *
  * @is_subdev_supported: whether support parent-node architecture(subdev)
@@ -387,7 +379,6 @@ struct mtk_vcodec_dec_pdata {
 	const struct mtk_codec_framesizes *vdec_framesizes;
 	const int num_framesizes;
 
-	enum mtk_chip chip;
 	enum mtk_vdec_hw_arch hw_arch;
 
 	bool is_subdev_supported;
@@ -397,8 +388,6 @@ struct mtk_vcodec_dec_pdata {
 /**
  * struct mtk_vcodec_enc_pdata - compatible data for each IC
  *
- * @chip: chip this encoder is compatible with
- *
  * @uses_ext: whether the encoder uses the extended firmware messaging format
  * @min_bitrate: minimum supported encoding bitrate
  * @max_bitrate: maximum supported encoding bitrate
@@ -409,8 +398,6 @@ struct mtk_vcodec_dec_pdata {
  * @core_id: stand for h264 or vp8 encode index
  */
 struct mtk_vcodec_enc_pdata {
-	enum mtk_chip chip;
-
 	bool uses_ext;
 	unsigned long min_bitrate;
 	unsigned long max_bitrate;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
index e21487341d8b..65207f5b6c1c 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
@@ -377,7 +377,6 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 }
 
 static const struct mtk_vcodec_enc_pdata mt8173_avc_pdata = {
-	.chip = MTK_MT8173,
 	.capture_formats = mtk_video_formats_capture_h264,
 	.num_capture_formats = ARRAY_SIZE(mtk_video_formats_capture_h264),
 	.output_formats = mtk_video_formats_output,
@@ -388,7 +387,6 @@ static const struct mtk_vcodec_enc_pdata mt8173_avc_pdata = {
 };
 
 static const struct mtk_vcodec_enc_pdata mt8173_vp8_pdata = {
-	.chip = MTK_MT8173,
 	.capture_formats = mtk_video_formats_capture_vp8,
 	.num_capture_formats = ARRAY_SIZE(mtk_video_formats_capture_vp8),
 	.output_formats = mtk_video_formats_output,
@@ -399,7 +397,6 @@ static const struct mtk_vcodec_enc_pdata mt8173_vp8_pdata = {
 };
 
 static const struct mtk_vcodec_enc_pdata mt8183_pdata = {
-	.chip = MTK_MT8183,
 	.uses_ext = true,
 	.capture_formats = mtk_video_formats_capture_h264,
 	.num_capture_formats = ARRAY_SIZE(mtk_video_formats_capture_h264),
@@ -411,7 +408,6 @@ static const struct mtk_vcodec_enc_pdata mt8183_pdata = {
 };
 
 static const struct mtk_vcodec_enc_pdata mt8192_pdata = {
-	.chip = MTK_MT8192,
 	.uses_ext = true,
 	.capture_formats = mtk_video_formats_capture_h264,
 	.num_capture_formats = ARRAY_SIZE(mtk_video_formats_capture_h264),
@@ -423,7 +419,6 @@ static const struct mtk_vcodec_enc_pdata mt8192_pdata = {
 };
 
 static const struct mtk_vcodec_enc_pdata mt8195_pdata = {
-	.chip = MTK_MT8195,
 	.uses_ext = true,
 	.capture_formats = mtk_video_formats_capture_h264,
 	.num_capture_formats = ARRAY_SIZE(mtk_video_formats_capture_h264),
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c
index 94b39ae5c2e1..556e54aadac9 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.c
@@ -65,3 +65,9 @@ int mtk_vcodec_fw_ipi_send(struct mtk_vcodec_fw *fw, int id, void *buf,
 	return fw->ops->ipi_send(fw, id, buf, len, wait);
 }
 EXPORT_SYMBOL_GPL(mtk_vcodec_fw_ipi_send);
+
+int mtk_vcodec_fw_get_type(struct mtk_vcodec_fw *fw)
+{
+	return fw->type;
+}
+EXPORT_SYMBOL_GPL(mtk_vcodec_fw_get_type);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h
index 539bb626772c..acd355961e3a 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_fw.h
@@ -39,5 +39,6 @@ int mtk_vcodec_fw_ipi_register(struct mtk_vcodec_fw *fw, int id,
 			       const char *name, void *priv);
 int mtk_vcodec_fw_ipi_send(struct mtk_vcodec_fw *fw, int id,
 			   void *buf, unsigned int len, unsigned int wait);
+int mtk_vcodec_fw_get_type(struct mtk_vcodec_fw *fw);
 
 #endif /* _MTK_VCODEC_FW_H_ */
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index dd35d2c5f920..7210061c772f 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -33,8 +33,8 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
 	 */
 	vpu->inst_id = 0xdeadbeef;
 
-	/* Firmware version field does not exist on MT8173. */
-	if (vpu->ctx->dev->vdec_pdata->chip == MTK_MT8173)
+	/* VPU firmware does not contain a version field. */
+	if (mtk_vcodec_fw_get_type(vpu->ctx->dev->fw_handler) == VPU)
 		return;
 
 	/* Check firmware version. */
diff --git a/drivers/media/platform/mtk-vcodec/venc_vpu_if.c b/drivers/media/platform/mtk-vcodec/venc_vpu_if.c
index e7899d8a3e4e..d3570c4c177d 100644
--- a/drivers/media/platform/mtk-vcodec/venc_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/venc_vpu_if.c
@@ -18,7 +18,7 @@ static void handle_enc_init_msg(struct venc_vpu_inst *vpu, const void *data)
 					     msg->vpu_inst_addr);
 
 	/* Firmware version field value is unspecified on MT8173. */
-	if (vpu->ctx->dev->venc_pdata->chip == MTK_MT8173)
+	if (mtk_vcodec_fw_get_type(vpu->ctx->dev->fw_handler) == VPU)
 		return;
 
 	/* Check firmware version. */
-- 
2.25.1



* [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
  2022-02-23  3:39 ` [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers Yunfei Dong
  2022-02-23  3:39 ` [PATCH v7, 02/15] media: mtk-vcodec: Using firmware type to separate different firmware architecture Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-03-01 14:44   ` Nicolas Dufresne
  2022-02-23  3:39 ` [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability Yunfei Dong
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC

Different capture buffer formats have different buffer sizes, so the
real buffer size must be obtained from scp according to the buffer type.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
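The request/ack exchange added here can be sketched in userspace as follows.
This is only an illustration under stated assumptions: the mock_* structs
copy the field layout from this patch, but the value of
MOCK_GET_PARAM_PIC_INFO is made up and no IPI transport is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MOCK_AP_IPIMSG_DEC_GET_PARAM	0xA007	/* value from this patch */
#define MOCK_GET_PARAM_PIC_INFO		0	/* hypothetical enum value */
#define MOCK_PARAM_WORDS		4	/* data[4] in both messages */

struct mock_get_param {
	uint32_t msg_id;
	uint32_t inst_id;
	uint32_t data[MOCK_PARAM_WORDS];
	uint32_t param_type;
	uint32_t codec_type;
};

struct mock_get_param_ack {
	uint32_t msg_id;
	int32_t status;
	uint64_t ap_inst_addr;
	uint32_t data[MOCK_PARAM_WORDS];
	uint32_t param_type;
	uint32_t reserved;
};

/* Hypothetical subset of the per-instance state that the ack updates. */
struct mock_inst {
	uint32_t fb_sz[2];	/* frame buffer size of each plane */
	int failure;
};

/* Build the request; reject payloads longer than the data[] array. */
static int mock_build_get_param(struct mock_get_param *msg,
				const uint32_t *data, unsigned int len,
				uint32_t param_type)
{
	assert(msg && data);
	if (len > MOCK_PARAM_WORDS)
		return -EINVAL;

	memset(msg, 0, sizeof(*msg));
	msg->msg_id = MOCK_AP_IPIMSG_DEC_GET_PARAM;
	memcpy(msg->data, data, sizeof(msg->data[0]) * len);
	msg->param_type = param_type;
	return 0;
}

/* Handle the ack: copy the per-plane sizes, or flag an unknown type. */
static void mock_handle_get_param_ack(struct mock_inst *inst,
				      const struct mock_get_param_ack *ack)
{
	switch (ack->param_type) {
	case MOCK_GET_PARAM_PIC_INFO:
		inst->fb_sz[0] = ack->data[0];
		inst->fb_sz[1] = ack->data[1];
		break;
	default:
		inst->failure = 1;
		break;
	}
}
```

The length check before the copy is what keeps a caller from overrunning
the fixed-size data[] array in the message.
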
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  | 36 ++++++++++++++
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 49 +++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_vpu_if.h   | 15 ++++++
 3 files changed, 100 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
index bf54d6d9a857..47070be2a991 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
@@ -20,6 +20,7 @@ enum vdec_ipi_msgid {
 	AP_IPIMSG_DEC_RESET = 0xA004,
 	AP_IPIMSG_DEC_CORE = 0xA005,
 	AP_IPIMSG_DEC_CORE_END = 0xA006,
+	AP_IPIMSG_DEC_GET_PARAM = 0xA007,
 
 	VPU_IPIMSG_DEC_INIT_ACK = 0xB000,
 	VPU_IPIMSG_DEC_START_ACK = 0xB001,
@@ -28,6 +29,7 @@ enum vdec_ipi_msgid {
 	VPU_IPIMSG_DEC_RESET_ACK = 0xB004,
 	VPU_IPIMSG_DEC_CORE_ACK = 0xB005,
 	VPU_IPIMSG_DEC_CORE_END_ACK = 0xB006,
+	VPU_IPIMSG_DEC_GET_PARAM_ACK = 0xB007,
 };
 
 /**
@@ -114,4 +116,38 @@ struct vdec_vpu_ipi_init_ack {
 	uint32_t inst_id;
 };
 
+/**
+ * struct vdec_ap_ipi_get_param - for AP_IPIMSG_DEC_GET_PARAM
+ * @msg_id	: AP_IPIMSG_DEC_GET_PARAM
+ * @inst_id     : instance ID. Used if the ABI version >= 2.
+ * @data	: picture information
+ * @param_type	: get param type
+ * @codec_type	: Codec fourcc
+ */
+struct vdec_ap_ipi_get_param {
+	u32 msg_id;
+	u32 inst_id;
+	u32 data[4];
+	u32 param_type;
+	u32 codec_type;
+};
+
+/**
+ * struct vdec_vpu_ipi_get_param_ack - for VPU_IPIMSG_DEC_GET_PARAM_ACK
+ * @msg_id	: VPU_IPIMSG_DEC_GET_PARAM_ACK
+ * @status	: VPU execution result
+ * @ap_inst_addr	: AP vcodec_vpu_inst instance address
+ * @data     : picture information from SCP.
+ * @param_type	: get param type
+ * @reserved : reserved param
+ */
+struct vdec_vpu_ipi_get_param_ack {
+	u32 msg_id;
+	s32 status;
+	u64 ap_inst_addr;
+	u32 data[4];
+	u32 param_type;
+	u32 reserved;
+};
+
 #endif
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index 7210061c772f..35f4d5583084 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -6,6 +6,7 @@
 
 #include "mtk_vcodec_drv.h"
 #include "mtk_vcodec_util.h"
+#include "vdec_drv_if.h"
 #include "vdec_ipi_msg.h"
 #include "vdec_vpu_if.h"
 #include "mtk_vcodec_fw.h"
@@ -54,6 +55,26 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
 	}
 }
 
+static void handle_get_param_msg_ack(const struct vdec_vpu_ipi_get_param_ack *msg)
+{
+	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
+					(unsigned long)msg->ap_inst_addr;
+
+	mtk_vcodec_debug(vpu, "+ ap_inst_addr = 0x%llx", msg->ap_inst_addr);
+
+	/* param_type is enum vdec_get_param_type */
+	switch (msg->param_type) {
+	case GET_PARAM_PIC_INFO:
+		vpu->fb_sz[0] = msg->data[0];
+		vpu->fb_sz[1] = msg->data[1];
+		break;
+	default:
+		mtk_vcodec_err(vpu, "invalid get param type=%d", msg->param_type);
+		vpu->failure = 1;
+		break;
+	}
+}
+
 /*
  * vpu_dec_ipi_handler - Handler for VPU ipi message.
  *
@@ -89,6 +110,9 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
 		case VPU_IPIMSG_DEC_CORE_END_ACK:
 			break;
 
+		case VPU_IPIMSG_DEC_GET_PARAM_ACK:
+			handle_get_param_msg_ack(data);
+			break;
 		default:
 			mtk_vcodec_err(vpu, "invalid msg=%X", msg->msg_id);
 			break;
@@ -217,6 +241,31 @@ int vpu_dec_start(struct vdec_vpu_inst *vpu, uint32_t *data, unsigned int len)
 	return err;
 }
 
+int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
+		      unsigned int len, unsigned int param_type)
+{
+	struct vdec_ap_ipi_get_param msg;
+	int err;
+
+	mtk_vcodec_debug_enter(vpu);
+
+	if (len > ARRAY_SIZE(msg.data)) {
+		mtk_vcodec_err(vpu, "invalid len = %u", len);
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(msg));
+	msg.msg_id = AP_IPIMSG_DEC_GET_PARAM;
+	msg.inst_id = vpu->inst_id;
+	memcpy(msg.data, data, sizeof(unsigned int) * len);
+	msg.param_type = param_type;
+	msg.codec_type = vpu->codec_type;
+
+	err = vcodec_vpu_send_msg(vpu, (void *)&msg, sizeof(msg));
+	mtk_vcodec_debug(vpu, "- ret=%d", err);
+	return err;
+}
+
 int vpu_dec_core(struct vdec_vpu_inst *vpu)
 {
 	return vcodec_send_ap_ipi(vpu, AP_IPIMSG_DEC_CORE);
diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
index 4cb3c7f5a3ad..d1feba41dd39 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
@@ -28,6 +28,8 @@ struct mtk_vcodec_ctx;
  * @wq          : wait queue to wait VPU message ack
  * @handler     : ipi handler for each decoder
  * @codec_type     : use codec type to separate different codecs
+ * @capture_type    : use capture type to separate different capture formats
+ * @fb_sz  : frame buffer size of each plane
  */
 struct vdec_vpu_inst {
 	int id;
@@ -42,6 +44,8 @@ struct vdec_vpu_inst {
 	wait_queue_head_t wq;
 	mtk_vcodec_ipi_handler handler;
 	unsigned int codec_type;
+	unsigned int capture_type;
+	unsigned int fb_sz[2];
 };
 
 /**
@@ -104,4 +108,15 @@ int vpu_dec_core(struct vdec_vpu_inst *vpu);
  */
 int vpu_dec_core_end(struct vdec_vpu_inst *vpu);
 
+/**
+ * vpu_dec_get_param - get param from scp
+ *
+ * @vpu : instance for vdec_vpu_inst
+ * @data: meta data to pass bitstream info to VPU decoder
+ * @len : meta data length
+ * @param_type : get param type
+ */
+int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
+		      unsigned int len, unsigned int param_type);
+
 #endif
-- 
2.25.1
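
Note on vpu_dec_get_param() above: the bounds check on len is what keeps
the memcpy() inside data[4]. A minimal user-space sketch of that step
follows. The struct and function names prefixed with sketch_ are
hypothetical stand-ins, not driver API, and the message is only filled
in, not sent.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DATA_WORDS 4	/* mirrors u32 data[4] in the patch */

/* Hypothetical stand-in for struct vdec_ap_ipi_get_param. */
struct sketch_get_param_msg {
	uint32_t msg_id;
	uint32_t inst_id;
	uint32_t data[SKETCH_DATA_WORDS];
	uint32_t param_type;
	uint32_t codec_type;
};

/*
 * Fill a get-param request. Payloads longer than data[] are rejected
 * before anything is copied. Returns 0 on success or -EINVAL.
 */
static int sketch_build_get_param(struct sketch_get_param_msg *msg,
				  uint32_t msg_id, uint32_t inst_id,
				  const uint32_t *data, unsigned int len,
				  uint32_t param_type, uint32_t codec_type)
{
	if (len > SKETCH_DATA_WORDS)
		return -EINVAL;

	/* Zero first so unused data[] words and padding are well defined. */
	memset(msg, 0, sizeof(*msg));
	msg->msg_id = msg_id;
	msg->inst_id = inst_id;
	if (len)
		memcpy(msg->data, data, sizeof(msg->data[0]) * len);
	msg->param_type = param_type;
	msg->codec_type = codec_type;
	return 0;
}
```

Sizing the copy with sizeof(msg->data[0]) ties it to the destination
element type rather than to sizeof(unsigned int).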


* [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (2 preceding siblings ...)
  2022-02-23  3:39 ` [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-02-25  9:23   ` AngeloGioacchino Del Regno
  2022-02-28 21:29   ` Nicolas Dufresne
  2022-02-23  3:39 ` [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered Yunfei Dong
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

The maximum supported resolution differs between platforms (2K or 4K),
so read it from dec_capability.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Reviewed-by: Tzung-Bi Shih <tzungbi@google.com>
---
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 29 +++++++++++--------
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 +++
 2 files changed, 21 insertions(+), 12 deletions(-)
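
Illustration only, not part of the patch: a minimal user-space sketch of
the per-context clamp that vidioc_try_fmt() performs once the limits
live in the context. The sketch_ names and the numeric minimum values
are assumptions; the real MTK_VDEC_MIN_* and MTK_VDEC_MAX_* values are
not shown in this patch.

```c
/* Assumed minimums for the sketch; not the driver's actual values. */
#define SKETCH_MIN_W 64u
#define SKETCH_MIN_H 64u

/* Hypothetical stand-in for the two fields added to mtk_vcodec_ctx. */
struct sketch_ctx {
	unsigned int max_width;
	unsigned int max_height;
};

static unsigned int sketch_clamp(unsigned int v, unsigned int lo,
				 unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Clamp a requested size to the limits stored in this context. */
static void sketch_try_size(const struct sketch_ctx *ctx,
			    unsigned int *width, unsigned int *height)
{
	*width = sketch_clamp(*width, SKETCH_MIN_W, ctx->max_width);
	*height = sketch_clamp(*height, SKETCH_MIN_H, ctx->max_height);
}
```

With a context limited to 1920x1088, a 4096x2176 request is reduced to
1920x1088, while a context that allows 4096x2304 leaves it unchanged.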

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 130ecef2e766..304f5afbd419 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -152,13 +152,15 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
 	q_data->coded_height = DFT_CFG_HEIGHT;
 	q_data->fmt = ctx->dev->vdec_pdata->default_cap_fmt;
 	q_data->field = V4L2_FIELD_NONE;
+	ctx->max_width = MTK_VDEC_MAX_W;
+	ctx->max_height = MTK_VDEC_MAX_H;
 
 	v4l_bound_align_image(&q_data->coded_width,
 				MTK_VDEC_MIN_W,
-				MTK_VDEC_MAX_W, 4,
+				ctx->max_width, 4,
 				&q_data->coded_height,
 				MTK_VDEC_MIN_H,
-				MTK_VDEC_MAX_H, 5, 6);
+				ctx->max_height, 5, 6);
 
 	q_data->sizeimage[0] = q_data->coded_width * q_data->coded_height;
 	q_data->bytesperline[0] = q_data->coded_width;
@@ -217,7 +219,7 @@ static int vidioc_vdec_subscribe_evt(struct v4l2_fh *fh,
 	}
 }
 
-static int vidioc_try_fmt(struct v4l2_format *f,
+static int vidioc_try_fmt(struct mtk_vcodec_ctx *ctx, struct v4l2_format *f,
 			  const struct mtk_video_fmt *fmt)
 {
 	struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
@@ -225,9 +227,9 @@ static int vidioc_try_fmt(struct v4l2_format *f,
 	pix_fmt_mp->field = V4L2_FIELD_NONE;
 
 	pix_fmt_mp->width =
-		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W, MTK_VDEC_MAX_W);
+		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W, ctx->max_width);
 	pix_fmt_mp->height =
-		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H, MTK_VDEC_MAX_H);
+		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H, ctx->max_height);
 
 	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 		pix_fmt_mp->num_planes = 1;
@@ -245,16 +247,16 @@ static int vidioc_try_fmt(struct v4l2_format *f,
 		tmp_h = pix_fmt_mp->height;
 		v4l_bound_align_image(&pix_fmt_mp->width,
 					MTK_VDEC_MIN_W,
-					MTK_VDEC_MAX_W, 6,
+					ctx->max_width, 6,
 					&pix_fmt_mp->height,
 					MTK_VDEC_MIN_H,
-					MTK_VDEC_MAX_H, 6, 9);
+					ctx->max_height, 6, 9);
 
 		if (pix_fmt_mp->width < tmp_w &&
-			(pix_fmt_mp->width + 64) <= MTK_VDEC_MAX_W)
+			(pix_fmt_mp->width + 64) <= ctx->max_width)
 			pix_fmt_mp->width += 64;
 		if (pix_fmt_mp->height < tmp_h &&
-			(pix_fmt_mp->height + 64) <= MTK_VDEC_MAX_H)
+			(pix_fmt_mp->height + 64) <= ctx->max_height)
 			pix_fmt_mp->height += 64;
 
 		mtk_v4l2_debug(0,
@@ -294,7 +296,7 @@ static int vidioc_try_fmt_vid_cap_mplane(struct file *file, void *priv,
 		fmt = mtk_vdec_find_format(f, dec_pdata);
 	}
 
-	return vidioc_try_fmt(f, fmt);
+	return vidioc_try_fmt(ctx, f, fmt);
 }
 
 static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
@@ -317,7 +319,7 @@ static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
 		return -EINVAL;
 	}
 
-	return vidioc_try_fmt(f, fmt);
+	return vidioc_try_fmt(ctx, f, fmt);
 }
 
 static int vidioc_vdec_g_selection(struct file *file, void *priv,
@@ -445,7 +447,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		return -EINVAL;
 
 	q_data->fmt = fmt;
-	vidioc_try_fmt(f, q_data->fmt);
+	vidioc_try_fmt(ctx, f, q_data->fmt);
 	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
 		q_data->sizeimage[0] = pix_mp->plane_fmt[0].sizeimage;
 		q_data->coded_width = pix_mp->width;
@@ -545,6 +547,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 				fsize->stepwise.min_height,
 				fsize->stepwise.max_height,
 				fsize->stepwise.step_height);
+
+		ctx->max_width = fsize->stepwise.max_width;
+		ctx->max_height = fsize->stepwise.max_height;
 		return 0;
 	}
 
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index bb7b8e914d24..6d27e4d41ede 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -284,6 +284,8 @@ struct vdec_pic_info {
  *	  mtk_video_dec_buf.
  * @hw_id: hardware index used to identify different hardware.
  *
+ * @max_width: hardware supported max width
+ * @max_height: hardware supported max height
  * @msg_queue: msg queue used to store lat buffer information.
  */
 struct mtk_vcodec_ctx {
@@ -329,6 +331,8 @@ struct mtk_vcodec_ctx {
 	struct mutex lock;
 	int hw_id;
 
+	unsigned int max_width;
+	unsigned int max_height;
 	struct vdec_msg_queue msg_queue;
 };
 
-- 
2.25.1


* [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (3 preceding siblings ...)
  2022-02-23  3:39 ` [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-03-01 18:50   ` Nicolas Dufresne
  2022-02-23  3:39 ` [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow Yunfei Dong
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

lat thread: output queue      \
                               -> lat hardware -> lat trans buffer
            lat trans buffer  /

core thread: capture queue     \
                                ->core hardware -> capture queue
             lat trans buffer  /

Lat and core run in different threads, so mark the capture queue as
buffered.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c | 3 +++
 1 file changed, 3 insertions(+)
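
Illustration only, not part of the patch: a simplified user-space model
of why the buffered flag matters. It assumes the mem2mem core treats a
job as runnable only when each queue either has a ready buffer or is
marked buffered. The sketch_ types are hypothetical and do not mirror
the real v4l2_m2m structures.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of one m2m queue. */
struct sketch_queue {
	unsigned int num_rdy;	/* buffers ready on this queue */
	bool buffered;		/* queue may be empty when a job starts */
};

/*
 * A job can be scheduled when neither queue blocks it. A buffered queue
 * never blocks, even when it currently holds no ready buffer.
 */
static bool sketch_job_ready(const struct sketch_queue *src,
			     const struct sketch_queue *dst)
{
	if (!src->num_rdy && !src->buffered)
		return false;
	if (!dst->num_rdy && !dst->buffered)
		return false;
	return true;
}
```

Under this model the lat stage can start on an output buffer alone, and
the core stage picks up a capture buffer later.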

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 5aebf88f997b..23a154c4e321 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -314,6 +314,9 @@ static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
 	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
 				 V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
 
+	if (ctx->dev->vdec_pdata->hw_arch != MTK_VDEC_PURE_SINGLE_CORE)
+		v4l2_m2m_set_dst_buffered(ctx->m2m_ctx, 1);
+
 	/* Support request api for output plane */
 	src_vq->supports_requests = true;
 	src_vq->requires_requests = true;
-- 
2.25.1


* [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (4 preceding siblings ...)
  2022-02-23  3:39 ` [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered Yunfei Dong
@ 2022-02-23  3:39 ` Yunfei Dong
  2022-03-01 19:00   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes Yunfei Dong
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:39 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

To let lat and core decode in parallel, the capture buffer must be
fetched when core starts decoding and moved to the display list when
core finishes decoding.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 121 ++++++++++++------
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   5 +-
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        |  16 ++-
 3 files changed, 102 insertions(+), 40 deletions(-)
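
Illustration only, not part of the patch: a user-space sketch of the
get/put callback pair. In vdec_h264_slice_decode() as posted,
dst_buf_info is derived from fb with container_of() before fb is checked
for NULL, so the sketch shows one possible guard. All sketch_ names are
hypothetical, and returning -EAGAIN on an empty capture queue is a
choice made for the sketch, not something the patch does.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_ctx;

/* Hypothetical stand-in for struct vdec_fb. */
struct sketch_fb {
	int index;
};

/* Mirrors the two callbacks added to struct mtk_vcodec_dec_pdata. */
struct sketch_ops {
	struct sketch_fb *(*get_cap_buffer)(struct sketch_ctx *ctx);
	void (*cap_to_disp)(struct sketch_ctx *ctx, struct sketch_fb *fb,
			    int error);
};

struct sketch_ctx {
	const struct sketch_ops *ops;
	struct sketch_fb *next_dst;	/* NULL when the capture queue is empty */
	unsigned int done;
	unsigned int failed;
};

static struct sketch_fb *sketch_get_cap_buffer(struct sketch_ctx *ctx)
{
	return ctx->next_dst;
}

static void sketch_cap_to_disp(struct sketch_ctx *ctx, struct sketch_fb *fb,
			       int error)
{
	if (!fb)
		return;
	ctx->next_dst = NULL;	/* buffer leaves the capture queue */
	if (error)
		ctx->failed++;
	else
		ctx->done++;
}

static const struct sketch_ops sketch_default_ops = {
	.get_cap_buffer = sketch_get_cap_buffer,
	.cap_to_disp = sketch_cap_to_disp,
};

/* Fetch the capture buffer at decode time and return it when done. */
static int sketch_decode(struct sketch_ctx *ctx)
{
	struct sketch_fb *fb = ctx->ops->get_cap_buffer(ctx);

	if (!fb)
		return -EAGAIN;	/* check before touching the buffer */

	/* ... hardware decode into fb would happen here ... */

	ctx->ops->cap_to_disp(ctx, fb, 0);
	return 0;
}
```

Routing both steps through the ops table keeps the codec code
independent of how the capture queue is implemented.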

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 23a154c4e321..6d481410bf89 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -108,37 +108,87 @@ static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
 
 #define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
 
-static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
-					       struct vdec_fb *fb)
+static void mtk_vdec_stateless_out_to_done(struct mtk_vcodec_ctx *ctx,
+					   struct mtk_vcodec_mem *bs, int error)
 {
-	struct mtk_video_dec_buf *vdec_frame_buf =
-		container_of(fb, struct mtk_video_dec_buf, frame_buffer);
-	struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
-	unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
+	struct mtk_video_dec_buf *out_buf;
+	struct vb2_v4l2_buffer *vb;
 
-	vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
-	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
-		unsigned int cap_c_size =
-			ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
+	if (!bs) {
+		mtk_v4l2_err("Free bitstream buffer fail.");
+		return;
+	}
+	out_buf = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+	vb = &out_buf->m2m_buf.vb;
 
-		vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
+	mtk_v4l2_debug(2, "Free bitstream buffer id = %d to done_list",
+		       vb->vb2_buf.index);
+
+	v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+	if (error) {
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR);
+		if (error == -EIO)
+			out_buf->error = true;
+	} else {
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
 	}
 }
 
-static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
-					   struct vb2_v4l2_buffer *vb2_v4l2)
+static void mtk_vdec_stateless_cap_to_disp(struct mtk_vcodec_ctx *ctx,
+					   struct vdec_fb *fb, int error)
 {
-	struct mtk_video_dec_buf *framebuf =
-		container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
-	struct vdec_fb *pfb = &framebuf->frame_buffer;
-	struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
+	struct mtk_video_dec_buf *vdec_frame_buf;
+	struct vb2_v4l2_buffer *vb;
+	unsigned int cap_y_size, cap_c_size;
+
+	if (!fb) {
+		mtk_v4l2_err("Free frame buffer fail.");
+		return;
+	}
+	vdec_frame_buf = container_of(fb, struct mtk_video_dec_buf,
+				      frame_buffer);
+	vb = &vdec_frame_buf->m2m_buf.vb;
+
+	cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
+	cap_c_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
+
+	v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
 
-	pfb->base_y.va = NULL;
+	vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
+	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+		vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
+
+	mtk_v4l2_debug(2, "Free frame buffer id = %d to done_list",
+		       vb->vb2_buf.index);
+	if (error)
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR);
+	else
+		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
+}
+
+static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx)
+{
+	struct mtk_video_dec_buf *framebuf;
+	struct vb2_v4l2_buffer *vb2_v4l2;
+	struct vb2_buffer *dst_buf;
+	struct vdec_fb *pfb;
+
+	vb2_v4l2 = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
+	if (!vb2_v4l2) {
+		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
+		return NULL;
+	}
+
+	dst_buf = &vb2_v4l2->vb2_buf;
+	framebuf = container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
+
+	pfb = &framebuf->frame_buffer;
+	pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
 	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
 	pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
 
 	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
-		pfb->base_c.va = NULL;
+		pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
 		pfb->base_c.dma_addr =
 			vb2_dma_contig_plane_dma_addr(dst_buf, 1);
 		pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
@@ -162,12 +212,11 @@ static void mtk_vdec_worker(struct work_struct *work)
 	struct mtk_vcodec_ctx *ctx =
 		container_of(work, struct mtk_vcodec_ctx, decode_work);
 	struct mtk_vcodec_dev *dev = ctx->dev;
-	struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
+	struct vb2_v4l2_buffer *vb2_v4l2_src;
 	struct vb2_buffer *vb2_src;
 	struct mtk_vcodec_mem *bs_src;
 	struct mtk_video_dec_buf *dec_buf_src;
 	struct media_request *src_buf_req;
-	struct vdec_fb *dst_buf;
 	bool res_chg = false;
 	int ret;
 
@@ -178,13 +227,6 @@ static void mtk_vdec_worker(struct work_struct *work)
 		return;
 	}
 
-	vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
-	if (!vb2_v4l2_dst) {
-		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-		mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
-		return;
-	}
-
 	vb2_src = &vb2_v4l2_src->vb2_buf;
 	dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
 				   m2m_buf.vb);
@@ -193,9 +235,15 @@ static void mtk_vdec_worker(struct work_struct *work)
 	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p", ctx->id,
 		       vb2_src->vb2_queue->type, vb2_src->index, vb2_src);
 
-	bs_src->va = NULL;
+	bs_src->va = vb2_plane_vaddr(vb2_src, 0);
 	bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
 	bs_src->size = (size_t)vb2_src->planes[0].bytesused;
+	if (!bs_src->va) {
+		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+		mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
+			     vb2_src->index);
+		return;
+	}
 
 	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
 		       ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size, vb2_src);
@@ -206,9 +254,7 @@ static void mtk_vdec_worker(struct work_struct *work)
 	else
 		mtk_v4l2_err("vb2 buffer media request is NULL");
 
-	dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
-	v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
-	ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
+	ret = vdec_if_decode(ctx, bs_src, NULL, &res_chg);
 	if (ret) {
 		mtk_v4l2_err(" <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
 			     ctx->id, vb2_src->index, bs_src->size,
@@ -220,12 +266,9 @@ static void mtk_vdec_worker(struct work_struct *work)
 		}
 	}
 
-	mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
-
-	v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
-					 ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
-
+	mtk_vdec_stateless_out_to_done(ctx, bs_src, ret);
 	v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);
+	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
 }
 
 static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
@@ -358,6 +401,8 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
 	.uses_stateless_api = true,
 	.worker = mtk_vdec_worker,
 	.flush_decoder = mtk_vdec_flush_decoder,
+	.cap_to_disp = mtk_vdec_stateless_cap_to_disp,
+	.get_cap_buffer = vdec_get_cap_buffer,
 	.is_subdev_supported = false,
 	.hw_arch = MTK_VDEC_PURE_SINGLE_CORE,
 };
@@ -376,6 +421,8 @@ const struct mtk_vcodec_dec_pdata mtk_lat_sig_core_pdata = {
 	.uses_stateless_api = true,
 	.worker = mtk_vdec_worker,
 	.flush_decoder = mtk_vdec_flush_decoder,
+	.cap_to_disp = mtk_vdec_stateless_cap_to_disp,
+	.get_cap_buffer = vdec_get_cap_buffer,
 	.is_subdev_supported = true,
 	.hw_arch = MTK_VDEC_LAT_SINGLE_CORE,
 };
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 6d27e4d41ede..9fcaf69549dd 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -350,7 +350,8 @@ enum mtk_vdec_hw_arch {
  * @ctrls_setup: init vcodec dec ctrls
  * @worker: worker to start a decode job
  * @flush_decoder: function that flushes the decoder
- *
+ * @get_cap_buffer: get capture buffer from capture queue
+ * @cap_to_disp: put capture buffer to disp list
  * @vdec_vb2_ops: struct vb2_ops
  *
  * @vdec_formats: supported video decoder formats
@@ -372,6 +373,8 @@ struct mtk_vcodec_dec_pdata {
 	int (*ctrls_setup)(struct mtk_vcodec_ctx *ctx);
 	void (*worker)(struct work_struct *work);
 	int (*flush_decoder)(struct mtk_vcodec_ctx *ctx);
+	struct vdec_fb *(*get_cap_buffer)(struct mtk_vcodec_ctx *ctx);
+	void (*cap_to_disp)(struct mtk_vcodec_ctx *ctx, struct vdec_fb *fb, int error);
 
 	struct vb2_ops *vdec_vb2_ops;
 
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
index 43542de11e9c..36f3dc1fbe3b 100644
--- a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
@@ -670,32 +670,42 @@ static void vdec_h264_slice_deinit(void *h_vdec)
 }
 
 static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
-				  struct vdec_fb *fb, bool *res_chg)
+				  struct vdec_fb *unused, bool *res_chg)
 {
 	struct vdec_h264_slice_inst *inst = h_vdec;
 	const struct v4l2_ctrl_h264_decode_params *dec_params =
 		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
 	struct vdec_vpu_inst *vpu = &inst->vpu;
+	struct mtk_video_dec_buf *src_buf_info;
+	struct mtk_video_dec_buf *dst_buf_info;
+	struct vdec_fb *fb;
 	u32 data[2];
 	u64 y_fb_dma;
 	u64 c_fb_dma;
 	int err;
 
+	inst->num_nalu++;
 	/* bs NULL means flush decoder */
 	if (!bs)
 		return vpu_dec_reset(vpu);
 
+	fb = inst->ctx->dev->vdec_pdata->get_cap_buffer(inst->ctx);
+	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+	dst_buf_info = container_of(fb, struct mtk_video_dec_buf, frame_buffer);
+
 	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
 	c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
 
 	mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
-			 ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
+			 inst->num_nalu, y_fb_dma, c_fb_dma, fb);
 
 	inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
 	inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
 	inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
 	inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
 
+	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
+				   &dst_buf_info->m2m_buf.vb, true);
 	get_vdec_decode_parameters(inst);
 	data[0] = bs->size;
 	/*
@@ -734,6 +744,8 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
 
 	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
 	mtk_vcodec_debug(inst, "\n - NALU[%d]", inst->num_nalu);
+
+	inst->ctx->dev->vdec_pdata->cap_to_disp(inst->ctx, fb, 0);
 	return 0;
 
 err_free_fb_out:
-- 
2.25.1


* [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (5 preceding siblings ...)
  2022-02-23  3:39 ` [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-02-25  9:24   ` AngeloGioacchino Del Regno
  2022-03-01 14:34   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C Yunfei Dong
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

The supported output and capture format types for mt8192 differ from
those of mt8183, so the format types need to be obtained from the
decoder capability.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      |   8 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |  13 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 117 +++++++++++++-----
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  13 +-
 4 files changed, 107 insertions(+), 44 deletions(-)
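
Illustration only, not part of the patch: a user-space sketch of
building the format table from a capability bitmask. The bit values, the
sketch_ names and the string labels are assumptions; the real
MTK_VDEC_FORMAT_* definitions are not shown in this patch. The caller
passes the table in, unlike the file-scope arrays used by the patch.

```c
#include <stdint.h>

/* Hypothetical capability bits for the sketch. */
#define SKETCH_CAP_MM21		(1u << 0)
#define SKETCH_CAP_H264_SLICE	(1u << 1)

#define SKETCH_MAX_FORMATS	2

enum sketch_fmt_type {
	SKETCH_FMT_DEC,		/* coded (output queue) format */
	SKETCH_FMT_FRAME,	/* raw (capture queue) format */
};

struct sketch_fmt {
	const char *name;
	enum sketch_fmt_type type;
	unsigned int num_planes;
};

/*
 * Append one entry per capability bit that is set, capture formats
 * first. 'out' must have room for SKETCH_MAX_FORMATS entries. Returns
 * the number of entries written.
 */
static unsigned int sketch_fill_formats(uint32_t cap, struct sketch_fmt *out)
{
	unsigned int n = 0;

	if (cap & SKETCH_CAP_MM21)
		out[n++] = (struct sketch_fmt){ "MM21", SKETCH_FMT_FRAME, 2 };
	if (cap & SKETCH_CAP_H264_SLICE)
		out[n++] = (struct sketch_fmt){ "H264_SLICE", SKETCH_FMT_DEC, 1 };

	return n;
}
```

Because the table is filled in capability order, the index of a given
format depends on which bits are set.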

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 304f5afbd419..bae43938ee37 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -26,7 +26,7 @@ mtk_vdec_find_format(struct v4l2_format *f,
 	const struct mtk_video_fmt *fmt;
 	unsigned int k;
 
-	for (k = 0; k < dec_pdata->num_formats; k++) {
+	for (k = 0; k < *dec_pdata->num_formats; k++) {
 		fmt = &dec_pdata->vdec_formats[k];
 		if (fmt->fourcc == f->fmt.pix_mp.pixelformat)
 			return fmt;
@@ -525,7 +525,7 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 	if (fsize->index != 0)
 		return -EINVAL;
 
-	for (i = 0; i < dec_pdata->num_framesizes; ++i) {
+	for (i = 0; i < *dec_pdata->num_framesizes; ++i) {
 		if (fsize->pixel_format != dec_pdata->vdec_framesizes[i].fourcc)
 			continue;
 
@@ -564,7 +564,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
 	const struct mtk_video_fmt *fmt;
 	int i, j = 0;
 
-	for (i = 0; i < dec_pdata->num_formats; i++) {
+	for (i = 0; i < *dec_pdata->num_formats; i++) {
 		if (output_queue &&
 		    dec_pdata->vdec_formats[i].type != MTK_FMT_DEC)
 			continue;
@@ -577,7 +577,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
 		++j;
 	}
 
-	if (i == dec_pdata->num_formats)
+	if (i == *dec_pdata->num_formats)
 		return -EINVAL;
 
 	fmt = &dec_pdata->vdec_formats[i];
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
index 7966c132be8f..3f33beb9c551 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
@@ -37,7 +37,9 @@ static const struct mtk_video_fmt mtk_video_formats[] = {
 	},
 };
 
-#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
+static const unsigned int num_supported_formats =
+	ARRAY_SIZE(mtk_video_formats);
+
 #define DEFAULT_OUT_FMT_IDX 0
 #define DEFAULT_CAP_FMT_IDX 3
 
@@ -59,7 +61,8 @@ static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
 	},
 };
 
-#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
+static const unsigned int num_supported_framesize =
+	ARRAY_SIZE(mtk_vdec_framesizes);
 
 /*
  * This function tries to clean all display buffers, the buffers will return
@@ -235,7 +238,7 @@ static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
 	unsigned int k;
 
 	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
-	for (k = 0; k < NUM_FORMATS; k++) {
+	for (k = 0; k < num_supported_formats; k++) {
 		fmt = &mtk_video_formats[k];
 		if (fmt->fourcc == pixelformat) {
 			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
@@ -617,11 +620,11 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
 	.vdec_formats = mtk_video_formats,
-	.num_formats = NUM_FORMATS,
+	.num_formats = &num_supported_formats,
 	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
 	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
 	.vdec_framesizes = mtk_vdec_framesizes,
-	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.num_framesizes = &num_supported_framesize,
 	.worker = mtk_vdec_worker,
 	.flush_decoder = mtk_vdec_flush_decoder,
 	.is_subdev_supported = false,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 6d481410bf89..e51d935bd21d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -81,33 +81,23 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
 
 #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
 
-static const struct mtk_video_fmt mtk_video_formats[] = {
-	{
-		.fourcc = V4L2_PIX_FMT_H264_SLICE,
-		.type = MTK_FMT_DEC,
-		.num_planes = 1,
-	},
-	{
-		.fourcc = V4L2_PIX_FMT_MM21,
-		.type = MTK_FMT_FRAME,
-		.num_planes = 2,
-	},
+static struct mtk_video_fmt mtk_video_formats[2];
+static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
+
+static struct mtk_video_fmt default_out_format;
+static struct mtk_video_fmt default_cap_format;
+static unsigned int num_formats;
+static unsigned int num_framesizes;
+
+static struct v4l2_frmsize_stepwise stepwise_fhd = {
+	.min_width = MTK_VDEC_MIN_W,
+	.max_width = MTK_VDEC_MAX_W,
+	.step_width = 16,
+	.min_height = MTK_VDEC_MIN_H,
+	.max_height = MTK_VDEC_MAX_H,
+	.step_height = 16
 };
 
-#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
-#define DEFAULT_OUT_FMT_IDX    0
-#define DEFAULT_CAP_FMT_IDX    1
-
-static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
-	{
-		.fourcc	= V4L2_PIX_FMT_H264_SLICE,
-		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
-				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
-	},
-};
-
-#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
-
 static void mtk_vdec_stateless_out_to_done(struct mtk_vcodec_ctx *ctx,
 					   struct mtk_vcodec_mem *bs, int error)
 {
@@ -350,6 +340,62 @@ const struct media_device_ops mtk_vcodec_media_ops = {
 	.req_queue	= v4l2_m2m_request_queue,
 };
 
+static void mtk_vcodec_add_formats(unsigned int fourcc,
+				   struct mtk_vcodec_ctx *ctx)
+{
+	struct mtk_vcodec_dev *dev = ctx->dev;
+	const struct mtk_vcodec_dec_pdata *pdata = dev->vdec_pdata;
+	int count_formats = *pdata->num_formats;
+	int count_framesizes = *pdata->num_framesizes;
+
+	switch (fourcc) {
+	case V4L2_PIX_FMT_H264_SLICE:
+		mtk_video_formats[count_formats].fourcc = fourcc;
+		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
+		mtk_video_formats[count_formats].num_planes = 1;
+
+		mtk_vdec_framesizes[count_framesizes].fourcc = fourcc;
+		mtk_vdec_framesizes[count_framesizes].stepwise = stepwise_fhd;
+		num_framesizes++;
+		break;
+	case V4L2_PIX_FMT_MM21:
+		mtk_video_formats[count_formats].fourcc = fourcc;
+		mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
+		mtk_video_formats[count_formats].num_planes = 2;
+		break;
+	default:
+		mtk_v4l2_err("Can not add unsupported format type");
+		return;
+	}
+
+	num_formats++;
+	mtk_v4l2_debug(3, "num_formats: %d num_frames:%d dec_capability: 0x%x",
+		       count_formats, count_framesizes, ctx->dev->dec_capability);
+}
+
+static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
+{
+	int cap_format_count = 0, out_format_count = 0;
+
+	if (num_formats && num_framesizes)
+		return;
+
+	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_MM21) {
+		mtk_vcodec_add_formats(V4L2_PIX_FMT_MM21, ctx);
+		cap_format_count++;
+	}
+	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_H264_SLICE) {
+		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
+		out_format_count++;
+	}
+
+	if (cap_format_count)
+		default_cap_format = mtk_video_formats[cap_format_count - 1];
+	if (out_format_count)
+		default_out_format =
+			mtk_video_formats[cap_format_count + out_format_count - 1];
+}
+
 static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
 {
 	struct vb2_queue *src_vq;
@@ -360,6 +406,11 @@ static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
 	if (ctx->dev->vdec_pdata->hw_arch != MTK_VDEC_PURE_SINGLE_CORE)
 		v4l2_m2m_set_dst_buffered(ctx->m2m_ctx, 1);
 
+	if (!ctx->dev->vdec_pdata->is_subdev_supported)
+		ctx->dev->dec_capability |=
+			MTK_VDEC_FORMAT_H264_SLICE | MTK_VDEC_FORMAT_MM21;
+	mtk_vcodec_get_supported_formats(ctx);
+
 	/* Support request api for output plane */
 	src_vq->supports_requests = true;
 	src_vq->requires_requests = true;
@@ -393,11 +444,11 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
 	.vdec_formats = mtk_video_formats,
-	.num_formats = NUM_FORMATS,
-	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
-	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
+	.num_formats = &num_formats,
+	.default_out_fmt = &default_out_format,
+	.default_cap_fmt = &default_cap_format,
 	.vdec_framesizes = mtk_vdec_framesizes,
-	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.num_framesizes = &num_framesizes,
 	.uses_stateless_api = true,
 	.worker = mtk_vdec_worker,
 	.flush_decoder = mtk_vdec_flush_decoder,
@@ -413,11 +464,11 @@ const struct mtk_vcodec_dec_pdata mtk_lat_sig_core_pdata = {
 	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
 	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
 	.vdec_formats = mtk_video_formats,
-	.num_formats = NUM_FORMATS,
-	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
-	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
+	.num_formats = &num_formats,
+	.default_out_fmt = &default_out_format,
+	.default_cap_fmt = &default_cap_format,
 	.vdec_framesizes = mtk_vdec_framesizes,
-	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
+	.num_framesizes = &num_framesizes,
 	.uses_stateless_api = true,
 	.worker = mtk_vdec_worker,
 	.flush_decoder = mtk_vdec_flush_decoder,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 9fcaf69549dd..270c73c05285 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -344,6 +344,15 @@ enum mtk_vdec_hw_arch {
 	MTK_VDEC_LAT_SINGLE_CORE,
 };
 
+/*
+ * enum mtk_vdec_format_types - Bit flags used to get supported
+ *		  format types according to decoder capability
+ */
+enum mtk_vdec_format_types {
+	MTK_VDEC_FORMAT_MM21 = 0x20,
+	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
+};
+
 /**
  * struct mtk_vcodec_dec_pdata - compatible data for each IC
  * @init_vdec_params: init vdec params
@@ -379,12 +388,12 @@ struct mtk_vcodec_dec_pdata {
 	struct vb2_ops *vdec_vb2_ops;
 
 	const struct mtk_video_fmt *vdec_formats;
-	const int num_formats;
+	const int *num_formats;
 	const struct mtk_video_fmt *default_out_fmt;
 	const struct mtk_video_fmt *default_cap_fmt;
 
 	const struct mtk_codec_framesizes *vdec_framesizes;
-	const int num_framesizes;
+	const int *num_framesizes;
 
 	enum mtk_vdec_hw_arch hw_arch;
 
-- 
2.25.1



* [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (6 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-02-25  9:24   ` AngeloGioacchino Del Regno
  2022-02-23  3:40 ` [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability Yunfei Dong
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

The mt8192 decoder needs to use the MediaTek compressed mode (MT21C),
so add MT21C as a supported capture format.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 .../media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c   | 7 ++++++-
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h         | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)
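
A minimal sketch of how the capability bits select the entries of the
format table once MT21C is added. The three MTK_VDEC_FORMAT_* values are
taken from this series; everything prefixed with sketch_/SKETCH_ is a
hypothetical stand-in (the real driver stores V4L2 fourcc codes in
mtk_video_formats[] and keeps the counters in file-scope variables).

```c
#include <stddef.h>
#include <stdint.h>

/* Capability bits as defined by this series in mtk_vcodec_drv.h. */
enum mtk_vdec_format_types {
	MTK_VDEC_FORMAT_MM21 = 0x20,
	MTK_VDEC_FORMAT_MT21C = 0x40,
	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
};

/* Hypothetical stand-ins for the V4L2 fourcc codes (not the real values). */
enum sketch_fourcc {
	SKETCH_FMT_MM21 = 1,
	SKETCH_FMT_MT21C,
	SKETCH_FMT_H264_SLICE,
};

#define SKETCH_MAX_FORMATS 3

struct sketch_format_table {
	enum sketch_fourcc formats[SKETCH_MAX_FORMATS];
	size_t num_formats;
	size_t num_capture;	/* capture formats are stored first */
};

/* Append one format, ignoring it if the table is already full. */
static void sketch_add_format(struct sketch_format_table *t,
			      enum sketch_fourcc fourcc)
{
	if (t->num_formats < SKETCH_MAX_FORMATS)
		t->formats[t->num_formats++] = fourcc;
}

/* Same ordering as mtk_vcodec_get_supported_formats(): capture, then output. */
static void sketch_fill_formats(struct sketch_format_table *t,
				uint32_t dec_capability)
{
	t->num_formats = 0;
	t->num_capture = 0;

	if (dec_capability & MTK_VDEC_FORMAT_MM21) {
		sketch_add_format(t, SKETCH_FMT_MM21);
		t->num_capture++;
	}
	if (dec_capability & MTK_VDEC_FORMAT_MT21C) {
		sketch_add_format(t, SKETCH_FMT_MT21C);
		t->num_capture++;
	}
	if (dec_capability & MTK_VDEC_FORMAT_H264_SLICE)
		sketch_add_format(t, SKETCH_FMT_H264_SLICE);
}
```

With all three bits set the table holds three entries, which is why the
patch grows mtk_video_formats[] from 2 to 3 elements.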

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index e51d935bd21d..9333e3418b98 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -81,7 +81,7 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
 
 #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
 
-static struct mtk_video_fmt mtk_video_formats[2];
+static struct mtk_video_fmt mtk_video_formats[3];
 static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
 
 static struct mtk_video_fmt default_out_format;
@@ -359,6 +359,7 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
 		num_framesizes++;
 		break;
 	case V4L2_PIX_FMT_MM21:
+	case V4L2_PIX_FMT_MT21C:
 		mtk_video_formats[count_formats].fourcc = fourcc;
 		mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
 		mtk_video_formats[count_formats].num_planes = 2;
@@ -384,6 +385,10 @@ static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
 		mtk_vcodec_add_formats(V4L2_PIX_FMT_MM21, ctx);
 		cap_format_count++;
 	}
+	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_MT21C) {
+		mtk_vcodec_add_formats(V4L2_PIX_FMT_MT21C, ctx);
+		cap_format_count++;
+	}
 	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_H264_SLICE) {
 		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
 		out_format_count++;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 270c73c05285..cca0f1dbf581 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -350,6 +350,7 @@ enum mtk_vdec_hw_arch {
  */
 enum mtk_vdec_format_types {
 	MTK_VDEC_FORMAT_MM21 = 0x20,
+	MTK_VDEC_FORMAT_MT21C = 0x40,
 	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
 };
 
-- 
2.25.1



* [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (7 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-03-01 19:02   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 10/15] media: mtk-vcodec: Fix v4l2-compliance fail Yunfei Dong
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

The vp8 decoder does not support 4K, so do not report the 4K frame size
limits for V4L2_PIX_FMT_VP8_FRAME.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
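
A minimal sketch of the resulting condition, assuming it is evaluated per
enumerated pixel format as in vidioc_enum_framesizes(). The constant value
and the enum below are hypothetical placeholders, not the driver's
definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value, chosen only so that the sketch compiles. */
#define SKETCH_CAP_4K_DISABLED	0x10u

/* Hypothetical stand-ins for the V4L2 pixel formats. */
enum sketch_pixfmt {
	SKETCH_FMT_H264,
	SKETCH_FMT_VP8,
	SKETCH_FMT_VP9,
};

/*
 * 4K limits are reported only if the firmware has not disabled 4K
 * and the enumerated format is not vp8.
 */
static bool sketch_4k_allowed(uint32_t dec_capability, enum sketch_pixfmt fmt)
{
	return !(dec_capability & SKETCH_CAP_4K_DISABLED) &&
	       fmt != SKETCH_FMT_VP8;
}
```

When the helper returns false, the frame size limits stay at the stepwise
values from the framesize table instead of being raised to the 4K coded
size.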

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index bae43938ee37..ba188d16f0fb 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -532,7 +532,8 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
 		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
 		if (!(ctx->dev->dec_capability &
-				VCODEC_CAPABILITY_4K_DISABLED)) {
+				VCODEC_CAPABILITY_4K_DISABLED) &&
+				fsize->pixel_format != V4L2_PIX_FMT_VP8_FRAME) {
 			mtk_v4l2_debug(3, "4K is enabled");
 			fsize->stepwise.max_width =
 					VCODEC_DEC_4K_CODED_WIDTH;
-- 
2.25.1



* [PATCH v7, 10/15] media: mtk-vcodec: Fix v4l2-compliance fail
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (8 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-02-23  3:40 ` [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type Yunfei Dong
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Use the default pic info when getting the pic info fails, instead of
returning -EINVAL from s_fmt; otherwise v4l2-compliance fails.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Reviewed-by: Steve Cho <stevecho@chromium.org>
---
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
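
A minimal sketch of the behaviour change, assuming the pic info is seeded
from the requested width and height before the firmware is queried. The
struct, the callback type and both query helpers are hypothetical; the real
code calls vdec_if_get_param() with GET_PARAM_PIC_INFO.

```c
#include <stdio.h>

/* Hypothetical, reduced version of the driver's picture info. */
struct sketch_picinfo {
	unsigned int pic_w;
	unsigned int pic_h;
};

/* Hypothetical firmware query: 0 on success, non-zero on failure. */
typedef int (*sketch_get_param_fn)(struct sketch_picinfo *info);

static int sketch_query_ok(struct sketch_picinfo *info)
{
	(void)info;
	return 0;
}

static int sketch_query_fail(struct sketch_picinfo *info)
{
	(void)info;
	return -1;
}

/*
 * Seed the pic info from the requested size, then try to refine it.
 * A failed query is logged but no longer turns into an error return,
 * so the seeded defaults stay in place.
 */
static int sketch_s_fmt(struct sketch_picinfo *info, unsigned int width,
			unsigned int height, sketch_get_param_fn get_param)
{
	info->pic_w = width;
	info->pic_h = height;

	if (get_param(info))
		fprintf(stderr, "get pic info failed, keeping defaults\n");

	return 0;
}
```

Both the successful and the failing query leave the caller with a usable
pic info and a zero return value.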

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index ba188d16f0fb..5a429ed83ed4 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -478,11 +478,14 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 		ctx->picinfo.pic_w = pix_mp->width;
 		ctx->picinfo.pic_h = pix_mp->height;
 
+		/*
+		 * If getting the pic info fails, fall back to the default pic
+		 * info params; otherwise v4l2-compliance fails.
+		 */
 		ret = vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo);
 		if (ret) {
 			mtk_v4l2_err("[%d]Error!! Get GET_PARAM_PICTURE_INFO Fail",
 				     ctx->id);
-			return -EINVAL;
 		}
 
 		ctx->last_decoded_picinfo = ctx->picinfo;
-- 
2.25.1



* [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (9 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 10/15] media: mtk-vcodec: Fix v4l2-compliance fail Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-02-25  9:24   ` AngeloGioacchino Del Regno
  2022-02-23  3:40 ` [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code Yunfei Dong
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

The capture queue format type differs between platforms. Record it so
that the capture buffer size can be calculated in scp according to the
capture queue format type.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c | 2 ++
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 5a429ed83ed4..6ad17e69e32d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -468,6 +468,8 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 			}
 			ctx->state = MTK_STATE_INIT;
 		}
+	} else {
+		ctx->capture_fourcc = fmt->fourcc;
 	}
 
 	/*
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index cca0f1dbf581..d60561065656 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -274,6 +274,7 @@ struct vdec_pic_info {
  *		     to be used with encoder and stateful decoder.
  * @is_flushing: set to true if flushing is in progress.
  * @current_codec: current set input codec, in V4L2 pixel format
+ * @capture_fourcc: capture queue type in V4L2 pixel format
  *
  * @colorspace: enum v4l2_colorspace; supplemental to pixelformat
  * @ycbcr_enc: enum v4l2_ycbcr_encoding, Y'CbCr encoding
@@ -321,6 +322,7 @@ struct mtk_vcodec_ctx {
 	bool is_flushing;
 
 	u32 current_codec;
+	u32 capture_fourcc;
 
 	enum v4l2_colorspace colorspace;
 	enum v4l2_ycbcr_encoding ycbcr_enc;
-- 
2.25.1



* [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (10 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-03-01 21:30   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192 Yunfei Dong
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Mt8192 can share some common code with mt8183. Move that code to
a new file so that it can be reused.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../mtk-vcodec/vdec/vdec_h264_req_common.c    | 310 +++++++++++++
 .../mtk-vcodec/vdec/vdec_h264_req_common.h    | 253 +++++++++++
 .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 424 ++----------------
 4 files changed, 606 insertions(+), 382 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
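
A minimal userspace sketch of three of the extracted helpers, to make their
arithmetic easy to check. The constants match the new header; the sketch_
functions are hypothetical copies that use plain memset() instead of
memset_io() and are not the driver's exported symbols.

```c
#include <stddef.h>
#include <string.h>

#define MB_UNIT_LEN	16	/* macroblock edge length in pixels */
#define HW_MB_STORE_SZ	64	/* motion vector bytes per macroblock */

/* Motion vector buffer size: one slot per macroblock plus 8 spare slots. */
static unsigned int sketch_mv_buf_size(unsigned int width, unsigned int height)
{
	unsigned int unit_size =
		(width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;

	return HW_MB_STORE_SZ * unit_size;
}

/*
 * Return the length of the start code (3 or 4) at the beginning of data,
 * or -1 if there is none. At least one byte must follow the start code.
 */
static int sketch_find_start_code(const unsigned char *data,
				  unsigned int data_sz)
{
	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
		return 3;

	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
	    data[3] == 1)
		return 4;

	return -1;
}

/* Mark the unused tail of a 32-entry reference list with 0x20. */
static void sketch_fixup_ref_list(unsigned char *ref_list, size_t num_valid)
{
	if (num_valid < 32)
		memset(&ref_list[num_valid], 0x20, 32 - num_valid);
}
```

For a 1920x1088 coded size this gives (120 * 68 + 8) * 64 = 522752 bytes.
Note that a buffer holding only a start code with no payload byte is
reported as -1 because of the strict data_sz comparisons.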

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 359619653a0e..3f41d748eee5 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -9,6 +9,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp8_if.o \
 		vdec/vdec_vp9_if.o \
 		vdec/vdec_h264_req_if.o \
+		vdec/vdec_h264_req_common.o \
 		mtk_vcodec_dec_drv.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
new file mode 100644
index 000000000000..6c68bee632d6
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
@@ -0,0 +1,310 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ * Author: Yunfei Dong <yunfei.dong@mediatek.com>
+ */
+
+#include "vdec_h264_req_common.h"
+
+/* get used parameters for sps/pps */
+#define GET_MTK_VDEC_FLAG(cond, flag) \
+	{ dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
+#define GET_MTK_VDEC_PARAM(param) \
+	{ dst_param->param = src_param->param; }
+
+/*
+ * The firmware expects unused reflist entries to have the value 0x20.
+ */
+void mtk_vdec_h264_fixup_ref_list(u8 *ref_list, size_t num_valid)
+{
+	memset_io(&ref_list[num_valid], 0x20, 32 - num_valid);
+}
+
+void *mtk_vdec_h264_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
+{
+	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
+
+	if (!ctrl)
+		return ERR_PTR(-EINVAL);
+
+	return ctrl->p_cur.p;
+}
+
+void mtk_vdec_h264_fill_dpb_info(struct mtk_vcodec_ctx *ctx,
+				 struct slice_api_h264_decode_param *decode_params,
+				 struct mtk_h264_dpb_info *h264_dpb_info)
+{
+	const struct slice_h264_dpb_entry *dpb;
+	struct vb2_queue *vq;
+	struct vb2_buffer *vb;
+	struct vb2_v4l2_buffer *vb2_v4l2;
+	int index, vb2_index;
+
+	vq = v4l2_m2m_get_vq(ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+
+	for (index = 0; index < V4L2_H264_NUM_DPB_ENTRIES; index++) {
+		dpb = &decode_params->dpb[index];
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
+			h264_dpb_info[index].reference_flag = 0;
+			continue;
+		}
+
+		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
+		if (vb2_index < 0) {
+			dev_err(&ctx->dev->plat_dev->dev,
+				"Reference invalid: dpb_index(%d) reference_ts(%lld)",
+				index, dpb->reference_ts);
+			continue;
+		}
+
+		/* 1 for short term reference, 2 for long term reference */
+		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			h264_dpb_info[index].reference_flag = 1;
+		else
+			h264_dpb_info[index].reference_flag = 2;
+
+		vb = vq->bufs[vb2_index];
+		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
+		h264_dpb_info[index].field = vb2_v4l2->field;
+
+		h264_dpb_info[index].y_dma_addr =
+			vb2_dma_contig_plane_dma_addr(vb, 0);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			h264_dpb_info[index].c_dma_addr =
+				vb2_dma_contig_plane_dma_addr(vb, 1);
+		else
+			h264_dpb_info[index].c_dma_addr =
+				h264_dpb_info[index].y_dma_addr +
+				ctx->picinfo.fb_sz[0];
+	}
+}
+
+void mtk_vdec_h264_copy_sps_params(struct mtk_h264_sps_param *dst_param,
+				   const struct v4l2_ctrl_h264_sps *src_param)
+{
+	GET_MTK_VDEC_PARAM(chroma_format_idc);
+	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
+	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
+	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
+	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
+	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
+	GET_MTK_VDEC_PARAM(max_num_ref_frames);
+	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
+	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
+
+	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
+			  V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
+	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
+			  V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
+	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
+			  V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
+	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
+			  V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
+	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
+			  V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
+	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
+			  V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
+}
+
+void mtk_vdec_h264_copy_pps_params(struct mtk_h264_pps_param *dst_param,
+				   const struct v4l2_ctrl_h264_pps *src_param)
+{
+	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
+	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
+	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
+	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
+	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
+	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
+
+	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
+			  V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
+	GET_MTK_VDEC_FLAG(pic_order_present_flag,
+			  V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
+	GET_MTK_VDEC_FLAG(weighted_pred_flag,
+			  V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
+	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
+			  V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
+	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
+			  V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
+	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
+			  V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
+	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
+			  V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
+	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
+			  V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
+}
+
+void mtk_vdec_h264_copy_slice_hd_params(struct mtk_h264_slice_hd_param *dst_param,
+					const struct v4l2_ctrl_h264_slice_params *src_param,
+					const struct v4l2_ctrl_h264_decode_params *dec_param)
+{
+	int temp;
+
+	GET_MTK_VDEC_PARAM(first_mb_in_slice);
+	GET_MTK_VDEC_PARAM(slice_type);
+	GET_MTK_VDEC_PARAM(cabac_init_idc);
+	GET_MTK_VDEC_PARAM(slice_qp_delta);
+	GET_MTK_VDEC_PARAM(disable_deblocking_filter_idc);
+	GET_MTK_VDEC_PARAM(slice_alpha_c0_offset_div2);
+	GET_MTK_VDEC_PARAM(slice_beta_offset_div2);
+	GET_MTK_VDEC_PARAM(num_ref_idx_l0_active_minus1);
+	GET_MTK_VDEC_PARAM(num_ref_idx_l1_active_minus1);
+
+	dst_param->frame_num = dec_param->frame_num;
+	dst_param->pic_order_cnt_lsb = dec_param->pic_order_cnt_lsb;
+
+	dst_param->delta_pic_order_cnt_bottom =
+		dec_param->delta_pic_order_cnt_bottom;
+	dst_param->delta_pic_order_cnt0 =
+		dec_param->delta_pic_order_cnt0;
+	dst_param->delta_pic_order_cnt1 =
+		dec_param->delta_pic_order_cnt1;
+
+	temp = dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC;
+	dst_param->field_pic_flag = temp ? 1 : 0;
+
+	temp = dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD;
+	dst_param->bottom_field_flag = temp ? 1 : 0;
+
+	GET_MTK_VDEC_FLAG(direct_spatial_mv_pred_flag,
+			  V4L2_H264_SLICE_FLAG_DIRECT_SPATIAL_MV_PRED);
+}
+
+void mtk_vdec_h264_copy_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
+				       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
+{
+	memcpy_toio(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
+		    sizeof(dst_matrix->scaling_list_4x4));
+
+	memcpy_toio(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
+		    sizeof(dst_matrix->scaling_list_8x8));
+}
+
+void
+mtk_vdec_h264_copy_decode_params(struct slice_api_h264_decode_param *dst_params,
+				 const struct v4l2_ctrl_h264_decode_params *src_params,
+				 const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
+{
+	struct slice_h264_dpb_entry *dst_entry;
+	const struct v4l2_h264_dpb_entry *src_entry;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
+		dst_entry = &dst_params->dpb[i];
+		src_entry = &dpb[i];
+
+		dst_entry->reference_ts = src_entry->reference_ts;
+		dst_entry->frame_num = src_entry->frame_num;
+		dst_entry->pic_num = src_entry->pic_num;
+		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
+		dst_entry->bottom_field_order_cnt =
+			src_entry->bottom_field_order_cnt;
+		dst_entry->flags = src_entry->flags;
+	}
+
+	/* num_slices is a leftover from the old H.264 support and is ignored
+	 * by the firmware.
+	 */
+	dst_params->num_slices = 0;
+	dst_params->nal_ref_idc = src_params->nal_ref_idc;
+	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
+	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
+	dst_params->flags = src_params->flags;
+}
+
+static bool mtk_vdec_h264_dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
+					  const struct v4l2_h264_dpb_entry *b)
+{
+	return a->top_field_order_cnt == b->top_field_order_cnt &&
+	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+}
+
+/*
+ * Move DPB entries of dec_param that refer to a frame already existing in dpb
+ * into the already existing slot in dpb, and move other entries into new slots.
+ *
+ * This function is an adaptation of the similarly-named function in
+ * hantro_h264.c.
+ */
+void mtk_vdec_h264_update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
+			      struct v4l2_h264_dpb_entry *dpb)
+{
+	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	unsigned int i, j;
+
+	/* Disable all entries by default, and mark the ones in use. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+			set_bit(i, in_use);
+		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+	}
+
+	/* Try to match new DPB entries with existing ones by their POCs. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		/*
+		 * To cut off some comparisons, iterate only on target DPB
+		 * entries that were already in use.
+		 */
+		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
+			struct v4l2_h264_dpb_entry *cdpb;
+
+			cdpb = &dpb[j];
+			if (!mtk_vdec_h264_dpb_entry_match(cdpb, ndpb))
+				continue;
+
+			*cdpb = *ndpb;
+			set_bit(j, used);
+			/* Don't reiterate on this one. */
+			clear_bit(j, in_use);
+			break;
+		}
+
+		if (j == ARRAY_SIZE(dec_param->dpb))
+			set_bit(i, new);
+	}
+
+	/* For entries that could not be matched, use remaining free slots. */
+	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+		struct v4l2_h264_dpb_entry *cdpb;
+
+		/*
+		 * Both arrays are of the same size, so there is no way
+		 * we can end up with no space in target array, unless
+		 * something is buggy.
+		 */
+		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
+		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
+			return;
+
+		cdpb = &dpb[j];
+		*cdpb = *ndpb;
+		set_bit(j, used);
+	}
+}
+
+unsigned int mtk_vdec_h264_get_mv_buf_size(unsigned int width, unsigned int height)
+{
+	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
+
+	return HW_MB_STORE_SZ * unit_size;
+}
+
+int mtk_vdec_h264_find_start_code(unsigned char *data, unsigned int data_sz)
+{
+	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
+		return 3;
+
+	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
+	    data[3] == 1)
+		return 4;
+
+	return -1;
+}
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
new file mode 100644
index 000000000000..2d731bc777ca
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
@@ -0,0 +1,253 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ * Author: Yunfei Dong <yunfei.dong@mediatek.com>
+ */
+
+#ifndef _VDEC_H264_REQ_COMMON_H_
+#define _VDEC_H264_REQ_COMMON_H_
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <media/v4l2-h264.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "../mtk_vcodec_drv.h"
+
+#define NAL_NON_IDR_SLICE			0x01
+#define NAL_IDR_SLICE				0x05
+#define NAL_TYPE(value)				((value) & 0x1F)
+
+#define BUF_PREDICTION_SZ			(64 * 4096)
+#define MB_UNIT_LEN				16
+
+/* motion vector size (bytes) for every macro block */
+#define HW_MB_STORE_SZ				64
+
+#define H264_MAX_MV_NUM				32
+
+/**
+ * struct mtk_h264_dpb_info  - h264 dpb information
+ * @y_dma_addr: Y bitstream physical address
+ * @c_dma_addr: CbCr bitstream physical address
+ * @reference_flag: reference picture flag (short/long term reference picture)
+ * @field: field picture flag
+ */
+struct mtk_h264_dpb_info {
+	dma_addr_t y_dma_addr;
+	dma_addr_t c_dma_addr;
+	int reference_flag;
+	int field;
+};
+
+/**
+ * struct mtk_h264_sps_param  - parameters for sps
+ */
+struct mtk_h264_sps_param {
+	unsigned char chroma_format_idc;
+	unsigned char bit_depth_luma_minus8;
+	unsigned char bit_depth_chroma_minus8;
+	unsigned char log2_max_frame_num_minus4;
+	unsigned char pic_order_cnt_type;
+	unsigned char log2_max_pic_order_cnt_lsb_minus4;
+	unsigned char max_num_ref_frames;
+	unsigned char separate_colour_plane_flag;
+	unsigned short pic_width_in_mbs_minus1;
+	unsigned short pic_height_in_map_units_minus1;
+	unsigned int max_frame_nums;
+	unsigned char qpprime_y_zero_transform_bypass_flag;
+	unsigned char delta_pic_order_always_zero_flag;
+	unsigned char frame_mbs_only_flag;
+	unsigned char mb_adaptive_frame_field_flag;
+	unsigned char direct_8x8_inference_flag;
+	unsigned char reserved[3];
+};
+
+/**
+ * struct mtk_h264_pps_param  - parameters for pps
+ */
+struct mtk_h264_pps_param {
+	unsigned char num_ref_idx_l0_default_active_minus1;
+	unsigned char num_ref_idx_l1_default_active_minus1;
+	unsigned char weighted_bipred_idc;
+	char pic_init_qp_minus26;
+	char chroma_qp_index_offset;
+	char second_chroma_qp_index_offset;
+	unsigned char entropy_coding_mode_flag;
+	unsigned char pic_order_present_flag;
+	unsigned char deblocking_filter_control_present_flag;
+	unsigned char constrained_intra_pred_flag;
+	unsigned char weighted_pred_flag;
+	unsigned char redundant_pic_cnt_present_flag;
+	unsigned char transform_8x8_mode_flag;
+	unsigned char scaling_matrix_present_flag;
+	unsigned char reserved[2];
+};
+
+/**
+ * struct mtk_h264_slice_hd_param  - parameters for slice header
+ */
+struct mtk_h264_slice_hd_param {
+	unsigned int first_mb_in_slice;
+	unsigned int field_pic_flag;
+	unsigned int slice_type;
+	unsigned int frame_num;
+	int pic_order_cnt_lsb;
+	int delta_pic_order_cnt_bottom;
+	unsigned int bottom_field_flag;
+	unsigned int direct_spatial_mv_pred_flag;
+	int delta_pic_order_cnt0;
+	int delta_pic_order_cnt1;
+	unsigned int cabac_init_idc;
+	int slice_qp_delta;
+	unsigned int disable_deblocking_filter_idc;
+	int slice_alpha_c0_offset_div2;
+	int slice_beta_offset_div2;
+	unsigned int num_ref_idx_l0_active_minus1;
+	unsigned int num_ref_idx_l1_active_minus1;
+	unsigned int reserved;
+};
+
+struct slice_api_h264_scaling_matrix {
+	unsigned char scaling_list_4x4[6][16];
+	unsigned char scaling_list_8x8[6][64];
+};
+
+struct slice_h264_dpb_entry {
+	unsigned long long reference_ts;
+	unsigned short frame_num;
+	unsigned short pic_num;
+	/* Note that field is indicated by v4l2_buffer.field */
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
+};
+
+/**
+ * struct slice_api_h264_decode_param - parameters for decode.
+ */
+struct slice_api_h264_decode_param {
+	struct slice_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES];
+	unsigned short num_slices;
+	unsigned short nal_ref_idc;
+	unsigned char ref_pic_list_p0[32];
+	unsigned char ref_pic_list_b0[32];
+	unsigned char ref_pic_list_b1[32];
+	int top_field_order_cnt;
+	int bottom_field_order_cnt;
+	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
+};
+
+/**
+ * struct h264_fb - h264 decode frame buffer information
+ * @vdec_fb_va  : virtual address of struct vdec_fb
+ * @y_fb_dma    : dma address of Y frame buffer (luma)
+ * @c_fb_dma    : dma address of C frame buffer (chroma)
+ * @poc         : picture order count of frame buffer
+ * @reserved    : for 8 bytes alignment
+ */
+struct h264_fb {
+	u64 vdec_fb_va;
+	u64 y_fb_dma;
+	u64 c_fb_dma;
+	s32 poc;
+	u32 reserved;
+};
+
+/**
+ * mtk_vdec_h264_fixup_ref_list - fixup unused reference to 0x20.
+ * @ref_list: reference picture list
+ * @num_valid: used reference number
+ */
+void mtk_vdec_h264_fixup_ref_list(u8 *ref_list, size_t num_valid);
+
+/**
+ * mtk_vdec_h264_get_ctrl_ptr - get the payload address of a CID control.
+ * @ctx: v4l2 ctx
+ * @id: CID control ID
+ */
+void *mtk_vdec_h264_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id);
+
+/**
+ * mtk_vdec_h264_fill_dpb_info - fill the dpb buffer information.
+ * @ctx: v4l2 ctx
+ * @decode_params: slice decode params
+ * @h264_dpb_info: dpb buffer information
+ */
+void mtk_vdec_h264_fill_dpb_info(struct mtk_vcodec_ctx *ctx,
+				 struct slice_api_h264_decode_param *decode_params,
+				 struct mtk_h264_dpb_info *h264_dpb_info);
+
+/**
+ * mtk_vdec_h264_copy_sps_params - copy sps params.
+ * @dst_param: sps params for hw decoder
+ * @src_param: sps params from user space
+ */
+void mtk_vdec_h264_copy_sps_params(struct mtk_h264_sps_param *dst_param,
+				   const struct v4l2_ctrl_h264_sps *src_param);
+
+/**
+ * mtk_vdec_h264_copy_pps_params - copy pps params.
+ * @dst_param: pps params for hw decoder
+ * @src_param: pps params from user space
+ */
+void mtk_vdec_h264_copy_pps_params(struct mtk_h264_pps_param *dst_param,
+				   const struct v4l2_ctrl_h264_pps *src_param);
+
+/**
+ * mtk_vdec_h264_copy_slice_hd_params - copy slice header params.
+ * @dst_param: slice params for hw decoder
+ * @src_param: slice params from user space
+ * @dec_param: decode params from user space
+ */
+void mtk_vdec_h264_copy_slice_hd_params(struct mtk_h264_slice_hd_param *dst_param,
+					const struct v4l2_ctrl_h264_slice_params *src_param,
+					const struct v4l2_ctrl_h264_decode_params *dec_param);
+
+/**
+ * mtk_vdec_h264_copy_scaling_matrix - copy scaling list params.
+ * @dst_matrix: scaling list params for hw decoder
+ * @src_matrix: scaling list params from user space
+ */
+void mtk_vdec_h264_copy_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
+				       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix);
+
+/**
+ * mtk_vdec_h264_copy_decode_params - copy decode params.
+ * @dst_params: dst params for hw decoder
+ * @src_params: decode params from user space
+ * @dpb: dpb information
+ */
+void
+mtk_vdec_h264_copy_decode_params(struct slice_api_h264_decode_param *dst_params,
+				 const struct v4l2_ctrl_h264_decode_params *src_params,
+				 const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES]);
+
+/**
+ * mtk_vdec_h264_update_dpb - update dpb list.
+ * @dec_param: v4l2 control decode params
+ * @dpb: dpb entry information
+ */
+void mtk_vdec_h264_update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
+			      struct v4l2_h264_dpb_entry *dpb);
+
+/**
+ * mtk_vdec_h264_find_start_code - find h264 start code using software.
+ * @data: input buffer address
+ * @data_sz: input buffer size
+ *
+ * Return: returns start code position.
+ */
+int mtk_vdec_h264_find_start_code(unsigned char *data, unsigned int data_sz);
+
+/**
+ * mtk_vdec_h264_get_mv_buf_size - get mv buffer size.
+ * @width: picture width
+ * @height: picture height
+ *
+ * Return: returns mv buffer size.
+ */
+unsigned int mtk_vdec_h264_get_mv_buf_size(unsigned int width, unsigned int height);
+
+#endif
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
index 36f3dc1fbe3b..87e0b2f95572 100644
--- a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
@@ -12,109 +12,7 @@
 #include "../vdec_drv_base.h"
 #include "../vdec_drv_if.h"
 #include "../vdec_vpu_if.h"
-
-#define BUF_PREDICTION_SZ			(64 * 4096)
-#define MB_UNIT_LEN				16
-
-/* get used parameters for sps/pps */
-#define GET_MTK_VDEC_FLAG(cond, flag) \
-	{ dst_param->cond = ((src_param->flags & (flag)) ? (1) : (0)); }
-#define GET_MTK_VDEC_PARAM(param) \
-	{ dst_param->param = src_param->param; }
-/* motion vector size (bytes) for every macro block */
-#define HW_MB_STORE_SZ				64
-
-#define H264_MAX_FB_NUM				17
-#define H264_MAX_MV_NUM				32
-#define HDR_PARSING_BUF_SZ			1024
-
-/**
- * struct mtk_h264_dpb_info  - h264 dpb information
- * @y_dma_addr: Y bitstream physical address
- * @c_dma_addr: CbCr bitstream physical address
- * @reference_flag: reference picture flag (short/long term reference picture)
- * @field: field picture flag
- */
-struct mtk_h264_dpb_info {
-	dma_addr_t y_dma_addr;
-	dma_addr_t c_dma_addr;
-	int reference_flag;
-	int field;
-};
-
-/*
- * struct mtk_h264_sps_param  - parameters for sps
- */
-struct mtk_h264_sps_param {
-	unsigned char chroma_format_idc;
-	unsigned char bit_depth_luma_minus8;
-	unsigned char bit_depth_chroma_minus8;
-	unsigned char log2_max_frame_num_minus4;
-	unsigned char pic_order_cnt_type;
-	unsigned char log2_max_pic_order_cnt_lsb_minus4;
-	unsigned char max_num_ref_frames;
-	unsigned char separate_colour_plane_flag;
-	unsigned short pic_width_in_mbs_minus1;
-	unsigned short pic_height_in_map_units_minus1;
-	unsigned int max_frame_nums;
-	unsigned char qpprime_y_zero_transform_bypass_flag;
-	unsigned char delta_pic_order_always_zero_flag;
-	unsigned char frame_mbs_only_flag;
-	unsigned char mb_adaptive_frame_field_flag;
-	unsigned char direct_8x8_inference_flag;
-	unsigned char reserved[3];
-};
-
-/*
- * struct mtk_h264_pps_param  - parameters for pps
- */
-struct mtk_h264_pps_param {
-	unsigned char num_ref_idx_l0_default_active_minus1;
-	unsigned char num_ref_idx_l1_default_active_minus1;
-	unsigned char weighted_bipred_idc;
-	char pic_init_qp_minus26;
-	char chroma_qp_index_offset;
-	char second_chroma_qp_index_offset;
-	unsigned char entropy_coding_mode_flag;
-	unsigned char pic_order_present_flag;
-	unsigned char deblocking_filter_control_present_flag;
-	unsigned char constrained_intra_pred_flag;
-	unsigned char weighted_pred_flag;
-	unsigned char redundant_pic_cnt_present_flag;
-	unsigned char transform_8x8_mode_flag;
-	unsigned char scaling_matrix_present_flag;
-	unsigned char reserved[2];
-};
-
-struct slice_api_h264_scaling_matrix {
-	unsigned char scaling_list_4x4[6][16];
-	unsigned char scaling_list_8x8[6][64];
-};
-
-struct slice_h264_dpb_entry {
-	unsigned long long reference_ts;
-	unsigned short frame_num;
-	unsigned short pic_num;
-	/* Note that field is indicated by v4l2_buffer.field */
-	int top_field_order_cnt;
-	int bottom_field_order_cnt;
-	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
-};
-
-/*
- * struct slice_api_h264_decode_param - parameters for decode.
- */
-struct slice_api_h264_decode_param {
-	struct slice_h264_dpb_entry dpb[16];
-	unsigned short num_slices;
-	unsigned short nal_ref_idc;
-	unsigned char ref_pic_list_p0[32];
-	unsigned char ref_pic_list_b0[32];
-	unsigned char ref_pic_list_b1[32];
-	int top_field_order_cnt;
-	int bottom_field_order_cnt;
-	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
-};
+#include "vdec_h264_req_common.h"
 
 /*
  * struct mtk_h264_dec_slice_param  - parameters for decode current frame
@@ -127,22 +25,6 @@ struct mtk_h264_dec_slice_param {
 	struct mtk_h264_dpb_info h264_dpb_info[16];
 };
 
-/**
- * struct h264_fb - h264 decode frame buffer information
- * @vdec_fb_va  : virtual address of struct vdec_fb
- * @y_fb_dma    : dma address of Y frame buffer (luma)
- * @c_fb_dma    : dma address of C frame buffer (chroma)
- * @poc         : picture order count of frame buffer
- * @reserved    : for 8 bytes alignment
- */
-struct h264_fb {
-	u64 vdec_fb_va;
-	u64 y_fb_dma;
-	u64 c_fb_dma;
-	s32 poc;
-	u32 reserved;
-};
-
 /**
  * struct vdec_h264_dec_info - decode information
  * @dpb_sz		: decoding picture buffer size
@@ -212,265 +94,45 @@ struct vdec_h264_slice_inst {
 	struct v4l2_h264_dpb_entry dpb[16];
 };
 
-static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
-{
-	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
-
-	return ctrl->p_cur.p;
-}
-
-static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
-			      struct mtk_h264_dec_slice_param *slice_param)
-{
-	struct vb2_queue *vq;
-	struct vb2_buffer *vb;
-	struct vb2_v4l2_buffer *vb2_v4l2;
-	u64 index;
-
-	vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
-
-	for (index = 0; index < ARRAY_SIZE(slice_param->decode_params.dpb); index++) {
-		const struct slice_h264_dpb_entry *dpb;
-		int vb2_index;
-
-		dpb = &slice_param->decode_params.dpb[index];
-		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
-			slice_param->h264_dpb_info[index].reference_flag = 0;
-			continue;
-		}
-
-		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
-		if (vb2_index < 0) {
-			mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
-				       index, dpb->reference_ts);
-			continue;
-		}
-		/* 1 for short term reference, 2 for long term reference */
-		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
-			slice_param->h264_dpb_info[index].reference_flag = 1;
-		else
-			slice_param->h264_dpb_info[index].reference_flag = 2;
-
-		vb = vq->bufs[vb2_index];
-		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
-		slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
-
-		slice_param->h264_dpb_info[index].y_dma_addr =
-			vb2_dma_contig_plane_dma_addr(vb, 0);
-		if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
-			slice_param->h264_dpb_info[index].c_dma_addr =
-				vb2_dma_contig_plane_dma_addr(vb, 1);
-		}
-	}
-}
-
-static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
-				    const struct v4l2_ctrl_h264_sps *src_param)
-{
-	GET_MTK_VDEC_PARAM(chroma_format_idc);
-	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
-	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
-	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
-	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
-	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
-	GET_MTK_VDEC_PARAM(max_num_ref_frames);
-	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
-	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
-
-	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
-			  V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
-	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
-			  V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
-	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
-			  V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
-	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
-			  V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
-	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
-			  V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
-	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
-			  V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
-}
-
-static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
-				    const struct v4l2_ctrl_h264_pps *src_param)
+static int get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
 {
-	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
-	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
-	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
-	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
-	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
-	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
-
-	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
-			  V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
-	GET_MTK_VDEC_FLAG(pic_order_present_flag,
-			  V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
-	GET_MTK_VDEC_FLAG(weighted_pred_flag,
-			  V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
-	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
-			  V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
-	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
-			  V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
-	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
-			  V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
-	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
-			  V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
-	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
-			  V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
-}
-
-static void
-get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
-			const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
-{
-	memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
-	       sizeof(dst_matrix->scaling_list_4x4));
-
-	memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
-	       sizeof(dst_matrix->scaling_list_8x8));
-}
-
-static void
-get_h264_decode_parameters(struct slice_api_h264_decode_param *dst_params,
-			   const struct v4l2_ctrl_h264_decode_params *src_params,
-			   const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
-		struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
-		const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
-
-		dst_entry->reference_ts = src_entry->reference_ts;
-		dst_entry->frame_num = src_entry->frame_num;
-		dst_entry->pic_num = src_entry->pic_num;
-		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
-		dst_entry->bottom_field_order_cnt =
-			src_entry->bottom_field_order_cnt;
-		dst_entry->flags = src_entry->flags;
-	}
-
-	/*
-	 * num_slices is a leftover from the old H.264 support and is ignored
-	 * by the firmware.
-	 */
-	dst_params->num_slices = 0;
-	dst_params->nal_ref_idc = src_params->nal_ref_idc;
-	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
-	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
-	dst_params->flags = src_params->flags;
-}
-
-static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
-			    const struct v4l2_h264_dpb_entry *b)
-{
-	return a->top_field_order_cnt == b->top_field_order_cnt &&
-	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
-}
-
-/*
- * Move DPB entries of dec_param that refer to a frame already existing in dpb
- * into the already existing slot in dpb, and move other entries into new slots.
- *
- * This function is an adaptation of the similarly-named function in
- * hantro_h264.c.
- */
-static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
-		       struct v4l2_h264_dpb_entry *dpb)
-{
-	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
-	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
-	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
-	unsigned int i, j;
-
-	/* Disable all entries by default, and mark the ones in use. */
-	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
-		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
-			set_bit(i, in_use);
-		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
-	}
-
-	/* Try to match new DPB entries with existing ones by their POCs. */
-	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
-		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
-
-		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
-			continue;
-
-		/*
-		 * To cut off some comparisons, iterate only on target DPB
-		 * entries were already used.
-		 */
-		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
-			struct v4l2_h264_dpb_entry *cdpb;
-
-			cdpb = &dpb[j];
-			if (!dpb_entry_match(cdpb, ndpb))
-				continue;
-
-			*cdpb = *ndpb;
-			set_bit(j, used);
-			/* Don't reiterate on this one. */
-			clear_bit(j, in_use);
-			break;
-		}
-
-		if (j == ARRAY_SIZE(dec_param->dpb))
-			set_bit(i, new);
-	}
-
-	/* For entries that could not be matched, use remaining free slots. */
-	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
-		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
-		struct v4l2_h264_dpb_entry *cdpb;
-
-		/*
-		 * Both arrays are of the same sizes, so there is no way
-		 * we can end up with no space in target array, unless
-		 * something is buggy.
-		 */
-		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
-		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
-			return;
-
-		cdpb = &dpb[j];
-		*cdpb = *ndpb;
-		set_bit(j, used);
-	}
-}
-
-/*
- * The firmware expects unused reflist entries to have the value 0x20.
- */
-static void fixup_ref_list(u8 *ref_list, size_t num_valid)
-{
-	memset(&ref_list[num_valid], 0x20, 32 - num_valid);
-}
-
-static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
-{
-	const struct v4l2_ctrl_h264_decode_params *dec_params =
-		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
-	const struct v4l2_ctrl_h264_sps *sps =
-		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
-	const struct v4l2_ctrl_h264_pps *pps =
-		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
-	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
-		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
+	const struct v4l2_ctrl_h264_decode_params *dec_params;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix;
 	struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
 	struct v4l2_h264_reflist_builder reflist_builder;
 	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
 	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
 	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
 
-	update_dpb(dec_params, inst->dpb);
+	dec_params =
+		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
+	if (IS_ERR(dec_params))
+		return PTR_ERR(dec_params);
+
+	sps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
+	if (IS_ERR(sps))
+		return PTR_ERR(sps);
+
+	pps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
+	if (IS_ERR(pps))
+		return PTR_ERR(pps);
 
-	get_h264_sps_parameters(&slice_param->sps, sps);
-	get_h264_pps_parameters(&slice_param->pps, pps);
-	get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
-	get_h264_decode_parameters(&slice_param->decode_params, dec_params,
-				   inst->dpb);
-	get_h264_dpb_list(inst, slice_param);
+	scaling_matrix =
+		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
+	if (IS_ERR(scaling_matrix))
+		return PTR_ERR(scaling_matrix);
+
+	mtk_vdec_h264_update_dpb(dec_params, inst->dpb);
+
+	mtk_vdec_h264_copy_sps_params(&slice_param->sps, sps);
+	mtk_vdec_h264_copy_pps_params(&slice_param->pps, pps);
+	mtk_vdec_h264_copy_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
+	mtk_vdec_h264_copy_decode_params(&slice_param->decode_params,
+					 dec_params, inst->dpb);
+	mtk_vdec_h264_fill_dpb_info(inst->ctx, &slice_param->decode_params,
+				    slice_param->h264_dpb_info);
 
 	/* Build the reference lists */
 	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
@@ -478,19 +140,14 @@ static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
 	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
 	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
 	/* Adapt the built lists to the firmware's expectations */
-	fixup_ref_list(p0_reflist, reflist_builder.num_valid);
-	fixup_ref_list(b0_reflist, reflist_builder.num_valid);
-	fixup_ref_list(b1_reflist, reflist_builder.num_valid);
+	mtk_vdec_h264_fixup_ref_list(p0_reflist, reflist_builder.num_valid);
+	mtk_vdec_h264_fixup_ref_list(b0_reflist, reflist_builder.num_valid);
+	mtk_vdec_h264_fixup_ref_list(b1_reflist, reflist_builder.num_valid);
 
 	memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
 	       sizeof(inst->vsi_ctx.h264_slice_params));
-}
 
-static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
-{
-	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
-
-	return HW_MB_STORE_SZ * unit_size;
+	return 0;
 }
 
 static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
@@ -525,7 +182,7 @@ static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
 	int i;
 	int err;
 	struct mtk_vcodec_mem *mem = NULL;
-	unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
+	unsigned int buf_sz = mtk_vdec_h264_get_mv_buf_size(pic->buf_w, pic->buf_h);
 
 	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
 	for (i = 0; i < H264_MAX_MV_NUM; i++) {
@@ -674,7 +331,7 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
 {
 	struct vdec_h264_slice_inst *inst = h_vdec;
 	const struct v4l2_ctrl_h264_decode_params *dec_params =
-		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
+		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
 	struct vdec_vpu_inst *vpu = &inst->vpu;
 	struct mtk_video_dec_buf *src_buf_info;
 	struct mtk_video_dec_buf *dst_buf_info;
@@ -706,7 +363,10 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
 
 	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
 				   &dst_buf_info->m2m_buf.vb, true);
-	get_vdec_decode_parameters(inst);
+	err = get_vdec_decode_parameters(inst);
+	if (err)
+		goto err_free_fb_out;
+
 	data[0] = bs->size;
 	/*
 	 * Reconstruct the first byte of the NAL unit, as the firmware requests
-- 
2.25.1
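
For reference, the two smallest helpers this patch moves out of
vdec_h264_req_if.c can be tried in user space. The sketch below mirrors
the removed fixup_ref_list() and get_mv_buf_size() bodies shown in the
diff above. The sketch_ prefixed names, the REF_LIST_LEN constant and the
bounds guard are assumptions of this sketch and are not part of the
driver. The 0x20 filler, the 16-pixel macroblock unit and the 64 bytes
per macroblock are taken from the removed code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REF_LIST_LEN	32	/* hypothetical name; the driver hard-codes 32 */
#define MB_UNIT_LEN	16	/* macroblock edge in pixels, as in the removed code */
#define HW_MB_STORE_SZ	64	/* motion vector bytes per macroblock, as in the removed code */

/* Mark every entry from num_valid onwards as unused (0x20). */
static void sketch_fixup_ref_list(uint8_t *ref_list, size_t num_valid)
{
	if (num_valid > REF_LIST_LEN)	/* guard added for this sketch only */
		num_valid = REF_LIST_LEN;
	memset(&ref_list[num_valid], 0x20, REF_LIST_LEN - num_valid);
}

/* One 64-byte slot per 16x16 macroblock, plus 8 extra slots. */
static unsigned int sketch_mv_buf_size(unsigned int width, unsigned int height)
{
	unsigned int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;

	return HW_MB_STORE_SZ * unit_size;
}
```

With a 1920x1088 buffer this gives (120 * 68 + 8) * 64 = 522752 bytes per
mv buffer. With num_valid = 3, entries 3 to 31 of the list are set to 0x20
and entries 0 to 2 are left untouched.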


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (11 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-03-01 22:01   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding Yunfei Dong
  2022-02-23  3:40 ` [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding Yunfei Dong
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Add the H.264 stateless decoder driver for the mt8192 lat and core
architecture. The decode mode is frame based.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
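Note for reviewers: the picture-info handling that this patch adds in
vdec_h264_slice_get_pic_info() can be sketched in user space as below.
It is only an illustration under stated assumptions: the sketch_ prefixed
types and functions are hypothetical stand-ins for the driver structures,
and SKETCH_ALIGN imitates the kernel's ALIGN() for power-of-two
alignments. The 64-pixel alignment and the two-level change check follow
the code in this patch.

```c
#include <stdbool.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define SKETCH_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

struct sketch_pic_info {	/* hypothetical stand-in for the driver's picinfo */
	unsigned int pic_w, pic_h;	/* visible picture size */
	unsigned int buf_w, buf_h;	/* 64-aligned buffer size */
};

struct sketch_change {
	bool resolution_changed;
	bool realloc_mv_buf;
};

/* Derive the buffer size from the visible size. */
static void sketch_set_buf_size(struct sketch_pic_info *p)
{
	p->buf_w = SKETCH_ALIGN(p->pic_w, 64u);
	p->buf_h = SKETCH_ALIGN(p->pic_h, 64u);
}

static struct sketch_change sketch_check_change(const struct sketch_pic_info *last,
						const struct sketch_pic_info *cur)
{
	struct sketch_change c = { false, false };

	if (last->pic_w != cur->pic_w || last->pic_h != cur->pic_h) {
		c.resolution_changed = true;
		/* mv buffers are only reallocated when the aligned size differs too */
		if (last->buf_w != cur->buf_w || last->buf_h != cur->buf_h)
			c.realloc_mv_buf = true;
	}
	return c;
}
```

Going from 1920x1080 to 1920x1088 flags a resolution change but keeps the
mv buffers, because both sizes align to 1920x1088. Going from 1920x1080 to
1280x720 (aligned to 1280x768) sets both flags.
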
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  | 621 ++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   8 +-
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
 include/linux/remoteproc/mtk_scp.h            |   2 +
 5 files changed, 632 insertions(+), 1 deletion(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 3f41d748eee5..22edb1c86598 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -10,6 +10,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp9_if.o \
 		vdec/vdec_h264_req_if.o \
 		vdec/vdec_h264_req_common.o \
+		vdec/vdec_h264_req_multi_if.o \
 		mtk_vcodec_dec_drv.o \
 		vdec_drv_if.o \
 		vdec_vpu_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
new file mode 100644
index 000000000000..82a279f327c4
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
@@ -0,0 +1,621 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ * Author: Yunfei Dong <yunfei.dong@mediatek.com>
+ */
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <media/v4l2-h264.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "../mtk_vcodec_util.h"
+#include "../mtk_vcodec_dec.h"
+#include "../mtk_vcodec_intr.h"
+#include "../vdec_drv_base.h"
+#include "../vdec_drv_if.h"
+#include "../vdec_vpu_if.h"
+#include "vdec_h264_req_common.h"
+
+/**
+ * enum vdec_h264_core_dec_err_type  - core decode error type
+ * @TRANS_BUFFER_FULL : trans buffer is full
+ * @SLICE_HEADER_FULL : slice header buffer is full
+ */
+enum vdec_h264_core_dec_err_type {
+	TRANS_BUFFER_FULL = 1,
+	SLICE_HEADER_FULL,
+};
+
+/**
+ * struct vdec_h264_slice_lat_dec_param  - parameters for decode current frame
+ * @sps : h264 sps syntax parameters
+ * @pps : h264 pps syntax parameters
+ * @slice_header: h264 slice header syntax parameters
+ * @scaling_matrix : h264 scaling list parameters
+ * @decode_params : decoder parameters of each frame used for hardware decode
+ * @h264_dpb_info : dpb reference list
+ */
+struct vdec_h264_slice_lat_dec_param {
+	struct mtk_h264_sps_param sps;
+	struct mtk_h264_pps_param pps;
+	struct mtk_h264_slice_hd_param slice_header;
+	struct slice_api_h264_scaling_matrix scaling_matrix;
+	struct slice_api_h264_decode_param decode_params;
+	struct mtk_h264_dpb_info h264_dpb_info[V4L2_H264_NUM_DPB_ENTRIES];
+};
+
+/**
+ * struct vdec_h264_slice_info - decode information
+ * @nal_info    : nal info of current picture
+ * @timeout     : Decode timeout: 1 timeout, 0 no timeout
+ * @bs_buf_size : bitstream size
+ * @bs_buf_addr : bitstream buffer dma address
+ * @y_fb_dma    : Y frame buffer dma address
+ * @c_fb_dma    : C frame buffer dma address
+ * @vdec_fb_va  : VDEC frame buffer struct virtual address
+ * @crc         : Used to check whether hardware's status is right
+ */
+struct vdec_h264_slice_info {
+	u16 nal_info;
+	u16 timeout;
+	u32 bs_buf_size;
+	u64 bs_buf_addr;
+	u64 y_fb_dma;
+	u64 c_fb_dma;
+	u64 vdec_fb_va;
+	u32 crc[8];
+};
+
+/**
+ * struct vdec_h264_slice_vsi - shared memory for decode information exchange
+ *        between VPU and Host. The memory is allocated by VPU then mapped to
+ *        Host in vdec_h264_slice_init() and freed in vdec_h264_slice_deinit()
+ *        by VPU. AP-W/R : AP is writer/reader on this item. VPU-W/R: VPU is
+ *        writer/reader on this item.
+ * @wdma_err_addr       : wdma error dma address
+ * @wdma_start_addr     : wdma start dma address
+ * @wdma_end_addr       : wdma end dma address
+ * @slice_bc_start_addr : slice bc start dma address
+ * @slice_bc_end_addr   : slice bc end dma address
+ * @row_info_start_addr : row info start dma address
+ * @row_info_end_addr   : row info end dma address
+ * @trans_start         : trans start dma address
+ * @trans_end           : trans end dma address
+ * @wdma_end_addr_offset: wdma end address offset
+ *
+ * @mv_buf_dma          : HW working motion vector buffer
+ *                        dma address (AP-W, VPU-R)
+ * @dec                 : decode information (AP-R, VPU-W)
+ * @h264_slice_params   : decode parameters used by the hardware
+ */
+struct vdec_h264_slice_vsi {
+	/* LAT dec addr */
+	u64 wdma_err_addr;
+	u64 wdma_start_addr;
+	u64 wdma_end_addr;
+	u64 slice_bc_start_addr;
+	u64 slice_bc_end_addr;
+	u64 row_info_start_addr;
+	u64 row_info_end_addr;
+	u64 trans_start;
+	u64 trans_end;
+	u64 wdma_end_addr_offset;
+
+	u64 mv_buf_dma[H264_MAX_MV_NUM];
+	struct vdec_h264_slice_info dec;
+	struct vdec_h264_slice_lat_dec_param h264_slice_params;
+};
+
+/**
+ * struct vdec_h264_slice_share_info - shared information used to exchange
+ *                                     message between lat and core
+ * @sps	              : sequence header information from user space
+ * @dec_params        : decoder params from user space
+ * @h264_slice_params : decoder params used for hardware
+ * @trans_start       : trans start dma address
+ * @trans_end         : trans end dma address
+ * @nal_info          : nal info of current picture
+ */
+struct vdec_h264_slice_share_info {
+	struct v4l2_ctrl_h264_sps sps;
+	struct v4l2_ctrl_h264_decode_params dec_params;
+	struct vdec_h264_slice_lat_dec_param h264_slice_params;
+	u64 trans_start;
+	u64 trans_end;
+	u16 nal_info;
+};
+
+/**
+ * struct vdec_h264_slice_inst - h264 decoder instance
+ * @slice_dec_num       : number of pictures decoded
+ * @ctx                 : pointer to mtk_vcodec_ctx
+ * @pred_buf            : HW working prediction buffer
+ * @mv_buf              : HW working motion vector buffer
+ * @vpu                 : VPU instance
+ * @vsi                 : vsi used for lat
+ * @vsi_core            : vsi used for core
+ *
+ * @resolution_changed  : resolution changed
+ * @realloc_mv_buf      : reallocate mv buffer
+ * @cap_num_planes      : number of capture queue planes
+ *
+ * @dpb : decoded picture buffer used to store reference buffer information
+ */
+struct vdec_h264_slice_inst {
+	unsigned int slice_dec_num;
+	struct mtk_vcodec_ctx *ctx;
+	struct mtk_vcodec_mem pred_buf;
+	struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
+	struct vdec_vpu_inst vpu;
+	struct vdec_h264_slice_vsi *vsi;
+	struct vdec_h264_slice_vsi *vsi_core;
+
+	unsigned int resolution_changed;
+	unsigned int realloc_mv_buf;
+	unsigned int cap_num_planes;
+
+	struct v4l2_h264_dpb_entry dpb[16];
+};
+
+static int vdec_h264_slice_fill_decode_parameters(struct vdec_h264_slice_inst *inst,
+						  struct vdec_h264_slice_share_info *share_info)
+{
+	struct vdec_h264_slice_lat_dec_param *slice_param = &inst->vsi->h264_slice_params;
+	const struct v4l2_ctrl_h264_decode_params *dec_params;
+	const struct v4l2_ctrl_h264_scaling_matrix *src_matrix;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
+
+	dec_params =
+		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
+	if (IS_ERR(dec_params))
+		return PTR_ERR(dec_params);
+
+	src_matrix =
+		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
+	if (IS_ERR(src_matrix))
+		return PTR_ERR(src_matrix);
+
+	sps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
+	if (IS_ERR(sps))
+		return PTR_ERR(sps);
+
+	pps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
+	if (IS_ERR(pps))
+		return PTR_ERR(pps);
+
+	if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) {
+		mtk_vcodec_err(inst, "h264 no support field bitstream.");
+		return -EINVAL;
+	}
+
+	mtk_vdec_h264_copy_sps_params(&slice_param->sps, sps);
+	mtk_vdec_h264_copy_pps_params(&slice_param->pps, pps);
+	mtk_vdec_h264_copy_scaling_matrix(&slice_param->scaling_matrix, src_matrix);
+
+	memcpy(&share_info->sps, sps, sizeof(*sps));
+	memcpy(&share_info->dec_params, dec_params, sizeof(*dec_params));
+
+	return 0;
+}
+
+static void vdec_h264_slice_fill_decode_reflist(struct vdec_h264_slice_inst *inst,
+						struct vdec_h264_slice_lat_dec_param *slice_param,
+						struct vdec_h264_slice_share_info *share_info)
+{
+	struct v4l2_ctrl_h264_decode_params *dec_params = &share_info->dec_params;
+	struct v4l2_ctrl_h264_sps *sps = &share_info->sps;
+	struct v4l2_h264_reflist_builder reflist_builder;
+	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
+	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
+	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
+
+	mtk_vdec_h264_update_dpb(dec_params, inst->dpb);
+
+	mtk_vdec_h264_copy_decode_params(&slice_param->decode_params, dec_params,
+					 inst->dpb);
+	mtk_vdec_h264_fill_dpb_info(inst->ctx, &slice_param->decode_params,
+				    slice_param->h264_dpb_info);
+
+	mtk_v4l2_debug(3, "cur poc = %d\n", dec_params->bottom_field_order_cnt);
+	/* Build the reference lists */
+	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
+				       inst->dpb);
+	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
+	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
+
+	/* Adapt the built lists to the firmware's expectations */
+	mtk_vdec_h264_fixup_ref_list(p0_reflist, reflist_builder.num_valid);
+	mtk_vdec_h264_fixup_ref_list(b0_reflist, reflist_builder.num_valid);
+	mtk_vdec_h264_fixup_ref_list(b1_reflist, reflist_builder.num_valid);
+}
+
+static int vdec_h264_slice_alloc_mv_buf(struct vdec_h264_slice_inst *inst,
+					struct vdec_pic_info *pic)
+{
+	unsigned int buf_sz = mtk_vdec_h264_get_mv_buf_size(pic->buf_w, pic->buf_h);
+	struct mtk_vcodec_mem *mem;
+	int i, err;
+
+	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+		mem->size = buf_sz;
+		err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+		if (err) {
+			mtk_vcodec_err(inst, "failed to allocate mv buf");
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static void vdec_h264_slice_free_mv_buf(struct vdec_h264_slice_inst *inst)
+{
+	int i;
+	struct mtk_vcodec_mem *mem;
+
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		if (mem->va)
+			mtk_vcodec_mem_free(inst->ctx, mem);
+	}
+}
+
+static void vdec_h264_slice_get_pic_info(struct vdec_h264_slice_inst *inst)
+{
+	struct mtk_vcodec_ctx *ctx = inst->ctx;
+	unsigned int data[3];
+
+	data[0] = ctx->picinfo.pic_w;
+	data[1] = ctx->picinfo.pic_h;
+	data[2] = ctx->capture_fourcc;
+	vpu_dec_get_param(&inst->vpu, data, 3, GET_PARAM_PIC_INFO);
+
+	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
+	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);
+	ctx->picinfo.fb_sz[0] = inst->vpu.fb_sz[0];
+	ctx->picinfo.fb_sz[1] = inst->vpu.fb_sz[1];
+	inst->cap_num_planes =
+		ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
+
+	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
+			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
+	mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
+			 ctx->picinfo.fb_sz[1]);
+
+	if (ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w ||
+	    ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h) {
+		inst->resolution_changed = true;
+		if (ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w ||
+		    ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h)
+			inst->realloc_mv_buf = true;
+
+		mtk_v4l2_debug(1, "resChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
+			       inst->resolution_changed,
+			       inst->realloc_mv_buf,
+			       ctx->last_decoded_picinfo.pic_w,
+			       ctx->last_decoded_picinfo.pic_h,
+			       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
+	}
+}
+
+static void vdec_h264_slice_get_crop_info(struct vdec_h264_slice_inst *inst,
+					  struct v4l2_rect *cr)
+{
+	cr->left = 0;
+	cr->top = 0;
+	cr->width = inst->ctx->picinfo.pic_w;
+	cr->height = inst->ctx->picinfo.pic_h;
+
+	mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
+			 cr->left, cr->top, cr->width, cr->height);
+}
+
+static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_h264_slice_inst *inst;
+	int err, vsi_size;
+
+	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+
+	inst->ctx = ctx;
+
+	inst->vpu.id = SCP_IPI_VDEC_LAT;
+	inst->vpu.core_id = SCP_IPI_VDEC_CORE;
+	inst->vpu.ctx = ctx;
+	inst->vpu.codec_type = ctx->current_codec;
+	inst->vpu.capture_type = ctx->capture_fourcc;
+
+	err = vpu_dec_init(&inst->vpu);
+	if (err) {
+		mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
+		goto error_free_inst;
+	}
+
+	vsi_size = round_up(sizeof(struct vdec_h264_slice_vsi), 64);
+	inst->vsi = inst->vpu.vsi;
+	inst->vsi_core =
+		(struct vdec_h264_slice_vsi *)(((char *)inst->vpu.vsi) + vsi_size);
+	inst->resolution_changed = true;
+	inst->realloc_mv_buf = true;
+
+	mtk_vcodec_debug(inst, "lat struct size = %d,%d,%d,%d vsi: %d\n",
+			 (int)sizeof(struct mtk_h264_sps_param),
+			 (int)sizeof(struct mtk_h264_pps_param),
+			 (int)sizeof(struct vdec_h264_slice_lat_dec_param),
+			 (int)sizeof(struct mtk_h264_dpb_info),
+			 vsi_size);
+	mtk_vcodec_debug(inst, "lat H264 instance >> %p, codec_type = 0x%x",
+			 inst, inst->vpu.codec_type);
+
+	ctx->drv_handle = inst;
+	return 0;
+
+error_free_inst:
+	kfree(inst);
+	return err;
+}
+
+static void vdec_h264_slice_deinit(void *h_vdec)
+{
+	struct vdec_h264_slice_inst *inst = h_vdec;
+
+	mtk_vcodec_debug_enter(inst);
+
+	vpu_dec_deinit(&inst->vpu);
+	vdec_h264_slice_free_mv_buf(inst);
+	vdec_msg_queue_deinit(&inst->ctx->msg_queue, inst->ctx);
+
+	kfree(inst);
+}
+
+static int vdec_h264_slice_core_decode(struct vdec_lat_buf *lat_buf)
+{
+	struct vdec_fb *fb;
+	u64 vdec_fb_va;
+	u64 y_fb_dma, c_fb_dma;
+	int err, timeout, i;
+	struct mtk_vcodec_ctx *ctx = lat_buf->ctx;
+	struct vdec_h264_slice_inst *inst = ctx->drv_handle;
+	struct vb2_v4l2_buffer *vb2_v4l2;
+	struct vdec_h264_slice_share_info *share_info = lat_buf->private_data;
+	struct mtk_vcodec_mem *mem;
+	struct vdec_vpu_inst *vpu = &inst->vpu;
+
+	mtk_vcodec_debug(inst, "[h264-core] vdec_h264 core decode");
+	memcpy_toio(&inst->vsi_core->h264_slice_params, &share_info->h264_slice_params,
+		    sizeof(share_info->h264_slice_params));
+
+	fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
+	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
+	vdec_fb_va = (unsigned long)fb;
+
+	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
+		c_fb_dma =
+			y_fb_dma + inst->ctx->picinfo.buf_w * inst->ctx->picinfo.buf_h;
+	else
+		c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
+
+	mtk_vcodec_debug(inst, "[h264-core] y/c addr = 0x%llx 0x%llx", y_fb_dma,
+			 c_fb_dma);
+
+	inst->vsi_core->dec.y_fb_dma = y_fb_dma;
+	inst->vsi_core->dec.c_fb_dma = c_fb_dma;
+	inst->vsi_core->dec.vdec_fb_va = vdec_fb_va;
+	inst->vsi_core->dec.nal_info = share_info->nal_info;
+	inst->vsi_core->wdma_start_addr =
+		lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
+	inst->vsi_core->wdma_end_addr =
+		lat_buf->ctx->msg_queue.wdma_addr.dma_addr +
+		lat_buf->ctx->msg_queue.wdma_addr.size;
+	inst->vsi_core->wdma_err_addr = lat_buf->wdma_err_addr.dma_addr;
+	inst->vsi_core->slice_bc_start_addr = lat_buf->slice_bc_addr.dma_addr;
+	inst->vsi_core->slice_bc_end_addr = lat_buf->slice_bc_addr.dma_addr +
+		lat_buf->slice_bc_addr.size;
+	inst->vsi_core->trans_start = share_info->trans_start;
+	inst->vsi_core->trans_end = share_info->trans_end;
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		inst->vsi_core->mv_buf_dma[i] = mem->dma_addr;
+	}
+
+	vb2_v4l2 = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
+	vb2_v4l2->vb2_buf.timestamp = lat_buf->ts_info.vb2_buf.timestamp;
+	vb2_v4l2->timecode = lat_buf->ts_info.timecode;
+	vb2_v4l2->field = lat_buf->ts_info.field;
+	vb2_v4l2->flags = lat_buf->ts_info.flags;
+	vb2_v4l2->vb2_buf.copied_timestamp =
+		lat_buf->ts_info.vb2_buf.copied_timestamp;
+
+	vdec_h264_slice_fill_decode_reflist(inst, &inst->vsi_core->h264_slice_params,
+					    share_info);
+
+	err = vpu_dec_core(vpu);
+	if (err) {
+		mtk_vcodec_err(inst, "core decode err=%d", err);
+		goto vdec_dec_end;
+	}
+
+	/* wait decoder done interrupt */
+	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
+					       WAIT_INTR_TIMEOUT_MS, MTK_VDEC_CORE);
+	if (timeout)
+		mtk_vcodec_err(inst, "core decode timeout: pic_%d",
+			       ctx->decoded_frame_cnt);
+	inst->vsi_core->dec.timeout = !!timeout;
+
+	vpu_dec_core_end(vpu);
+	mtk_vcodec_debug(inst, "pic[%d] crc: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x",
+			 ctx->decoded_frame_cnt,
+			 inst->vsi_core->dec.crc[0], inst->vsi_core->dec.crc[1],
+			 inst->vsi_core->dec.crc[2], inst->vsi_core->dec.crc[3],
+			 inst->vsi_core->dec.crc[4], inst->vsi_core->dec.crc[5],
+			 inst->vsi_core->dec.crc[6], inst->vsi_core->dec.crc[7]);
+
+vdec_dec_end:
+	vdec_msg_queue_update_ube_rptr(&lat_buf->ctx->msg_queue,
+				       share_info->trans_end);
+	ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, !!err);
+	mtk_vcodec_debug(inst, "core decode done err=%d", err);
+	ctx->decoded_frame_cnt++;
+	return 0;
+}
+
+static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
+				  struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_h264_slice_inst *inst = h_vdec;
+	struct vdec_vpu_inst *vpu = &inst->vpu;
+	struct mtk_video_dec_buf *src_buf_info;
+	int nal_start_idx, err, timeout = 0, i;
+	unsigned int data[2];
+	struct vdec_lat_buf *lat_buf;
+	struct vdec_h264_slice_share_info *share_info;
+	unsigned char *buf;
+	struct mtk_vcodec_mem *mem;
+
+	if (vdec_msg_queue_init(&inst->ctx->msg_queue, inst->ctx,
+				vdec_h264_slice_core_decode,
+				sizeof(*share_info)))
+		return -ENOMEM;
+
+	/* bs NULL means flush decoder */
+	if (!bs) {
+		vdec_msg_queue_wait_lat_buf_full(&inst->ctx->msg_queue);
+		return vpu_dec_reset(vpu);
+	}
+
+	lat_buf = vdec_msg_queue_dqbuf(&inst->ctx->msg_queue.lat_ctx);
+	if (!lat_buf) {
+		mtk_vcodec_err(inst, "failed to get lat buffer");
+		return -EINVAL;
+	}
+	share_info = lat_buf->private_data;
+	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+
+	buf = (unsigned char *)bs->va;
+	nal_start_idx = mtk_vdec_h264_find_start_code(buf, bs->size);
+	if (nal_start_idx < 0) {
+		err = -EINVAL;
+		goto err_free_fb_out;
+	}
+
+	inst->vsi->dec.nal_info = buf[nal_start_idx];
+	inst->vsi->dec.bs_buf_addr = (u64)bs->dma_addr;
+	inst->vsi->dec.bs_buf_size = bs->size;
+
+	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
+				   &lat_buf->ts_info, true);
+
+	err = vdec_h264_slice_fill_decode_parameters(inst, share_info);
+	if (err)
+		goto err_free_fb_out;
+
+	*res_chg = inst->resolution_changed;
+	if (inst->resolution_changed) {
+		mtk_vcodec_debug(inst, "- resolution changed -");
+		if (inst->realloc_mv_buf) {
+			err = vdec_h264_slice_alloc_mv_buf(inst, &inst->ctx->picinfo);
+			inst->realloc_mv_buf = false;
+			if (err)
+				goto err_free_fb_out;
+		}
+		inst->resolution_changed = false;
+	}
+	for (i = 0; i < H264_MAX_MV_NUM; i++) {
+		mem = &inst->mv_buf[i];
+		inst->vsi->mv_buf_dma[i] = mem->dma_addr;
+	}
+	inst->vsi->wdma_start_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
+	inst->vsi->wdma_end_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr +
+		lat_buf->ctx->msg_queue.wdma_addr.size;
+	inst->vsi->wdma_err_addr = lat_buf->wdma_err_addr.dma_addr;
+	inst->vsi->slice_bc_start_addr = lat_buf->slice_bc_addr.dma_addr;
+	inst->vsi->slice_bc_end_addr = lat_buf->slice_bc_addr.dma_addr +
+		lat_buf->slice_bc_addr.size;
+
+	inst->vsi->trans_end = inst->ctx->msg_queue.wdma_rptr_addr;
+	inst->vsi->trans_start = inst->ctx->msg_queue.wdma_wptr_addr;
+	mtk_vcodec_debug(inst, "lat:trans(0x%llx 0x%llx)err:0x%llx",
+			 inst->vsi->wdma_start_addr,
+			 inst->vsi->wdma_end_addr,
+			 inst->vsi->wdma_err_addr);
+
+	mtk_vcodec_debug(inst, "slice(0x%llx 0x%llx) rptr(0x%llx 0x%llx)",
+			 inst->vsi->slice_bc_start_addr,
+			 inst->vsi->slice_bc_end_addr,
+			 inst->vsi->trans_start,
+			 inst->vsi->trans_end);
+	err = vpu_dec_start(vpu, data, 2);
+	if (err) {
+		mtk_vcodec_debug(inst, "lat decode err: %d", err);
+		goto err_free_fb_out;
+	}
+
+	/* wait decoder done interrupt */
+	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
+					       WAIT_INTR_TIMEOUT_MS, MTK_VDEC_LAT0);
+	inst->vsi->dec.timeout = !!timeout;
+
+	err = vpu_dec_end(vpu);
+	if (err == SLICE_HEADER_FULL || timeout || err == TRANS_BUFFER_FULL) {
+		err = -EINVAL;
+		goto err_free_fb_out;
+	}
+
+	share_info->trans_end = inst->ctx->msg_queue.wdma_addr.dma_addr +
+		inst->vsi->wdma_end_addr_offset;
+	share_info->trans_start = inst->ctx->msg_queue.wdma_wptr_addr;
+	share_info->nal_info = inst->vsi->dec.nal_info;
+	vdec_msg_queue_update_ube_wptr(&lat_buf->ctx->msg_queue,
+				       share_info->trans_end);
+
+	memcpy_fromio(&share_info->h264_slice_params, &inst->vsi->h264_slice_params,
+		      sizeof(share_info->h264_slice_params));
+	vdec_msg_queue_qbuf(&inst->ctx->dev->msg_queue_core_ctx, lat_buf);
+
+	inst->slice_dec_num++;
+	return 0;
+
+err_free_fb_out:
+	mtk_vcodec_err(inst, "slice dec number: %d err: %d", inst->slice_dec_num, err);
+	return err;
+}
+
+static int vdec_h264_slice_get_param(void *h_vdec, enum vdec_get_param_type type,
+				     void *out)
+{
+	struct vdec_h264_slice_inst *inst = h_vdec;
+
+	switch (type) {
+	case GET_PARAM_PIC_INFO:
+		vdec_h264_slice_get_pic_info(inst);
+		break;
+	case GET_PARAM_DPB_SIZE:
+		*(unsigned int *)out = 6;
+		break;
+	case GET_PARAM_CROP_INFO:
+		vdec_h264_slice_get_crop_info(inst, out);
+		break;
+	default:
+		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+const struct vdec_common_if vdec_h264_slice_lat_if = {
+	.init		= vdec_h264_slice_init,
+	.decode		= vdec_h264_slice_decode,
+	.get_param	= vdec_h264_slice_get_param,
+	.deinit		= vdec_h264_slice_deinit,
+};
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index c93dd0ea3537..c17a7815e1bb 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -20,7 +20,13 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 
 	switch (fourcc) {
 	case V4L2_PIX_FMT_H264_SLICE:
-		ctx->dec_if = &vdec_h264_slice_if;
+		if (ctx->dev->vdec_pdata->hw_arch == MTK_VDEC_PURE_SINGLE_CORE) {
+			ctx->dec_if = &vdec_h264_slice_if;
+			ctx->hw_id = MTK_VDEC_CORE;
+		} else {
+			ctx->dec_if = &vdec_h264_slice_lat_if;
+			ctx->hw_id = MTK_VDEC_LAT0;
+		}
 		break;
 	case V4L2_PIX_FMT_H264:
 		ctx->dec_if = &vdec_h264_if;
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index d467e8af4a84..6ce848e74167 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -56,6 +56,7 @@ struct vdec_fb_node {
 
 extern const struct vdec_common_if vdec_h264_if;
 extern const struct vdec_common_if vdec_h264_slice_if;
+extern const struct vdec_common_if vdec_h264_slice_lat_if;
 extern const struct vdec_common_if vdec_vp8_if;
 extern const struct vdec_common_if vdec_vp9_if;
 
diff --git a/include/linux/remoteproc/mtk_scp.h b/include/linux/remoteproc/mtk_scp.h
index b47416f7aeb8..7c2b7cc9fe6c 100644
--- a/include/linux/remoteproc/mtk_scp.h
+++ b/include/linux/remoteproc/mtk_scp.h
@@ -41,6 +41,8 @@ enum scp_ipi_id {
 	SCP_IPI_ISP_FRAME,
 	SCP_IPI_FD_CMD,
 	SCP_IPI_CROS_HOST_CMD,
+	SCP_IPI_VDEC_LAT,
+	SCP_IPI_VDEC_CORE,
 	SCP_IPI_NS_SERVICE = 0xFF,
 	SCP_IPI_MAX = 0x100,
 };
-- 
2.25.1
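
Reviewer note on the capture buffer geometry in this patch: vdec_h264_slice_get_pic_info() aligns the picture size up to 64 in both dimensions, and vdec_h264_slice_core_decode() places the chroma plane directly after the luma plane when the capture format has a single plane. The following is a minimal user-space sketch of that arithmetic, not driver code. `align_up()`, `struct fb_layout` and `single_plane_layout()` are hypothetical names, and the multiplication is widened to 64 bits here, which the patch itself does not do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round x up to a multiple of a. The alignment must be a power of two,
 * the same contract as the kernel's ALIGN() macro.
 */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical container for the values the decoder hands to the firmware. */
struct fb_layout {
	uint32_t buf_w;  /* aligned buffer width */
	uint32_t buf_h;  /* aligned buffer height */
	uint64_t y_dma;  /* luma plane address */
	uint64_t c_dma;  /* chroma plane address */
};

/* Single-plane case: chroma starts right after buf_w * buf_h luma bytes. */
static struct fb_layout single_plane_layout(uint64_t y_dma, uint32_t pic_w,
					     uint32_t pic_h)
{
	struct fb_layout l;

	l.buf_w = align_up(pic_w, 64);
	l.buf_h = align_up(pic_h, 64);
	l.y_dma = y_dma;
	l.c_dma = y_dma + (uint64_t)l.buf_w * l.buf_h;
	return l;
}
```

For a 1920x1080 stream this gives a 1920x1088 buffer, so the chroma plane starts 1920 * 1088 bytes after the luma base address.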



* [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (12 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192 Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-03-01 22:15   ` Nicolas Dufresne
  2022-02-23  3:40 ` [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding Yunfei Dong
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Add support for VP8 decoding using the stateless API,
as supported by MT8192.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
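Note for reviewers: vdec_vp8_slice_get_decode_parameters() walks the three VP8 reference slots (last, golden, alt), looks up each timestamp in the capture queue, and clears reference_flag when no buffer matches. The following is a minimal user-space sketch of that mapping, assuming a flat timestamp table in place of the vb2 queue. `struct frame_refs`, `struct dpb_slot`, `find_timestamp()` and `resolve_refs()` are illustrative stand-ins, not driver or V4L2 APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three reference timestamps of a VP8 frame. */
struct frame_refs {
	uint64_t last_ts;
	uint64_t golden_ts;
	uint64_t alt_ts;
};

/* Hypothetical stand-in for one vp8_dpb_info entry. */
struct dpb_slot {
	int buf_index;      /* index of the matching capture buffer, -1 if none */
	int reference_flag; /* 1 when a buffer was found, 0 otherwise */
};

/* Slot order used by the patch: 0 = last, 1 = golden, 2 = alt. */
static uint64_t ref_ts_by_index(const struct frame_refs *refs, int index)
{
	assert(index >= 0 && index < 3);
	switch (index) {
	case 0:
		return refs->last_ts;
	case 1:
		return refs->golden_ts;
	default:
		return refs->alt_ts;
	}
}

/* Linear search standing in for the timestamp lookup; -1 on a miss. */
static int find_timestamp(const uint64_t *buf_ts, size_t n, uint64_t ts)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (buf_ts[i] == ts)
			return (int)i;
	return -1;
}

/* Fill all three slots and return how many references were resolved. */
static int resolve_refs(const struct frame_refs *refs, const uint64_t *buf_ts,
			size_t n, struct dpb_slot slots[3])
{
	int index, valid = 0;

	for (index = 0; index < 3; index++) {
		int found = find_timestamp(buf_ts, n,
					   ref_ts_by_index(refs, index));

		slots[index].buf_index = found;
		slots[index].reference_flag = found >= 0;
		valid += slots[index].reference_flag;
	}
	return valid;
}
```

A key frame is expected to resolve zero references, which is why the patch only logs an error for a miss on non-key frames.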
 drivers/media/platform/mtk-vcodec/Makefile    |   1 +
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |  24 +-
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   1 +
 .../mtk-vcodec/vdec/vdec_vp8_req_if.c         | 445 ++++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |   4 +
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
 6 files changed, 474 insertions(+), 2 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index 22edb1c86598..b457daf2d196 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
 
 mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp8_if.o \
+		vdec/vdec_vp8_req_if.o \
 		vdec/vdec_vp9_if.o \
 		vdec/vdec_h264_req_if.o \
 		vdec/vdec_h264_req_common.o \
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 9333e3418b98..2a0164ddc708 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -76,13 +76,28 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
 			.max = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
 		},
 		.codec_type = V4L2_PIX_FMT_H264_SLICE,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_VP8_FRAME,
+		},
+		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_VP8_PROFILE,
+			.min = V4L2_MPEG_VIDEO_VP8_PROFILE_0,
+			.def = V4L2_MPEG_VIDEO_VP8_PROFILE_0,
+			.max = V4L2_MPEG_VIDEO_VP8_PROFILE_3,
+		},
+		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
 	}
 };
 
 #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
 
-static struct mtk_video_fmt mtk_video_formats[3];
-static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
+static struct mtk_video_fmt mtk_video_formats[4];
+static struct mtk_codec_framesizes mtk_vdec_framesizes[2];
 
 static struct mtk_video_fmt default_out_format;
 static struct mtk_video_fmt default_cap_format;
@@ -350,6 +365,7 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
 
 	switch (fourcc) {
 	case V4L2_PIX_FMT_H264_SLICE:
+	case V4L2_PIX_FMT_VP8_FRAME:
 		mtk_video_formats[count_formats].fourcc = fourcc;
 		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
 		mtk_video_formats[count_formats].num_planes = 1;
@@ -393,6 +409,10 @@ static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
 		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
 		out_format_count++;
 	}
+	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_VP8_FRAME) {
+		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP8_FRAME, ctx);
+		out_format_count++;
+	}
 
 	if (cap_format_count)
 		default_cap_format = mtk_video_formats[cap_format_count - 1];
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index d60561065656..c68297db225e 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -354,6 +354,7 @@ enum mtk_vdec_format_types {
 	MTK_VDEC_FORMAT_MM21 = 0x20,
 	MTK_VDEC_FORMAT_MT21C = 0x40,
 	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
+	MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
 };
 
 /**
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
new file mode 100644
index 000000000000..6bd4f2365826
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
@@ -0,0 +1,445 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ * Author: Yunfei Dong <yunfei.dong@mediatek.com>
+ */
+
+#include <linux/slab.h>
+#include <media/v4l2-mem2mem.h>
+#include <media/videobuf2-dma-contig.h>
+#include <uapi/linux/v4l2-controls.h>
+
+#include "../mtk_vcodec_util.h"
+#include "../mtk_vcodec_dec.h"
+#include "../mtk_vcodec_intr.h"
+#include "../vdec_drv_base.h"
+#include "../vdec_drv_if.h"
+#include "../vdec_vpu_if.h"
+
+/* Decoded picture buffer size (3 reference frames plus current frame) */
+#define VP8_DPB_SIZE 4
+
+/* HW working buffer size (bytes) */
+#define VP8_SEG_ID_SZ   SZ_256K
+#define VP8_PP_WRAPY_SZ SZ_64K
+#define VP8_PP_WRAPC_SZ SZ_64K
+#define VP8_VLD_PRED_SZ SZ_64K
+
+/**
+ * struct vdec_vp8_slice_info - decode misc information
+ * @vld_wrapper_dma   : vld wrapper dma address
+ * @seg_id_buf_dma    : seg id dma address
+ * @wrap_y_dma        : wrap y dma address
+ * @wrap_c_dma        : wrap c dma address
+ * @cur_y_fb_dma      : current plane Y frame buffer dma address
+ * @cur_c_fb_dma      : current plane C frame buffer dma address
+ * @bs_dma            : bitstream dma address
+ * @bs_sz             : bitstream size
+ * @resolution_changed: resolution change flag: 1 - changed, 0 - not changed
+ * @frame_header_type : current frame header type
+ * @wait_key_frame    : wait key frame coming
+ * @crc               : used to check whether hardware's status is right
+ * @reserved          : reserved, currently unused
+ */
+struct vdec_vp8_slice_info {
+	u64 vld_wrapper_dma;
+	u64 seg_id_buf_dma;
+	u64 wrap_y_dma;
+	u64 wrap_c_dma;
+	u64 cur_y_fb_dma;
+	u64 cur_c_fb_dma;
+	u64 bs_dma;
+	u32 bs_sz;
+	u32 resolution_changed;
+	u32 frame_header_type;
+	u32 crc[8];
+	u32 reserved;
+};
+
+/**
+ * struct vdec_vp8_slice_dpb_info  - vp8 reference information
+ * @y_dma_addr    : Y plane DMA address of the reference frame
+ * @c_dma_addr    : CbCr plane DMA address of the reference frame
+ * @reference_flag: reference picture flag
+ * @reserved      : padding for 64-bit alignment
+ */
+struct vdec_vp8_slice_dpb_info {
+	dma_addr_t y_dma_addr;
+	dma_addr_t c_dma_addr;
+	int reference_flag;
+	int reserved;
+};
+
+/**
+ * struct vdec_vp8_slice_vsi - VPU shared information
+ * @dec          : decoding information
+ * @pic          : picture information
+ * @vp8_dpb_info : reference buffer information
+ */
+struct vdec_vp8_slice_vsi {
+	struct vdec_vp8_slice_info dec;
+	struct vdec_pic_info pic;
+	struct vdec_vp8_slice_dpb_info vp8_dpb_info[3];
+};
+
+/**
+ * struct vdec_vp8_slice_inst - VP8 decoder instance
+ * @seg_id_buf     : seg buffer
+ * @wrap_y_buf     : wrapper y buffer
+ * @wrap_c_buf     : wrapper c buffer
+ * @vld_wrapper_buf: vld wrapper buffer
+ * @ctx            : V4L2 context
+ * @vpu            : VPU instance for decoder
+ * @vsi            : VPU share information
+ */
+struct vdec_vp8_slice_inst {
+	struct mtk_vcodec_mem seg_id_buf;
+	struct mtk_vcodec_mem wrap_y_buf;
+	struct mtk_vcodec_mem wrap_c_buf;
+	struct mtk_vcodec_mem vld_wrapper_buf;
+	struct mtk_vcodec_ctx *ctx;
+	struct vdec_vpu_inst vpu;
+	struct vdec_vp8_slice_vsi *vsi;
+};
+
+static void *vdec_vp8_slice_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
+{
+	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
+
+	if (!ctrl)
+		return ERR_PTR(-EINVAL);
+
+	return ctrl->p_cur.p;
+}
+
+static void vdec_vp8_slice_get_crop_info(struct vdec_vp8_slice_inst *inst,
+					 struct v4l2_rect *cr)
+{
+	cr->left = 0;
+	cr->top = 0;
+	cr->width = inst->vsi->pic.pic_w;
+	cr->height = inst->vsi->pic.pic_h;
+	mtk_vcodec_debug(inst, "get crop info l=%d, t=%d, w=%d, h=%d",
+			 cr->left, cr->top, cr->width, cr->height);
+}
+
+static void vdec_vp8_slice_get_pic_info(struct vdec_vp8_slice_inst *inst)
+{
+	struct mtk_vcodec_ctx *ctx = inst->ctx;
+	unsigned int data[3];
+
+	data[0] = ctx->picinfo.pic_w;
+	data[1] = ctx->picinfo.pic_h;
+	data[2] = ctx->capture_fourcc;
+	vpu_dec_get_param(&inst->vpu, data, 3, GET_PARAM_PIC_INFO);
+
+	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
+	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);
+	ctx->picinfo.fb_sz[0] = inst->vpu.fb_sz[0];
+	ctx->picinfo.fb_sz[1] = inst->vpu.fb_sz[1];
+
+	inst->vsi->pic.pic_w = ctx->picinfo.pic_w;
+	inst->vsi->pic.pic_h = ctx->picinfo.pic_h;
+	inst->vsi->pic.buf_w = ctx->picinfo.buf_w;
+	inst->vsi->pic.buf_h = ctx->picinfo.buf_h;
+	inst->vsi->pic.fb_sz[0] = ctx->picinfo.fb_sz[0];
+	inst->vsi->pic.fb_sz[1] = ctx->picinfo.fb_sz[1];
+	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
+			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
+	mtk_vcodec_debug(inst, "fb size: Y(%d), C(%d)",
+			 ctx->picinfo.fb_sz[0], ctx->picinfo.fb_sz[1]);
+}
+
+static int vdec_vp8_slice_alloc_working_buf(struct vdec_vp8_slice_inst *inst)
+{
+	int err;
+	struct mtk_vcodec_mem *mem;
+
+	mem = &inst->seg_id_buf;
+	mem->size = VP8_SEG_ID_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+	if (err) {
+		mtk_vcodec_err(inst, "cannot allocate seg id buffer");
+		return err;
+	}
+	inst->vsi->dec.seg_id_buf_dma = (u64)mem->dma_addr;
+
+	mem = &inst->wrap_y_buf;
+	mem->size = VP8_PP_WRAPY_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+	if (err) {
+		mtk_vcodec_err(inst, "cannot allocate WRAP Y buffer");
+		return err;
+	}
+	inst->vsi->dec.wrap_y_dma = (u64)mem->dma_addr;
+
+	mem = &inst->wrap_c_buf;
+	mem->size = VP8_PP_WRAPC_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+	if (err) {
+		mtk_vcodec_err(inst, "cannot allocate WRAP C buffer");
+		return err;
+	}
+	inst->vsi->dec.wrap_c_dma = (u64)mem->dma_addr;
+
+	mem = &inst->vld_wrapper_buf;
+	mem->size = VP8_VLD_PRED_SZ;
+	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
+	if (err) {
+		mtk_vcodec_err(inst, "cannot allocate vld wrapper buffer");
+		return err;
+	}
+	inst->vsi->dec.vld_wrapper_dma = (u64)mem->dma_addr;
+
+	return 0;
+}
+
+static void vdec_vp8_slice_free_working_buf(struct vdec_vp8_slice_inst *inst)
+{
+	struct mtk_vcodec_mem *mem;
+
+	mem = &inst->seg_id_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+	inst->vsi->dec.seg_id_buf_dma = 0;
+
+	mem = &inst->wrap_y_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+	inst->vsi->dec.wrap_y_dma = 0;
+
+	mem = &inst->wrap_c_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+	inst->vsi->dec.wrap_c_dma = 0;
+
+	mem = &inst->vld_wrapper_buf;
+	if (mem->va)
+		mtk_vcodec_mem_free(inst->ctx, mem);
+	inst->vsi->dec.vld_wrapper_dma = 0;
+}
+
+static u64 vdec_vp8_slice_get_ref_by_ts(const struct v4l2_ctrl_vp8_frame *frame_header,
+					int index)
+{
+	switch (index) {
+	case 0:
+		return frame_header->last_frame_ts;
+	case 1:
+		return frame_header->golden_frame_ts;
+	case 2:
+		return frame_header->alt_frame_ts;
+	default:
+		break;
+	}
+
+	return -1;
+}
+
+static int vdec_vp8_slice_get_decode_parameters(struct vdec_vp8_slice_inst *inst)
+{
+	const struct v4l2_ctrl_vp8_frame *frame_header;
+	struct mtk_vcodec_ctx *ctx = inst->ctx;
+	struct vb2_queue *vq;
+	struct vb2_buffer *vb;
+	u64 reference_ts;
+	int index, vb2_index;
+
+	frame_header = vdec_vp8_slice_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_VP8_FRAME);
+	if (IS_ERR(frame_header))
+		return PTR_ERR(frame_header);
+
+	vq = v4l2_m2m_get_vq(ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+	for (index = 0; index < 3; index++) {
+		reference_ts = vdec_vp8_slice_get_ref_by_ts(frame_header, index);
+		vb2_index = vb2_find_timestamp(vq, reference_ts, 0);
+		if (vb2_index < 0) {
+			if (!V4L2_VP8_FRAME_IS_KEY_FRAME(frame_header))
+				mtk_vcodec_err(inst, "reference invalid: index(%d) ts(%llu)",
+					       index, reference_ts);
+			inst->vsi->vp8_dpb_info[index].reference_flag = 0;
+			continue;
+		}
+		inst->vsi->vp8_dpb_info[index].reference_flag = 1;
+
+		vb = vq->bufs[vb2_index];
+		inst->vsi->vp8_dpb_info[index].y_dma_addr =
+			vb2_dma_contig_plane_dma_addr(vb, 0);
+		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+			inst->vsi->vp8_dpb_info[index].c_dma_addr =
+				vb2_dma_contig_plane_dma_addr(vb, 1);
+		else
+			inst->vsi->vp8_dpb_info[index].c_dma_addr =
+				inst->vsi->vp8_dpb_info[index].y_dma_addr +
+				ctx->picinfo.fb_sz[0];
+	}
+
+	inst->vsi->dec.frame_header_type = frame_header->flags >> 1;
+
+	return 0;
+}
+
+static int vdec_vp8_slice_init(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_vp8_slice_inst *inst;
+	int err;
+
+	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+
+	inst->ctx = ctx;
+
+	inst->vpu.id = SCP_IPI_VDEC_LAT;
+	inst->vpu.core_id = SCP_IPI_VDEC_CORE;
+	inst->vpu.ctx = ctx;
+	inst->vpu.codec_type = ctx->current_codec;
+	inst->vpu.capture_type = ctx->capture_fourcc;
+
+	err = vpu_dec_init(&inst->vpu);
+	if (err) {
+		mtk_vcodec_err(inst, "vdec_vp8 init err=%d", err);
+		goto error_free_inst;
+	}
+
+	inst->vsi = inst->vpu.vsi;
+	err = vdec_vp8_slice_alloc_working_buf(inst);
+	if (err)
+		goto error_deinit;
+
+	mtk_vcodec_debug(inst, "vp8 struct size = %d vsi: %d\n",
+			 (int)sizeof(struct v4l2_ctrl_vp8_frame),
+			 (int)sizeof(struct vdec_vp8_slice_vsi));
+	mtk_vcodec_debug(inst, "vp8:%p, codec_type = 0x%x vsi: 0x%p",
+			 inst, inst->vpu.codec_type, inst->vpu.vsi);
+
+	ctx->drv_handle = inst;
+	return 0;
+
+error_deinit:
+	vpu_dec_deinit(&inst->vpu);
+error_free_inst:
+	kfree(inst);
+	return err;
+}
+
+static int vdec_vp8_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
+				 struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_vp8_slice_inst *inst = h_vdec;
+	struct vdec_vpu_inst *vpu = &inst->vpu;
+	struct mtk_video_dec_buf *src_buf_info, *dst_buf_info;
+	unsigned int data;
+	u64 y_fb_dma, c_fb_dma;
+	int err, timeout;
+
+	/* Assume no resolution change until the VPU reports one */
+	*res_chg = false;
+
+	/* bs NULL means flush decoder */
+	if (!bs)
+		return vpu_dec_reset(vpu);
+
+	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
+
+	fb = inst->ctx->dev->vdec_pdata->get_cap_buffer(inst->ctx);
+	dst_buf_info = container_of(fb, struct mtk_video_dec_buf, frame_buffer);
+
+	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
+	if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
+		c_fb_dma = y_fb_dma +
+			inst->ctx->picinfo.buf_w * inst->ctx->picinfo.buf_h;
+	else
+		c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
+
+	inst->vsi->dec.bs_dma = (u64)bs->dma_addr;
+	inst->vsi->dec.bs_sz = bs->size;
+	inst->vsi->dec.cur_y_fb_dma = y_fb_dma;
+	inst->vsi->dec.cur_c_fb_dma = c_fb_dma;
+
+	mtk_vcodec_debug(inst, "frame[%d] bs(%zu 0x%llx) y/c(0x%llx 0x%llx)",
+			 inst->ctx->decoded_frame_cnt,
+			 bs->size, (u64)bs->dma_addr,
+			 y_fb_dma, c_fb_dma);
+
+	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
+				   &dst_buf_info->m2m_buf.vb, true);
+
+	err = vdec_vp8_slice_get_decode_parameters(inst);
+	if (err)
+		goto error;
+
+	err = vpu_dec_start(vpu, &data, 1);
+	if (err) {
+		mtk_vcodec_debug(inst, "vp8 dec start err!");
+		goto error;
+	}
+
+	if (inst->vsi->dec.resolution_changed) {
+		mtk_vcodec_debug(inst, "- resolution_changed -");
+		*res_chg = true;
+		return 0;
+	}
+
+	/* wait decode done interrupt */
+	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
+					       50, MTK_VDEC_CORE);
+
+	err = vpu_dec_end(vpu);
+	if (err || timeout)
+		mtk_vcodec_debug(inst, "vp8 dec error timeout:%d err: %d pic_%d",
+				 timeout, err, inst->ctx->decoded_frame_cnt);
+
+	mtk_vcodec_debug(inst, "pic[%d] crc: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x",
+			 inst->ctx->decoded_frame_cnt,
+			 inst->vsi->dec.crc[0], inst->vsi->dec.crc[1],
+			 inst->vsi->dec.crc[2], inst->vsi->dec.crc[3],
+			 inst->vsi->dec.crc[4], inst->vsi->dec.crc[5],
+			 inst->vsi->dec.crc[6], inst->vsi->dec.crc[7]);
+
+	inst->ctx->decoded_frame_cnt++;
+error:
+	inst->ctx->dev->vdec_pdata->cap_to_disp(inst->ctx, fb, !!err);
+	return err;
+}
+
+static int vdec_vp8_slice_get_param(void *h_vdec, enum vdec_get_param_type type, void *out)
+{
+	struct vdec_vp8_slice_inst *inst = h_vdec;
+
+	switch (type) {
+	case GET_PARAM_PIC_INFO:
+		vdec_vp8_slice_get_pic_info(inst);
+		break;
+	case GET_PARAM_CROP_INFO:
+		vdec_vp8_slice_get_crop_info(inst, out);
+		break;
+	case GET_PARAM_DPB_SIZE:
+		*((unsigned int *)out) = VP8_DPB_SIZE;
+		break;
+	default:
+		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void vdec_vp8_slice_deinit(void *h_vdec)
+{
+	struct vdec_vp8_slice_inst *inst = h_vdec;
+
+	mtk_vcodec_debug_enter(inst);
+
+	vpu_dec_deinit(&inst->vpu);
+	vdec_vp8_slice_free_working_buf(inst);
+	kfree(inst);
+}
+
+const struct vdec_common_if vdec_vp8_slice_if = {
+	.init		= vdec_vp8_slice_init,
+	.decode		= vdec_vp8_slice_decode,
+	.get_param	= vdec_vp8_slice_get_param,
+	.deinit		= vdec_vp8_slice_deinit,
+};
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index c17a7815e1bb..9db9a57da2c1 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -32,6 +32,10 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 		ctx->dec_if = &vdec_h264_if;
 		ctx->hw_id = MTK_VDEC_CORE;
 		break;
+	case V4L2_PIX_FMT_VP8_FRAME:
+		ctx->dec_if = &vdec_vp8_slice_if;
+		ctx->hw_id = MTK_VDEC_CORE;
+		break;
 	case V4L2_PIX_FMT_VP8:
 		ctx->dec_if = &vdec_vp8_if;
 		ctx->hw_id = MTK_VDEC_CORE;
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index 6ce848e74167..e3adf8f36342 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -58,6 +58,7 @@ extern const struct vdec_common_if vdec_h264_if;
 extern const struct vdec_common_if vdec_h264_slice_if;
 extern const struct vdec_common_if vdec_h264_slice_lat_if;
 extern const struct vdec_common_if vdec_vp8_if;
+extern const struct vdec_common_if vdec_vp8_slice_if;
 extern const struct vdec_common_if vdec_vp9_if;
 
 /**
-- 
2.25.1



* [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding
  2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
                   ` (13 preceding siblings ...)
  2022-02-23  3:40 ` [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding Yunfei Dong
@ 2022-02-23  3:40 ` Yunfei Dong
  2022-03-01 22:22   ` Nicolas Dufresne
  14 siblings, 1 reply; 36+ messages in thread
From: Yunfei Dong @ 2022-02-23  3:40 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Add support for VP9 decoding using the stateless API, as supported by
MT8192. The driver uses the LAT and core architecture.

Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
Signed-off-by: George Sun <george.sun@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
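For reference, the tile sizing done by vdec_vp9_slice_tile_offset() and
vdec_vp9_slice_setup_tile() in this patch can be sketched in userspace C as
below. This is a minimal sketch under stated assumptions and is not part of
the driver: the names tile_offset() and tile_size_in_sbs() are hypothetical,
and the input checks via assert() are an addition for illustration only.

```c
#include <assert.h>

/*
 * Hypothetical standalone copy of the tile offset arithmetic.
 * mi_num:    frame dimension in 8x8 mode-info (MI) units
 * tile_log2: log2 of the number of tiles along that dimension
 */
static int tile_offset(int idx, int mi_num, int tile_log2)
{
	int sbs;
	int offset;

	assert(idx >= 0 && mi_num >= 0 && tile_log2 >= 0);

	sbs = (mi_num + 7) >> 3;			/* 64x64 superblocks */
	offset = ((idx * sbs) >> tile_log2) << 3;	/* back to MI units */

	return offset < mi_num ? offset : mi_num;
}

/* Size of tile idx in superblocks: (end - start) rounded up to 8 MI units. */
static int tile_size_in_sbs(int idx, int mi_num, int tile_log2)
{
	int start = tile_offset(idx, mi_num, tile_log2);
	int end = tile_offset(idx + 1, mi_num, tile_log2);

	return (end - start + 7) >> 3;
}
```

With a 1920 pixel wide frame (240 MI units, 30 superblocks) split into four
tile columns, the tile boundaries fall on superblock edges and the per-tile
sizes add up to the full superblock count.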
 drivers/media/platform/mtk-vcodec/Makefile    |    1 +
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |   26 +-
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |    1 +
 .../mtk-vcodec/vdec/vdec_vp9_req_lat_if.c     | 1971 +++++++++++++++++
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |    4 +
 .../media/platform/mtk-vcodec/vdec_drv_if.h   |    1 +
 6 files changed, 2001 insertions(+), 3 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c

diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
index b457daf2d196..93e7a343b5b0 100644
--- a/drivers/media/platform/mtk-vcodec/Makefile
+++ b/drivers/media/platform/mtk-vcodec/Makefile
@@ -9,6 +9,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
 		vdec/vdec_vp8_if.o \
 		vdec/vdec_vp8_req_if.o \
 		vdec/vdec_vp9_if.o \
+		vdec/vdec_vp9_req_lat_if.o \
 		vdec/vdec_h264_req_if.o \
 		vdec/vdec_h264_req_common.o \
 		vdec/vdec_h264_req_multi_if.o \
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
index 2a0164ddc708..3770e8117488 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
@@ -91,13 +91,28 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
 			.max = V4L2_MPEG_VIDEO_VP8_PROFILE_3,
 		},
 		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
-	}
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_STATELESS_VP9_FRAME,
+		},
+		.codec_type = V4L2_PIX_FMT_VP9_FRAME,
+	},
+	{
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
+			.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+			.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
+			.max = V4L2_MPEG_VIDEO_VP9_PROFILE_3,
+		},
+		.codec_type = V4L2_PIX_FMT_VP9_FRAME,
+	},
 };
 
 #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
 
-static struct mtk_video_fmt mtk_video_formats[4];
-static struct mtk_codec_framesizes mtk_vdec_framesizes[2];
+static struct mtk_video_fmt mtk_video_formats[5];
+static struct mtk_codec_framesizes mtk_vdec_framesizes[3];
 
 static struct mtk_video_fmt default_out_format;
 static struct mtk_video_fmt default_cap_format;
@@ -366,6 +381,7 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
 	switch (fourcc) {
 	case V4L2_PIX_FMT_H264_SLICE:
 	case V4L2_PIX_FMT_VP8_FRAME:
+	case V4L2_PIX_FMT_VP9_FRAME:
 		mtk_video_formats[count_formats].fourcc = fourcc;
 		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
 		mtk_video_formats[count_formats].num_planes = 1;
@@ -413,6 +429,10 @@ static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
 		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP8_FRAME, ctx);
 		out_format_count++;
 	}
+	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_VP9_FRAME) {
+		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP9_FRAME, ctx);
+		out_format_count++;
+	}
 
 	if (cap_format_count)
 		default_cap_format = mtk_video_formats[cap_format_count - 1];
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index c68297db225e..ea58f11e7659 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -355,6 +355,7 @@ enum mtk_vdec_format_types {
 	MTK_VDEC_FORMAT_MT21C = 0x40,
 	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
 	MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
+	MTK_VDEC_FORMAT_VP9_FRAME = 0x400,
 };
 
 /**
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c
new file mode 100644
index 000000000000..c678170c7ca3
--- /dev/null
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c
@@ -0,0 +1,1971 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ * Author: George Sun <george.sun@mediatek.com>
+ */
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <media/videobuf2-dma-contig.h>
+
+#include "../mtk_vcodec_util.h"
+#include "../mtk_vcodec_dec.h"
+#include "../mtk_vcodec_intr.h"
+#include "../vdec_drv_base.h"
+#include "../vdec_drv_if.h"
+#include "../vdec_vpu_if.h"
+
+/* reset_frame_context defined in VP9 spec */
+#define VP9_RESET_FRAME_CONTEXT_NONE0 0
+#define VP9_RESET_FRAME_CONTEXT_NONE1 1
+#define VP9_RESET_FRAME_CONTEXT_SPEC 2
+#define VP9_RESET_FRAME_CONTEXT_ALL 3
+
+#define VP9_TILE_BUF_SIZE 4096
+#define VP9_PROB_BUF_SIZE 2560
+#define VP9_COUNTS_BUF_SIZE 16384
+
+#define HDR_FLAG(x) (!!((hdr)->flags & V4L2_VP9_FRAME_FLAG_##x))
+#define LF_FLAG(x) (!!((lf)->flags & V4L2_VP9_LOOP_FILTER_FLAG_##x))
+#define SEG_FLAG(x) (!!((seg)->flags & V4L2_VP9_SEGMENTATION_FLAG_##x))
+
+/*
+ * struct vdec_vp9_slice_frame_ctx - vp9 prob tables footprint
+ */
+struct vdec_vp9_slice_frame_ctx {
+	struct {
+		u8 probs[6][3];
+		u8 padding[2];
+	} coef_probs[4][2][2][6];
+
+	u8 y_mode_prob[4][16];
+	u8 switch_interp_prob[4][16];
+	u8 seg[32];  /* ignore */
+	u8 comp_inter_prob[16];
+	u8 comp_ref_prob[16];
+	u8 single_ref_prob[5][2];
+	u8 single_ref_prob_padding[6];
+
+	u8 joint[3];
+	u8 joint_padding[13];
+	struct {
+		u8 sign;
+		u8 classes[10];
+		u8 padding[5];
+	} sign_classes[2];
+	struct {
+		u8 class0[1];
+		u8 bits[10];
+		u8 padding[5];
+	} class0_bits[2];
+	struct {
+		u8 class0_fp[2][3];
+		u8 fp[3];
+		u8 class0_hp;
+		u8 hp;
+		u8 padding[5];
+	} class0_fp_hp[2];
+
+	u8 uv_mode_prob[10][16];
+	u8 uv_mode_prob_padding[2][16];
+
+	u8 partition_prob[16][4];
+
+	u8 inter_mode_probs[7][4];
+	u8 skip_probs[4];
+
+	u8 tx_p8x8[2][4];
+	u8 tx_p16x16[2][4];
+	u8 tx_p32x32[2][4];
+	u8 intra_inter_prob[8];
+};
+
+/*
+ * struct vdec_vp9_slice_frame_counts - vp9 counts tables footprint
+ */
+struct vdec_vp9_slice_frame_counts {
+	union {
+		struct {
+			u32 band_0[3];
+			u32 padding0[1];
+			u32 band_1_5[5][6];
+			u32 padding1[2];
+		} eob_branch[4][2][2];
+		u32 eob_branch_space[256 * 4];
+	};
+
+	struct {
+		u32 band_0[3][4];
+		u32 band_1_5[5][6][4];
+	} coef_probs[4][2][2];
+
+	u32 intra_inter[4][2];
+	u32 comp_inter[5][2];
+	u32 comp_inter_padding[2];
+	u32 comp_ref[5][2];
+	u32 comp_ref_padding[2];
+	u32 single_ref[5][2][2];
+	u32 inter_mode[7][4];
+	u32 y_mode[4][12];
+	u32 uv_mode[10][10];
+	u32 partition[16][4];
+	u32 switchable_interp[4][4];
+
+	u32 tx_p8x8[2][2];
+	u32 tx_p16x16[2][4];
+	u32 tx_p32x32[2][4];
+
+	u32 skip[3][4];
+
+	u32 joint[4];
+
+	struct {
+		u32 sign[2];
+		u32 class0[2];
+		u32 classes[12];
+		u32 bits[10][2];
+		u32 padding[4];
+		u32 class0_fp[2][4];
+		u32 fp[4];
+		u32 class0_hp[2];
+		u32 hp[2];
+	} mvcomp[2];
+
+	u32 reserved[126][4];
+};
+
+/*
+ * struct vdec_vp9_slice_uncompressed_header - vp9 uncompressed header syntax
+ *                                             used for decoding
+ */
+struct vdec_vp9_slice_uncompressed_header {
+	u8 profile;
+	u8 last_frame_type;
+	u8 frame_type;
+
+	u8 last_show_frame;
+	u8 show_frame;
+	u8 error_resilient_mode;
+
+	u8 bit_depth;
+	u8 padding0[1];
+	u16 last_frame_width;
+	u16 last_frame_height;
+	u16 frame_width;
+	u16 frame_height;
+
+	u8 intra_only;
+	u8 reset_frame_context;
+	u8 ref_frame_sign_bias[4];
+	u8 allow_high_precision_mv;
+	u8 interpolation_filter;
+
+	u8 refresh_frame_context;
+	u8 frame_parallel_decoding_mode;
+	u8 frame_context_idx;
+
+	/* loop_filter_params */
+	u8 loop_filter_level;
+	u8 loop_filter_sharpness;
+	u8 loop_filter_delta_enabled;
+	s8 loop_filter_ref_deltas[4];
+	s8 loop_filter_mode_deltas[2];
+
+	/* quantization_params */
+	u8 base_q_idx;
+	s8 delta_q_y_dc;
+	s8 delta_q_uv_dc;
+	s8 delta_q_uv_ac;
+
+	/* segmentation_params */
+	u8 segmentation_enabled;
+	u8 segmentation_update_map;
+	u8 segmentation_tree_probs[7];
+	u8 padding1[1];
+	u8 segmentation_temporal_udpate;
+	u8 segmentation_pred_prob[3];
+	u8 segmentation_update_data;
+	u8 segmentation_abs_or_delta_update;
+	u8 feature_enabled[8];
+	s16 feature_value[8][4];
+
+	/* tile_info */
+	u8 tile_cols_log2;
+	u8 tile_rows_log2;
+	u8 padding2[2];
+
+	u16 uncompressed_header_size;
+	u16 header_size_in_bytes;
+
+	/* LAT OUT, CORE IN */
+	u32 dequant[8][4];
+};
+
+/*
+ * struct vdec_vp9_slice_compressed_header - vp9 compressed header syntax
+ *                                           used for decoding.
+ */
+struct vdec_vp9_slice_compressed_header {
+	u8 tx_mode;
+	u8 ref_mode;
+	u8 comp_fixed_ref;
+	u8 comp_var_ref[2];
+	u8 padding[3];
+};
+
+/*
+ * struct vdec_vp9_slice_tiles - vp9 tile syntax
+ */
+struct vdec_vp9_slice_tiles {
+	u32 size[4][64];
+	u32 mi_rows[4];
+	u32 mi_cols[64];
+	u8 actual_rows;
+	u8 padding[7];
+};
+
+/*
+ * struct vdec_vp9_slice_reference - vp9 reference frame information
+ */
+struct vdec_vp9_slice_reference {
+	u16 frame_width;
+	u16 frame_height;
+	u8 bit_depth;
+	u8 subsampling_x;
+	u8 subsampling_y;
+	u8 padding;
+};
+
+/*
+ * struct vdec_vp9_slice_frame - vp9 syntax used for decoding
+ */
+struct vdec_vp9_slice_frame {
+	struct vdec_vp9_slice_uncompressed_header uh;
+	struct vdec_vp9_slice_compressed_header ch;
+	struct vdec_vp9_slice_tiles tiles;
+	struct vdec_vp9_slice_reference ref[3];
+};
+
+/*
+ * struct vdec_vp9_slice_init_vsi - VSI used to initialize instance
+ */
+struct vdec_vp9_slice_init_vsi {
+	unsigned int architecture;
+	unsigned int reserved;
+	u64 core_vsi;
+	/* default frame context's position in MicroP */
+	u64 default_frame_ctx;
+};
+
+/*
+ * struct vdec_vp9_slice_mem - memory address and size
+ */
+struct vdec_vp9_slice_mem {
+	union {
+		u64 buf;
+		dma_addr_t dma_addr;
+	};
+	union {
+		size_t size;
+		dma_addr_t dma_addr_end;
+		u64 padding;
+	};
+};
+
+/*
+ * struct vdec_vp9_slice_bs - input buffer for decoding
+ */
+struct vdec_vp9_slice_bs {
+	struct vdec_vp9_slice_mem buf;
+	struct vdec_vp9_slice_mem frame;
+};
+
+/*
+ * struct vdec_vp9_slice_fb - frame buffer for decoding
+ */
+struct vdec_vp9_slice_fb {
+	struct vdec_vp9_slice_mem y;
+	struct vdec_vp9_slice_mem c;
+};
+
+/*
+ * struct vdec_vp9_slice_state - decoding state
+ */
+struct vdec_vp9_slice_state {
+	int err;
+	unsigned int full;
+	unsigned int timeout;
+	unsigned int perf;
+
+	unsigned int crc[12];
+};
+
+/**
+ * struct vdec_vp9_slice_vsi - exchange decoding information
+ *                             between Main CPU and MicroP
+ * @bs          : input buffer
+ * @fb          : output buffer
+ * @ref         : 3 reference buffers
+ * @mv          : mv working buffer
+ * @seg         : segmentation working buffer
+ * @tile        : tile buffer
+ * @prob        : prob table buffer, used to set/update prob table
+ * @counts      : counts table buffer, used to update prob table
+ * @ube         : general buffer
+ * @trans       : trans buffer position in general buffer
+ * @err_map     : error buffer
+ * @row_info    : row info buffer
+ * @frame       : decoding syntax
+ * @state       : decoding state
+ */
+struct vdec_vp9_slice_vsi {
+	/* used in LAT stage */
+	struct vdec_vp9_slice_bs bs;
+	/* used in Core stage */
+	struct vdec_vp9_slice_fb fb;
+	struct vdec_vp9_slice_fb ref[3];
+
+	struct vdec_vp9_slice_mem mv[2];
+	struct vdec_vp9_slice_mem seg[2];
+	struct vdec_vp9_slice_mem tile;
+	struct vdec_vp9_slice_mem prob;
+	struct vdec_vp9_slice_mem counts;
+
+	/* LAT stage's output, Core stage's input */
+	struct vdec_vp9_slice_mem ube;
+	struct vdec_vp9_slice_mem trans;
+	struct vdec_vp9_slice_mem err_map;
+	struct vdec_vp9_slice_mem row_info;
+
+	/* decoding parameters */
+	struct vdec_vp9_slice_frame frame;
+
+	struct vdec_vp9_slice_state state;
+};
+
+/**
+ * struct vdec_vp9_slice_pfc - per-frame context that contains a local vsi.
+ *                             pass it from lat to core
+ * @vsi         : local vsi. copy to/from remote vsi before/after decoding
+ * @ref_idx     : reference buffer index
+ * @seq         : picture sequence
+ * @state       : decoding state
+ */
+struct vdec_vp9_slice_pfc {
+	struct vdec_vp9_slice_vsi vsi;
+
+	u64 ref_idx[3];
+
+	int seq;
+
+	/* LAT/Core CRC */
+	struct vdec_vp9_slice_state state[2];
+};
+
+/*
+ * enum vdec_vp9_slice_resolution_level
+ */
+enum vdec_vp9_slice_resolution_level {
+	VP9_RES_NONE,
+	VP9_RES_FHD,
+	VP9_RES_4K,
+	VP9_RES_8K,
+};
+
+/*
+ * struct vdec_vp9_slice_ref - picture's width & height should be kept
+ *                             for later decoding as reference picture
+ */
+struct vdec_vp9_slice_ref {
+	unsigned int width;
+	unsigned int height;
+};
+
+/**
+ * struct vdec_vp9_slice_instance - represent one vp9 instance
+ * @ctx         : pointer to codec's context
+ * @vpu         : VPU instance
+ * @seq         : global picture sequence
+ * @level       : level of current resolution
+ * @width       : width of last picture
+ * @height      : height of last picture
+ * @frame_type  : frame_type of last picture
+ * @irq         : irq to Main CPU or MicroP
+ * @show_frame  : show_frame of last picture
+ * @dpb         : picture information (width/height) for reference
+ * @mv          : mv working buffer
+ * @seg         : segmentation working buffer
+ * @tile        : tile buffer
+ * @prob        : prob table buffer, used to set/update prob table
+ * @counts      : counts table buffer, used to update prob table
+ * @frame_ctx   : 4 frame contexts according to VP9 Spec
+ * @dirty       : state of each frame context
+ * @init_vsi    : vsi used to initialize the VP9 instance
+ * @vsi         : vsi used for decoding/flush ...
+ * @core_vsi    : vsi used for Core stage
+ */
+struct vdec_vp9_slice_instance {
+	struct mtk_vcodec_ctx *ctx;
+	struct vdec_vpu_inst vpu;
+
+	int seq;
+
+	enum vdec_vp9_slice_resolution_level level;
+
+	/* for resolution change and get_pic_info */
+	unsigned int width;
+	unsigned int height;
+
+	/* for last_frame_type */
+	unsigned int frame_type;
+	unsigned int irq;
+
+	unsigned int show_frame;
+
+	/* maintain vp9 reference frame state */
+	struct vdec_vp9_slice_ref dpb[VB2_MAX_FRAME];
+
+	/*
+	 * normal working buffers
+	 * mv[0]/seg[0]/tile/prob/counts is used for LAT
+	 * mv[1]/seg[1] is used for CORE
+	 */
+	struct mtk_vcodec_mem mv[2];
+	struct mtk_vcodec_mem seg[2];
+	struct mtk_vcodec_mem tile;
+	struct mtk_vcodec_mem prob;
+	struct mtk_vcodec_mem counts;
+
+	/* 4 prob tables */
+	struct vdec_vp9_slice_frame_ctx frame_ctx[4];
+	unsigned char dirty[4];
+
+	/* MicroP vsi */
+	union {
+		struct vdec_vp9_slice_init_vsi *init_vsi;
+		struct vdec_vp9_slice_vsi *vsi;
+	};
+	struct vdec_vp9_slice_vsi *core_vsi;
+};
+
+/*
+ * (2, (0, (1, 3)))
+ * max level = 2
+ */
+static const signed char vdec_vp9_slice_inter_mode_tree[6] = {
+	-2, 2, 0, 4, -1, -3
+};
+
+/* max level = 6 */
+static const signed char vdec_vp9_slice_intra_mode_tree[18] = {
+	0, 2, -9, 4, -1, 6, 8, 12, -2, 10, -4, -5, -3, 14, -8, 16, -6, -7
+};
+
+/* max level = 2 */
+static const signed char vdec_vp9_slice_partition_tree[6] = {
+	0, 2, -1, 4, -2, -3
+};
+
+/* max level = 1 */
+static const signed char vdec_vp9_slice_switchable_interp_tree[4] = {
+	0, 2, -1, -2
+};
+
+/* max level = 2 */
+static const signed char vdec_vp9_slice_mv_joint_tree[6] = {
+	0, 2, -1, 4, -2, -3
+};
+
+/* max level = 6 */
+static const signed char vdec_vp9_slice_mv_class_tree[20] = {
+	0, 2, -1, 4, 6, 8, -2, -3, 10, 12,
+	-4, -5, -6, 14, 16, 18, -7, -8, -9, -10
+};
+
+/* max level = 0 */
+static const signed char vdec_vp9_slice_mv_class0_tree[2] = {
+	0, -1
+};
+
+/* max level = 2 */
+static const signed char vdec_vp9_slice_mv_fp_tree[6] = {
+	0, 2, -1, 4, -2, -3
+};
+
+/*
+ * all VP9 instances could share this default frame context.
+ */
+static struct vdec_vp9_slice_frame_ctx *vdec_vp9_slice_default_frame_ctx;
+static DEFINE_MUTEX(vdec_vp9_slice_frame_ctx_lock);
+
+static int vdec_vp9_slice_core_decode(struct vdec_lat_buf *lat_buf);
+
+static int vdec_vp9_slice_init_default_frame_ctx(struct vdec_vp9_slice_instance *instance)
+{
+	struct vdec_vp9_slice_frame_ctx *remote_frame_ctx;
+	struct vdec_vp9_slice_frame_ctx *frame_ctx;
+	struct mtk_vcodec_ctx *ctx;
+	struct vdec_vp9_slice_init_vsi *vsi;
+	int ret = 0;
+
+	ctx = instance->ctx;
+	vsi = instance->vpu.vsi;
+	if (!ctx || !vsi)
+		return -EINVAL;
+
+	remote_frame_ctx = mtk_vcodec_fw_map_dm_addr(ctx->dev->fw_handler,
+						     (u32)vsi->default_frame_ctx);
+	if (!remote_frame_ctx) {
+		mtk_vcodec_err(instance, "failed to map default frame ctx\n");
+		return -EINVAL;
+	}
+
+	mutex_lock(&vdec_vp9_slice_frame_ctx_lock);
+	if (vdec_vp9_slice_default_frame_ctx)
+		goto out;
+
+	frame_ctx = kmalloc(sizeof(*frame_ctx), GFP_KERNEL);
+	if (!frame_ctx) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	memcpy_fromio(frame_ctx, remote_frame_ctx, sizeof(*frame_ctx));
+	vdec_vp9_slice_default_frame_ctx = frame_ctx;
+
+out:
+	mutex_unlock(&vdec_vp9_slice_frame_ctx_lock);
+
+	return ret;
+}
+
+static int vdec_vp9_slice_alloc_working_buffer(struct vdec_vp9_slice_instance *instance,
+					       struct vdec_vp9_slice_vsi *vsi)
+{
+	struct mtk_vcodec_ctx *ctx = instance->ctx;
+	enum vdec_vp9_slice_resolution_level level;
+	/* super blocks */
+	unsigned int max_sb_w;
+	unsigned int max_sb_h;
+	unsigned int max_w;
+	unsigned int max_h;
+	unsigned int w;
+	unsigned int h;
+	size_t size;
+	int ret;
+	int i;
+
+	w = vsi->frame.uh.frame_width;
+	h = vsi->frame.uh.frame_height;
+
+	if (w > VCODEC_DEC_4K_CODED_WIDTH ||
+	    h > VCODEC_DEC_4K_CODED_HEIGHT) {
+		/* 8K? */
+		return -EINVAL;
+	} else if (w > MTK_VDEC_MAX_W || h > MTK_VDEC_MAX_H) {
+		/* 4K */
+		level = VP9_RES_4K;
+		max_w = VCODEC_DEC_4K_CODED_WIDTH;
+		max_h = VCODEC_DEC_4K_CODED_HEIGHT;
+	} else {
+		/* FHD */
+		level = VP9_RES_FHD;
+		max_w = MTK_VDEC_MAX_W;
+		max_h = MTK_VDEC_MAX_H;
+	}
+
+	if (level == instance->level)
+		return 0;
+
+	mtk_vcodec_debug(instance, "resolution level changed, from %u to %u, %ux%u",
+			 instance->level, level, w, h);
+
+	max_sb_w = DIV_ROUND_UP(max_w, 64);
+	max_sb_h = DIV_ROUND_UP(max_h, 64);
+	ret = -ENOMEM;
+
+	/*
+	 * LAT flush must wait for the core to go idle, otherwise the
+	 * core will use released buffers
+	 */
+
+	size = (max_sb_w * max_sb_h + 2) * 576;
+	for (i = 0; i < 2; i++) {
+		if (instance->mv[i].va)
+			mtk_vcodec_mem_free(ctx, &instance->mv[i]);
+		instance->mv[i].size = size;
+		if (mtk_vcodec_mem_alloc(ctx, &instance->mv[i]))
+			goto err;
+	}
+
+	size = (max_sb_w * max_sb_h * 32) + 256;
+	for (i = 0; i < 2; i++) {
+		if (instance->seg[i].va)
+			mtk_vcodec_mem_free(ctx, &instance->seg[i]);
+		instance->seg[i].size = size;
+		if (mtk_vcodec_mem_alloc(ctx, &instance->seg[i]))
+			goto err;
+	}
+
+	if (!instance->tile.va) {
+		instance->tile.size = VP9_TILE_BUF_SIZE;
+		if (mtk_vcodec_mem_alloc(ctx, &instance->tile))
+			goto err;
+	}
+
+	if (!instance->prob.va) {
+		instance->prob.size = VP9_PROB_BUF_SIZE;
+		if (mtk_vcodec_mem_alloc(ctx, &instance->prob))
+			goto err;
+	}
+
+	if (!instance->counts.va) {
+		instance->counts.size = VP9_COUNTS_BUF_SIZE;
+		if (mtk_vcodec_mem_alloc(ctx, &instance->counts))
+			goto err;
+	}
+
+	instance->level = level;
+	return 0;
+
+err:
+	instance->level = VP9_RES_NONE;
+	return ret;
+}
+
+static void vdec_vp9_slice_free_working_buffer(struct vdec_vp9_slice_instance *instance)
+{
+	struct mtk_vcodec_ctx *ctx = instance->ctx;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(instance->mv); i++) {
+		if (instance->mv[i].va)
+			mtk_vcodec_mem_free(ctx, &instance->mv[i]);
+	}
+	for (i = 0; i < ARRAY_SIZE(instance->seg); i++) {
+		if (instance->seg[i].va)
+			mtk_vcodec_mem_free(ctx, &instance->seg[i]);
+	}
+	if (instance->tile.va)
+		mtk_vcodec_mem_free(ctx, &instance->tile);
+	if (instance->prob.va)
+		mtk_vcodec_mem_free(ctx, &instance->prob);
+	if (instance->counts.va)
+		mtk_vcodec_mem_free(ctx, &instance->counts);
+
+	instance->level = VP9_RES_NONE;
+}
+
+static void vdec_vp9_slice_vsi_from_remote(struct vdec_vp9_slice_vsi *vsi,
+					   struct vdec_vp9_slice_vsi *remote_vsi,
+					   int skip)
+{
+	struct vdec_vp9_slice_frame *rf;
+	struct vdec_vp9_slice_frame *f;
+
+	/*
+	 * compressed header
+	 * dequant
+	 * buffer position
+	 * decode state
+	 */
+	if (!skip) {
+		rf = &remote_vsi->frame;
+		f = &vsi->frame;
+		memcpy_fromio(&f->ch, &rf->ch, sizeof(f->ch));
+		memcpy_fromio(&f->uh.dequant, &rf->uh.dequant, sizeof(f->uh.dequant));
+		memcpy_fromio(&vsi->trans, &remote_vsi->trans, sizeof(vsi->trans));
+	}
+
+	memcpy_fromio(&vsi->state, &remote_vsi->state, sizeof(vsi->state));
+}
+
+static void vdec_vp9_slice_vsi_to_remote(struct vdec_vp9_slice_vsi *vsi,
+					 struct vdec_vp9_slice_vsi *remote_vsi)
+{
+	memcpy_toio(remote_vsi, vsi, sizeof(*vsi));
+}
+
+static int vdec_vp9_slice_tile_offset(int idx, int mi_num, int tile_log2)
+{
+	int sbs = (mi_num + 7) >> 3;
+	int offset = ((idx * sbs) >> tile_log2) << 3;
+
+	return offset < mi_num ? offset : mi_num;
+}
+
+static int vdec_vp9_slice_setup_lat_from_src_buf(struct vdec_vp9_slice_instance *instance,
+						 struct vdec_lat_buf *lat_buf)
+{
+	struct vb2_v4l2_buffer *src;
+	struct vb2_v4l2_buffer *dst;
+
+	src = v4l2_m2m_next_src_buf(instance->ctx->m2m_ctx);
+	if (!src)
+		return -EINVAL;
+
+	dst = &lat_buf->ts_info;
+	v4l2_m2m_buf_copy_metadata(src, dst, true);
+	return 0;
+}
+
+static void vdec_vp9_slice_setup_hdr(struct vdec_vp9_slice_instance *instance,
+				     struct vdec_vp9_slice_uncompressed_header *uh,
+				     struct v4l2_ctrl_vp9_frame *hdr)
+{
+	int i;
+
+	uh->profile = hdr->profile;
+	uh->last_frame_type = instance->frame_type;
+	uh->frame_type = !HDR_FLAG(KEY_FRAME);
+	uh->last_show_frame = instance->show_frame;
+	uh->show_frame = HDR_FLAG(SHOW_FRAME);
+	uh->error_resilient_mode = HDR_FLAG(ERROR_RESILIENT);
+	uh->bit_depth = hdr->bit_depth;
+	uh->last_frame_width = instance->width;
+	uh->last_frame_height = instance->height;
+	uh->frame_width = hdr->frame_width_minus_1 + 1;
+	uh->frame_height = hdr->frame_height_minus_1 + 1;
+	uh->intra_only = HDR_FLAG(INTRA_ONLY);
+	/* map v4l2 enum to values defined in VP9 spec for firmware */
+	switch (hdr->reset_frame_context) {
+	case V4L2_VP9_RESET_FRAME_CTX_NONE:
+		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_NONE0;
+		break;
+	case V4L2_VP9_RESET_FRAME_CTX_SPEC:
+		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_SPEC;
+		break;
+	case V4L2_VP9_RESET_FRAME_CTX_ALL:
+		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_ALL;
+		break;
+	default:
+		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_NONE0;
+		break;
+	}
+	/*
+	 * ref_frame_sign_bias specifies the intended direction
+	 * of the motion vector in time for each reference frame.
+	 * - INTRA_FRAME = 0,
+	 * - LAST_FRAME = 1,
+	 * - GOLDEN_FRAME = 2,
+	 * - ALTREF_FRAME = 3,
+	 * ref_frame_sign_bias[INTRA_FRAME] is always 0
+	 * and VDA only passes the other 3 directions
+	 */
+	uh->ref_frame_sign_bias[0] = 0;
+	for (i = 0; i < 3; i++)
+		uh->ref_frame_sign_bias[i + 1] =
+			!!(hdr->ref_frame_sign_bias & (1 << i));
+	uh->allow_high_precision_mv = HDR_FLAG(ALLOW_HIGH_PREC_MV);
+	uh->interpolation_filter = hdr->interpolation_filter;
+	uh->refresh_frame_context = HDR_FLAG(REFRESH_FRAME_CTX);
+	uh->frame_parallel_decoding_mode = HDR_FLAG(PARALLEL_DEC_MODE);
+	uh->frame_context_idx = hdr->frame_context_idx;
+
+	/* tile info */
+	uh->tile_cols_log2 = hdr->tile_cols_log2;
+	uh->tile_rows_log2 = hdr->tile_rows_log2;
+
+	uh->uncompressed_header_size = hdr->uncompressed_header_size;
+	uh->header_size_in_bytes = hdr->compressed_header_size;
+}
+
+static void vdec_vp9_slice_setup_frame_ctx(struct vdec_vp9_slice_instance *instance,
+					   struct vdec_vp9_slice_uncompressed_header *uh,
+					   struct v4l2_ctrl_vp9_frame *hdr)
+{
+	int error_resilient_mode;
+	int reset_frame_context;
+	int key_frame;
+	int intra_only;
+	int i;
+
+	key_frame = HDR_FLAG(KEY_FRAME);
+	intra_only = HDR_FLAG(INTRA_ONLY);
+	error_resilient_mode = HDR_FLAG(ERROR_RESILIENT);
+	reset_frame_context = uh->reset_frame_context;
+
+	/*
+	 * according to "6.2 Uncompressed header syntax" in
+	 * "VP9 Bitstream & Decoding Process Specification",
+	 * reset @frame_context_idx when (FrameIsIntra || error_resilient_mode)
+	 */
+	if (key_frame || intra_only || error_resilient_mode) {
+		/*
+		 * @reset_frame_context specifies
+		 * whether the frame context should be
+		 * reset to default values:
+		 * 0 or 1 means do not reset any frame context
+		 * 2 resets just the context specified in the frame header
+		 * 3 resets all contexts
+		 */
+		if (key_frame || error_resilient_mode ||
+		    reset_frame_context == 3) {
+			/* use default table */
+			for (i = 0; i < 4; i++)
+				instance->dirty[i] = 0;
+		} else if (reset_frame_context == 2) {
+			instance->dirty[uh->frame_context_idx] = 0;
+		}
+		uh->frame_context_idx = 0;
+	}
+}
+
+static void vdec_vp9_slice_setup_loop_filter(struct vdec_vp9_slice_uncompressed_header *uh,
+					     struct v4l2_vp9_loop_filter *lf)
+{
+	int i;
+
+	uh->loop_filter_level = lf->level;
+	uh->loop_filter_sharpness = lf->sharpness;
+	uh->loop_filter_delta_enabled = LF_FLAG(DELTA_ENABLED);
+	for (i = 0; i < 4; i++)
+		uh->loop_filter_ref_deltas[i] = lf->ref_deltas[i];
+	for (i = 0; i < 2; i++)
+		uh->loop_filter_mode_deltas[i] = lf->mode_deltas[i];
+}
+
+static void vdec_vp9_slice_setup_quantization(struct vdec_vp9_slice_uncompressed_header *uh,
+					      struct v4l2_vp9_quantization *quant)
+{
+	uh->base_q_idx = quant->base_q_idx;
+	uh->delta_q_y_dc = quant->delta_q_y_dc;
+	uh->delta_q_uv_dc = quant->delta_q_uv_dc;
+	uh->delta_q_uv_ac = quant->delta_q_uv_ac;
+}
+
+static void vdec_vp9_slice_setup_segmentation(struct vdec_vp9_slice_uncompressed_header *uh,
+					      struct v4l2_vp9_segmentation *seg)
+{
+	int i;
+	int j;
+
+	uh->segmentation_enabled = SEG_FLAG(ENABLED);
+	uh->segmentation_update_map = SEG_FLAG(UPDATE_MAP);
+	for (i = 0; i < 7; i++)
+		uh->segmentation_tree_probs[i] = seg->tree_probs[i];
+	uh->segmentation_temporal_udpate = SEG_FLAG(TEMPORAL_UPDATE);
+	for (i = 0; i < 3; i++)
+		uh->segmentation_pred_prob[i] = seg->pred_probs[i];
+	uh->segmentation_update_data = SEG_FLAG(UPDATE_DATA);
+	uh->segmentation_abs_or_delta_update = SEG_FLAG(ABS_OR_DELTA_UPDATE);
+	for (i = 0; i < 8; i++) {
+		uh->feature_enabled[i] = seg->feature_enabled[i];
+		for (j = 0; j < 4; j++)
+			uh->feature_value[i][j] = seg->feature_data[i][j];
+	}
+}
+
+static int vdec_vp9_slice_setup_tile(struct vdec_vp9_slice_vsi *vsi,
+				     struct v4l2_ctrl_vp9_frame *hdr)
+{
+	unsigned int rows_log2;
+	unsigned int cols_log2;
+	unsigned int rows;
+	unsigned int cols;
+	unsigned int mi_rows;
+	unsigned int mi_cols;
+	struct vdec_vp9_slice_tiles *tiles;
+	int offset;
+	int start;
+	int end;
+	int i;
+
+	rows_log2 = hdr->tile_rows_log2;
+	cols_log2 = hdr->tile_cols_log2;
+	rows = 1 << rows_log2;
+	cols = 1 << cols_log2;
+	tiles = &vsi->frame.tiles;
+	tiles->actual_rows = 0;
+
+	if (rows > 4 || cols > 64)
+		return -EINVAL;
+
+	/* setup mi rows/cols information */
+	mi_rows = (hdr->frame_height_minus_1 + 1 + 7) >> 3;
+	mi_cols = (hdr->frame_width_minus_1 + 1 + 7) >> 3;
+
+	for (i = 0; i < rows; i++) {
+		start = vdec_vp9_slice_tile_offset(i, mi_rows, rows_log2);
+		end = vdec_vp9_slice_tile_offset(i + 1, mi_rows, rows_log2);
+		offset = end - start;
+		tiles->mi_rows[i] = (offset + 7) >> 3;
+		if (tiles->mi_rows[i])
+			tiles->actual_rows++;
+	}
+
+	for (i = 0; i < cols; i++) {
+		start = vdec_vp9_slice_tile_offset(i, mi_cols, cols_log2);
+		end = vdec_vp9_slice_tile_offset(i + 1, mi_cols, cols_log2);
+		offset = end - start;
+		tiles->mi_cols[i] = (offset + 7) >> 3;
+	}
+
+	return 0;
+}
+
+static void vdec_vp9_slice_setup_state(struct vdec_vp9_slice_vsi *vsi)
+{
+	memset(&vsi->state, 0, sizeof(vsi->state));
+}
+
+static void vdec_vp9_slice_setup_ref_idx(struct vdec_vp9_slice_pfc *pfc,
+					 struct v4l2_ctrl_vp9_frame *hdr)
+{
+	pfc->ref_idx[0] = hdr->last_frame_ts;
+	pfc->ref_idx[1] = hdr->golden_frame_ts;
+	pfc->ref_idx[2] = hdr->alt_frame_ts;
+}
+
+static int vdec_vp9_slice_setup_pfc(struct vdec_vp9_slice_instance *instance,
+				    struct vdec_vp9_slice_pfc *pfc)
+{
+	struct v4l2_ctrl_vp9_frame *hdr;
+	struct vdec_vp9_slice_uncompressed_header *uh;
+	struct v4l2_ctrl *hdr_ctrl;
+	struct vdec_vp9_slice_vsi *vsi;
+	int ret;
+
+	/* frame header */
+	hdr_ctrl = v4l2_ctrl_find(&instance->ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_FRAME);
+	if (!hdr_ctrl || !hdr_ctrl->p_cur.p)
+		return -EINVAL;
+
+	hdr = hdr_ctrl->p_cur.p;
+	vsi = &pfc->vsi;
+	uh = &vsi->frame.uh;
+
+	/* setup vsi information */
+	vdec_vp9_slice_setup_hdr(instance, uh, hdr);
+	vdec_vp9_slice_setup_frame_ctx(instance, uh, hdr);
+	vdec_vp9_slice_setup_loop_filter(uh, &hdr->lf);
+	vdec_vp9_slice_setup_quantization(uh, &hdr->quant);
+	vdec_vp9_slice_setup_segmentation(uh, &hdr->seg);
+	ret = vdec_vp9_slice_setup_tile(vsi, hdr);
+	if (ret)
+		return ret;
+	vdec_vp9_slice_setup_state(vsi);
+
+	/* core stage needs buffer index to get ref y/c ... */
+	vdec_vp9_slice_setup_ref_idx(pfc, hdr);
+
+	pfc->seq = instance->seq;
+	instance->seq++;
+
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_lat_buffer(struct vdec_vp9_slice_instance *instance,
+					   struct vdec_vp9_slice_vsi *vsi,
+					   struct mtk_vcodec_mem *bs,
+					   struct vdec_lat_buf *lat_buf)
+{
+	int i;
+
+	vsi->bs.buf.dma_addr = bs->dma_addr;
+	vsi->bs.buf.size = bs->size;
+	vsi->bs.frame.dma_addr = bs->dma_addr;
+	vsi->bs.frame.size = bs->size;
+
+	for (i = 0; i < 2; i++) {
+		vsi->mv[i].dma_addr = instance->mv[i].dma_addr;
+		vsi->mv[i].size = instance->mv[i].size;
+	}
+	for (i = 0; i < 2; i++) {
+		vsi->seg[i].dma_addr = instance->seg[i].dma_addr;
+		vsi->seg[i].size = instance->seg[i].size;
+	}
+	vsi->tile.dma_addr = instance->tile.dma_addr;
+	vsi->tile.size = instance->tile.size;
+	vsi->prob.dma_addr = instance->prob.dma_addr;
+	vsi->prob.size = instance->prob.size;
+	vsi->counts.dma_addr = instance->counts.dma_addr;
+	vsi->counts.size = instance->counts.size;
+
+	vsi->ube.dma_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
+	vsi->ube.size = lat_buf->ctx->msg_queue.wdma_addr.size;
+	vsi->trans.dma_addr = lat_buf->ctx->msg_queue.wdma_wptr_addr;
+	/* used to store trans end */
+	vsi->trans.dma_addr_end = lat_buf->ctx->msg_queue.wdma_rptr_addr;
+	vsi->err_map.dma_addr = lat_buf->wdma_err_addr.dma_addr;
+	vsi->err_map.size = lat_buf->wdma_err_addr.size;
+
+	vsi->row_info.buf = 0;
+	vsi->row_info.size = 0;
+
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_prob_buffer(struct vdec_vp9_slice_instance *instance,
+					    struct vdec_vp9_slice_vsi *vsi)
+{
+	struct vdec_vp9_slice_frame_ctx *frame_ctx;
+	struct vdec_vp9_slice_uncompressed_header *uh;
+
+	uh = &vsi->frame.uh;
+
+	mtk_vcodec_debug(instance, "ctx dirty %u idx %d\n",
+			 instance->dirty[uh->frame_context_idx],
+			 uh->frame_context_idx);
+
+	if (instance->dirty[uh->frame_context_idx])
+		frame_ctx = &instance->frame_ctx[uh->frame_context_idx];
+	else
+		frame_ctx = vdec_vp9_slice_default_frame_ctx;
+	memcpy(instance->prob.va, frame_ctx, sizeof(*frame_ctx));
+
+	return 0;
+}
+
+static void vdec_vp9_slice_setup_seg_buffer(struct vdec_vp9_slice_instance *instance,
+					    struct vdec_vp9_slice_vsi *vsi,
+					    struct mtk_vcodec_mem *buf)
+{
+	struct vdec_vp9_slice_uncompressed_header *uh;
+
+	/* reset segment buffer */
+	uh = &vsi->frame.uh;
+	if (uh->frame_type == 0 ||
+	    uh->intra_only ||
+	    uh->error_resilient_mode ||
+	    uh->frame_width != instance->width ||
+	    uh->frame_height != instance->height) {
+		mtk_vcodec_debug(instance, "reset seg\n");
+		memset(buf->va, 0, buf->size);
+	}
+}
+
+/*
+ * parse tiles according to `6.4 Decode tiles syntax`
+ * in "vp9-bitstream-specification"
+ *
+ * A frame contains an uncompressed header, a compressed header and several
+ * tiles. This function parses each tile's position and size and stores them
+ * in the tile buffer for decoding.
+ */
+static int vdec_vp9_slice_setup_tile_buffer(struct vdec_vp9_slice_instance *instance,
+					    struct vdec_vp9_slice_vsi *vsi,
+					    struct mtk_vcodec_mem *bs)
+{
+	struct vdec_vp9_slice_uncompressed_header *uh;
+	unsigned int rows_log2;
+	unsigned int cols_log2;
+	unsigned int rows;
+	unsigned int cols;
+	unsigned int mi_row;
+	unsigned int mi_col;
+	unsigned int offset;
+	unsigned int pa;
+	unsigned int size;
+	struct vdec_vp9_slice_tiles *tiles;
+	unsigned char *pos;
+	unsigned char *end;
+	unsigned char *va;
+	unsigned int *tb;
+	int i;
+	int j;
+
+	uh = &vsi->frame.uh;
+	rows_log2 = uh->tile_rows_log2;
+	cols_log2 = uh->tile_cols_log2;
+	rows = 1 << rows_log2;
+	cols = 1 << cols_log2;
+
+	if (rows > 4 || cols > 64) {
+		mtk_vcodec_err(instance, "tile_rows %u tile_cols %u\n",
+			       rows, cols);
+		return -EINVAL;
+	}
+
+	offset = uh->uncompressed_header_size +
+		uh->header_size_in_bytes;
+	if (bs->size <= offset) {
+		mtk_vcodec_err(instance, "bs size %zu tile offset %u\n",
+			       bs->size, offset);
+		return -EINVAL;
+	}
+
+	tiles = &vsi->frame.tiles;
+	/* setup tile buffer */
+
+	va = (unsigned char *)bs->va;
+	pos = va + offset;
+	end = va + bs->size;
+	/* the DMA address is truncated to 32 bits */
+	pa = (unsigned int)bs->dma_addr + offset;
+	tb = instance->tile.va;
+	for (i = 0; i < rows; i++) {
+		for (j = 0; j < cols; j++) {
+			if (i == rows - 1 &&
+			    j == cols - 1) {
+				size = (unsigned int)(end - pos);
+			} else {
+				if (end - pos < 4)
+					return -EINVAL;
+
+				size = (pos[0] << 24) | (pos[1] << 16) |
+					(pos[2] << 8) | pos[3];
+				pos += 4;
+				pa += 4;
+				offset += 4;
+				if (end - pos < size)
+					return -EINVAL;
+			}
+			tiles->size[i][j] = size;
+			if (tiles->mi_rows[i]) {
+				*tb++ = (size << 3) + ((offset << 3) & 0x7f);
+				*tb++ = pa & ~0xf;
+				*tb++ = (pa << 3) & 0x7f;
+				mi_row = (tiles->mi_rows[i] - 1) & 0x1ff;
+				mi_col = (tiles->mi_cols[j] - 1) & 0x3f;
+				*tb++ = (mi_row << 6) + mi_col;
+			}
+			pos += size;
+			pa += size;
+			offset += size;
+		}
+	}
+
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
+				    struct mtk_vcodec_mem *bs,
+				    struct vdec_lat_buf *lat_buf,
+				    struct vdec_vp9_slice_pfc *pfc)
+{
+	struct vdec_vp9_slice_vsi *vsi = &pfc->vsi;
+	int ret;
+
+	ret = vdec_vp9_slice_setup_lat_from_src_buf(instance, lat_buf);
+	if (ret)
+		goto err;
+
+	ret = vdec_vp9_slice_setup_pfc(instance, pfc);
+	if (ret)
+		goto err;
+
+	ret = vdec_vp9_slice_alloc_working_buffer(instance, vsi);
+	if (ret)
+		goto err;
+
+	ret = vdec_vp9_slice_setup_lat_buffer(instance, vsi, bs, lat_buf);
+	if (ret)
+		goto err;
+
+	vdec_vp9_slice_setup_seg_buffer(instance, vsi, &instance->seg[0]);
+
+	/* setup prob/tile buffers for LAT */
+
+	ret = vdec_vp9_slice_setup_prob_buffer(instance, vsi);
+	if (ret)
+		goto err;
+
+	ret = vdec_vp9_slice_setup_tile_buffer(instance, vsi, bs);
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	return ret;
+}
+
+/* implement merge prob process defined in 8.4.1 */
+static unsigned char vdec_vp9_slice_merge_prob(unsigned char pre, unsigned int ct0,
+					       unsigned int ct1, unsigned int cs,
+					       unsigned int uf)
+{
+	unsigned int den;
+	unsigned int prob;
+	unsigned int count;
+	unsigned int factor;
+
+	/*
+	 * The variable den representing the total times
+	 * this boolean has been decoded is set equal to ct0 + ct1.
+	 */
+	den = ct0 + ct1;
+	if (!den)
+		return pre;  /* => count = 0 => factor = 0 */
+	/*
+	 * The variable prob estimating the probability that
+	 * the boolean is decoded as a 0 is set equal to
+	 * (den == 0) ? 128 : Clip3(1, 255, (ct0 * 256 + (den >> 1)) / den).
+	 */
+	prob = ((ct0 << 8) + (den >> 1)) / den;
+	prob = prob < 1 ? 1 : (prob > 255 ? 255 : prob);
+	/* The variable count is set equal to Min(ct0 + ct1, countSat) */
+	count = den < cs ? den : cs;
+	/*
+	 * The variable factor is set equal to
+	 * maxUpdateFactor * count / countSat.
+	 */
+	factor = uf * count / cs;
+	/*
+	 * The return variable outProb is set equal to
+	 * Round2(preProb * (256 - factor) + prob * factor, 8).
+	 */
+	return pre + (((prob - pre) * factor + 128) >> 8);
+}
+
+static inline unsigned char vdec_vp9_slice_adapt_prob(unsigned char pre, unsigned int ct0,
+						      unsigned int ct1)
+{
+	return vdec_vp9_slice_merge_prob(pre, ct0, ct1, 20, 128);
+}
+
+/* implement merge probs process defined in 8.4.2 */
+static unsigned int vdec_vp9_slice_merge_probs(const signed char *tree, int location,
+					       unsigned char *pre_probs, unsigned int *counts,
+					       unsigned char *probs, unsigned int cs,
+					       unsigned int uf)
+{
+	int left = tree[location];
+	int right = tree[location + 1];
+	unsigned int left_count;
+	unsigned int right_count;
+
+	if (left <= 0)
+		left_count = counts[-left];
+	else
+		left_count = vdec_vp9_slice_merge_probs(tree, left, pre_probs, counts,
+							probs, cs, uf);
+
+	if (right <= 0)
+		right_count = counts[-right];
+	else
+		right_count = vdec_vp9_slice_merge_probs(tree, right, pre_probs, counts,
+							 probs, cs, uf);
+
+	/* merge left and right */
+	probs[location >> 1] =
+		vdec_vp9_slice_merge_prob(pre_probs[location >> 1],
+					  left_count, right_count, cs, uf);
+	return left_count + right_count;
+}
+
+static inline void vdec_vp9_slice_adapt_probs(const signed char *tree,
+					      unsigned char *pre_probs,
+					      unsigned int *counts,
+					      unsigned char *probs)
+{
+	vdec_vp9_slice_merge_probs(tree, 0, pre_probs, counts, probs, 20, 128);
+}
+
+/* 8.4 Probability adaptation process */
+static void vdec_vp9_slice_adapt_table(struct vdec_vp9_slice_vsi *vsi,
+				       struct vdec_vp9_slice_frame_ctx *ctx,
+				       struct vdec_vp9_slice_frame_ctx *pre_ctx,
+				       struct vdec_vp9_slice_frame_counts *counts)
+{
+	unsigned char *pp;
+	unsigned char *p;
+	unsigned int *c;
+	unsigned int *e;
+	unsigned int uf;
+	int t, i, j, k, l;
+
+	uf = 128;
+	if (!vsi->frame.uh.frame_type || vsi->frame.uh.intra_only ||
+	    vsi->frame.uh.last_frame_type)
+		uf = 112;
+
+	p = (unsigned char *)&ctx->coef_probs;
+	pp = (unsigned char *)&pre_ctx->coef_probs;
+	c = (unsigned int *)&counts->coef_probs;
+	e = (unsigned int *)&counts->eob_branch;
+
+	/* 8.4.3 Coefficient probability adaptation process */
+	for (t = 0; t < 16; t++) {
+		for (k = 0; k < 6; k++) {
+			for (l = 0; l < (k == 0 ? 3 : 6); l++) {
+				p[0] = vdec_vp9_slice_merge_prob(pp[0], c[3],
+								 e[0] - c[3], 24, uf);
+				p[1] = vdec_vp9_slice_merge_prob(pp[1], c[0],
+								 c[1] + c[2], 24, uf);
+				p[2] = vdec_vp9_slice_merge_prob(pp[2], c[1],
+								 c[2], 24, uf);
+				p += 3;
+				pp += 3;
+				c += 4;
+				e++;
+			}
+			if (k == 0) {
+				/* 3 * 3 unused values and 2 bytes padding */
+				p += 11;
+				pp += 11;
+				e++;
+			} else {
+				/* 2 bytes of padding keep 4-byte alignment (3 * 6 + 2) */
+				p += 2;
+				pp += 2;
+				/* 5 * 6 = 30, plus 2 extra ints */
+				if (k == 5)
+					e += 2;
+			}
+		}
+	}
+
+	if (!vsi->frame.uh.frame_type || vsi->frame.uh.intra_only)
+		return;
+
+	/* 8.4.4 Non coefficient probability adaptation process */
+
+	for (i = 0; i < 4; i++) {
+		ctx->intra_inter_prob[i] =
+			vdec_vp9_slice_adapt_prob(pre_ctx->intra_inter_prob[i],
+						  counts->intra_inter[i][0],
+						  counts->intra_inter[i][1]);
+	}
+
+	for (i = 0; i < 5; i++) {
+		ctx->comp_inter_prob[i] =
+			vdec_vp9_slice_adapt_prob(pre_ctx->comp_inter_prob[i],
+						  counts->comp_inter[i][0],
+						  counts->comp_inter[i][1]);
+	}
+
+	for (i = 0; i < 5; i++) {
+		ctx->comp_ref_prob[i] =
+			vdec_vp9_slice_adapt_prob(pre_ctx->comp_ref_prob[i],
+						  counts->comp_ref[i][0],
+						  counts->comp_ref[i][1]);
+	}
+
+	for (i = 0; i < 5; i++) {
+		for (j = 0; j < 2; j++) {
+			ctx->single_ref_prob[i][j] =
+				vdec_vp9_slice_adapt_prob(pre_ctx->single_ref_prob[i][j],
+							  counts->single_ref[i][j][0],
+							  counts->single_ref[i][j][1]);
+		}
+	}
+
+	for (i = 0; i < 7; i++) {
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_inter_mode_tree,
+					   &pre_ctx->inter_mode_probs[i][0],
+					   &counts->inter_mode[i][0],
+					   &ctx->inter_mode_probs[i][0]);
+	}
+
+	for (i = 0; i < 4; i++) {
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_intra_mode_tree,
+					   &pre_ctx->y_mode_prob[i][0],
+					   &counts->y_mode[i][0],
+					   &ctx->y_mode_prob[i][0]);
+	}
+
+	for (i = 0; i < 10; i++) {
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_intra_mode_tree,
+					   &pre_ctx->uv_mode_prob[i][0],
+					   &counts->uv_mode[i][0],
+					   &ctx->uv_mode_prob[i][0]);
+	}
+
+	for (i = 0; i < 16; i++) {
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_partition_tree,
+					   &pre_ctx->partition_prob[i][0],
+					   &counts->partition[i][0],
+					   &ctx->partition_prob[i][0]);
+	}
+
+	if (vsi->frame.uh.interpolation_filter == 4) {
+		for (i = 0; i < 4; i++) {
+			vdec_vp9_slice_adapt_probs(vdec_vp9_slice_switchable_interp_tree,
+						   &pre_ctx->switch_interp_prob[i][0],
+						   &counts->switchable_interp[i][0],
+						   &ctx->switch_interp_prob[i][0]);
+		}
+	}
+
+	if (vsi->frame.ch.tx_mode == 4) {
+		for (i = 0; i < 2; i++) {
+			ctx->tx_p8x8[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p8x8[i][0],
+								       counts->tx_p8x8[i][0],
+								       counts->tx_p8x8[i][1]);
+			ctx->tx_p16x16[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p16x16[i][0],
+									 counts->tx_p16x16[i][0],
+									 counts->tx_p16x16[i][1] +
+									 counts->tx_p16x16[i][2]);
+			ctx->tx_p16x16[i][1] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p16x16[i][1],
+									 counts->tx_p16x16[i][1],
+									 counts->tx_p16x16[i][2]);
+			ctx->tx_p32x32[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][0],
+									 counts->tx_p32x32[i][0],
+									 counts->tx_p32x32[i][1] +
+									 counts->tx_p32x32[i][2] +
+									 counts->tx_p32x32[i][3]);
+			ctx->tx_p32x32[i][1] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][1],
+									 counts->tx_p32x32[i][1],
+									 counts->tx_p32x32[i][2] +
+									 counts->tx_p32x32[i][3]);
+			ctx->tx_p32x32[i][2] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][2],
+									 counts->tx_p32x32[i][2],
+									 counts->tx_p32x32[i][3]);
+		}
+	}
+
+	for (i = 0; i < 3; i++) {
+		ctx->skip_probs[i] = vdec_vp9_slice_adapt_prob(pre_ctx->skip_probs[i],
+							       counts->skip[i][0],
+							       counts->skip[i][1]);
+	}
+
+	vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_joint_tree,
+				   &pre_ctx->joint[0],
+				   &counts->joint[0],
+				   &ctx->joint[0]);
+
+	for (i = 0; i < 2; i++) {
+		ctx->sign_classes[i].sign = vdec_vp9_slice_adapt_prob(pre_ctx->sign_classes[i].sign,
+								      counts->mvcomp[i].sign[0],
+								      counts->mvcomp[i].sign[1]);
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_class_tree,
+					   &pre_ctx->sign_classes[i].classes[0],
+					   &counts->mvcomp[i].classes[0],
+					   &ctx->sign_classes[i].classes[0]);
+
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_class0_tree,
+					   pre_ctx->class0_bits[i].class0,
+					   counts->mvcomp[i].class0,
+					   ctx->class0_bits[i].class0);
+		for (j = 0; j < 10; j++) {
+			ctx->class0_bits[i].bits[j] =
+				vdec_vp9_slice_adapt_prob(pre_ctx->class0_bits[i].bits[j],
+							  counts->mvcomp[i].bits[j][0],
+							  counts->mvcomp[i].bits[j][1]);
+		}
+
+		for (j = 0; j < 2; ++j) {
+			vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_fp_tree,
+						   pre_ctx->class0_fp_hp[i].class0_fp[j],
+						   counts->mvcomp[i].class0_fp[j],
+						   ctx->class0_fp_hp[i].class0_fp[j]);
+		}
+		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_fp_tree,
+					   pre_ctx->class0_fp_hp[i].fp,
+					   counts->mvcomp[i].fp,
+					   ctx->class0_fp_hp[i].fp);
+		if (vsi->frame.uh.allow_high_precision_mv) {
+			ctx->class0_fp_hp[i].class0_hp =
+				vdec_vp9_slice_adapt_prob(pre_ctx->class0_fp_hp[i].class0_hp,
+							  counts->mvcomp[i].class0_hp[0],
+							  counts->mvcomp[i].class0_hp[1]);
+			ctx->class0_fp_hp[i].hp =
+				vdec_vp9_slice_adapt_prob(pre_ctx->class0_fp_hp[i].hp,
+							  counts->mvcomp[i].hp[0],
+							  counts->mvcomp[i].hp[1]);
+		}
+	}
+}
+
+static int vdec_vp9_slice_update_prob(struct vdec_vp9_slice_instance *instance,
+				      struct vdec_vp9_slice_vsi *vsi)
+{
+	struct vdec_vp9_slice_frame_ctx *pre_frame_ctx;
+	struct vdec_vp9_slice_frame_ctx *frame_ctx;
+	struct vdec_vp9_slice_frame_counts *counts;
+	struct vdec_vp9_slice_uncompressed_header *uh;
+
+	uh = &vsi->frame.uh;
+	pre_frame_ctx = &instance->frame_ctx[uh->frame_context_idx];
+	frame_ctx = (struct vdec_vp9_slice_frame_ctx *)instance->prob.va;
+	counts = (struct vdec_vp9_slice_frame_counts *)instance->counts.va;
+
+	if (!uh->refresh_frame_context)
+		return 0;
+
+	if (!uh->frame_parallel_decoding_mode) {
+		/* uh->error_resilient_mode must be 0 */
+		vdec_vp9_slice_adapt_table(vsi, frame_ctx,
+					   /* use default frame ctx? */
+					   instance->dirty[uh->frame_context_idx] ?
+					   pre_frame_ctx :
+					   vdec_vp9_slice_default_frame_ctx,
+					   counts);
+	}
+
+	memcpy(pre_frame_ctx, frame_ctx, sizeof(*frame_ctx));
+	instance->dirty[uh->frame_context_idx] = 1;
+
+	return 0;
+}
+
+static int vdec_vp9_slice_update_lat(struct vdec_vp9_slice_instance *instance,
+				     struct vdec_lat_buf *lat_buf,
+				     struct vdec_vp9_slice_pfc *pfc)
+{
+	struct vdec_vp9_slice_vsi *vsi;
+
+	vsi = &pfc->vsi;
+	memcpy(&pfc->state[0], &vsi->state, sizeof(vsi->state));
+
+	mtk_vcodec_debug(instance, "Frame %u LAT CRC 0x%08x\n",
+			 pfc->seq, vsi->state.crc[0]);
+
+	/* buffer full, need to re-decode */
+	if (vsi->state.full) {
+		/* buffer is not large enough */
+		if (vsi->trans.dma_addr_end - vsi->trans.dma_addr ==
+			vsi->ube.size)
+			return -ENOMEM;
+		return -EAGAIN;
+	}
+
+	vdec_vp9_slice_update_prob(instance, vsi);
+
+	instance->width = vsi->frame.uh.frame_width;
+	instance->height = vsi->frame.uh.frame_height;
+	instance->frame_type = vsi->frame.uh.frame_type;
+	instance->show_frame = vsi->frame.uh.show_frame;
+
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_core_to_dst_buf(struct vdec_vp9_slice_instance *instance,
+						struct vdec_lat_buf *lat_buf)
+{
+	struct vb2_v4l2_buffer *src;
+	struct vb2_v4l2_buffer *dst;
+
+	dst = v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx);
+	if (!dst)
+		return -EINVAL;
+
+	src = &lat_buf->ts_info;
+	dst->vb2_buf.timestamp = src->vb2_buf.timestamp;
+	dst->timecode = src->timecode;
+	dst->field = src->field;
+	dst->flags = src->flags;
+	dst->vb2_buf.copied_timestamp = src->vb2_buf.copied_timestamp;
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_core_buffer(struct vdec_vp9_slice_instance *instance,
+					    struct vdec_vp9_slice_pfc *pfc,
+					    struct vdec_vp9_slice_vsi *vsi,
+					    struct vdec_fb *fb,
+					    struct vdec_lat_buf *lat_buf)
+{
+	struct vb2_buffer *vb;
+	struct vb2_queue *vq;
+	struct vdec_vp9_slice_reference *ref;
+	int plane;
+	int size;
+	int idx;
+	int w;
+	int h;
+	int i;
+
+	plane = instance->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
+	w = vsi->frame.uh.frame_width;
+	h = vsi->frame.uh.frame_height;
+	size = ALIGN(w, 64) * ALIGN(h, 64);
+
+	/* frame buffer */
+	vsi->fb.y.dma_addr = fb->base_y.dma_addr;
+	if (plane == 1)
+		vsi->fb.c.dma_addr = fb->base_y.dma_addr + size;
+	else
+		vsi->fb.c.dma_addr = fb->base_c.dma_addr;
+
+	/* reference buffers */
+	vq = v4l2_m2m_get_vq(instance->ctx->m2m_ctx,
+			     V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+	if (!vq)
+		return -EINVAL;
+
+	/* get current capture buffer */
+	vb = &v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx)->vb2_buf;
+	if (!vb)
+		return -EINVAL;
+
+	/* update internal buffer's width/height */
+	for (i = 0; i < vq->num_buffers; i++) {
+		if (vb == vq->bufs[i]) {
+			instance->dpb[i].width = w;
+			instance->dpb[i].height = h;
+			break;
+		}
+	}
+
+	/*
+	 * get buffer's width/height from instance
+	 * get buffer address from vb2buf
+	 */
+	for (i = 0; i < 3; i++) {
+		ref = &vsi->frame.ref[i];
+		idx = vb2_find_timestamp(vq, pfc->ref_idx[i], 0);
+		if (idx < 0) {
+			ref->frame_width = w;
+			ref->frame_height = h;
+			memset(&vsi->ref[i], 0, sizeof(vsi->ref[i]));
+		} else {
+			ref->frame_width = instance->dpb[idx].width;
+			ref->frame_height = instance->dpb[idx].height;
+			vb = vq->bufs[idx];
+			vsi->ref[i].y.dma_addr =
+				vb2_dma_contig_plane_dma_addr(vb, 0);
+			if (plane == 1)
+				vsi->ref[i].c.dma_addr =
+					vsi->ref[i].y.dma_addr + size;
+			else
+				vsi->ref[i].c.dma_addr =
+					vb2_dma_contig_plane_dma_addr(vb, 1);
+		}
+	}
+
+	return 0;
+}
+
+static int vdec_vp9_slice_setup_core(struct vdec_vp9_slice_instance *instance,
+				     struct vdec_fb *fb,
+				     struct vdec_lat_buf *lat_buf,
+				     struct vdec_vp9_slice_pfc *pfc)
+{
+	struct vdec_vp9_slice_vsi *vsi = &pfc->vsi;
+	int ret;
+
+	vdec_vp9_slice_setup_state(vsi);
+
+	ret = vdec_vp9_slice_setup_core_to_dst_buf(instance, lat_buf);
+	if (ret)
+		goto err;
+
+	ret = vdec_vp9_slice_setup_core_buffer(instance, pfc, vsi, fb, lat_buf);
+	if (ret)
+		goto err;
+
+	vdec_vp9_slice_setup_seg_buffer(instance, vsi, &instance->seg[1]);
+
+	return 0;
+
+err:
+	return ret;
+}
+
+static int vdec_vp9_slice_update_core(struct vdec_vp9_slice_instance *instance,
+				      struct vdec_lat_buf *lat_buf,
+				      struct vdec_vp9_slice_pfc *pfc)
+{
+	struct vdec_vp9_slice_vsi *vsi;
+
+	vsi = &pfc->vsi;
+	memcpy(&pfc->state[1], &vsi->state, sizeof(vsi->state));
+
+	mtk_vcodec_debug(instance, "Frame %u Y_CRC %08x %08x %08x %08x\n",
+			 pfc->seq,
+			 vsi->state.crc[0], vsi->state.crc[1],
+			 vsi->state.crc[2], vsi->state.crc[3]);
+	mtk_vcodec_debug(instance, "Frame %u C_CRC %08x %08x %08x %08x\n",
+			 pfc->seq,
+			 vsi->state.crc[4], vsi->state.crc[5],
+			 vsi->state.crc[6], vsi->state.crc[7]);
+
+	return 0;
+}
+
+static int vdec_vp9_slice_init(struct mtk_vcodec_ctx *ctx)
+{
+	struct vdec_vp9_slice_instance *instance;
+	struct vdec_vp9_slice_init_vsi *vsi;
+	int ret;
+
+	instance = kzalloc(sizeof(*instance), GFP_KERNEL);
+	if (!instance)
+		return -ENOMEM;
+
+	instance->ctx = ctx;
+	instance->vpu.id = SCP_IPI_VDEC_LAT;
+	instance->vpu.core_id = SCP_IPI_VDEC_CORE;
+	instance->vpu.ctx = ctx;
+	instance->vpu.codec_type = ctx->current_codec;
+
+	ret = vpu_dec_init(&instance->vpu);
+	if (ret) {
+		mtk_vcodec_err(instance, "failed to init vpu dec, ret %d\n", ret);
+		goto error_vpu_init;
+	}
+
+	/* init vsi and global flags */
+
+	vsi = instance->vpu.vsi;
+	if (!vsi) {
+		mtk_vcodec_err(instance, "failed to get VP9 vsi\n");
+		ret = -EINVAL;
+		goto error_vsi;
+	}
+	instance->init_vsi = vsi;
+	instance->core_vsi = mtk_vcodec_fw_map_dm_addr(ctx->dev->fw_handler,
+						       (u32)vsi->core_vsi);
+	if (!instance->core_vsi) {
+		mtk_vcodec_err(instance, "failed to get VP9 core vsi\n");
+		ret = -EINVAL;
+		goto error_vsi;
+	}
+
+	instance->irq = 1;
+
+	ret = vdec_vp9_slice_init_default_frame_ctx(instance);
+	if (ret)
+		goto error_default_frame_ctx;
+
+	ctx->drv_handle = instance;
+
+	return 0;
+
+error_default_frame_ctx:
+error_vsi:
+	vpu_dec_deinit(&instance->vpu);
+error_vpu_init:
+	kfree(instance);
+	return ret;
+}
+
+static void vdec_vp9_slice_deinit(void *h_vdec)
+{
+	struct vdec_vp9_slice_instance *instance = h_vdec;
+
+	if (!instance)
+		return;
+
+	vpu_dec_deinit(&instance->vpu);
+	vdec_vp9_slice_free_working_buffer(instance);
+	vdec_msg_queue_deinit(&instance->ctx->msg_queue, instance->ctx);
+	kfree(instance);
+}
+
+static int vdec_vp9_slice_flush(void *h_vdec, struct mtk_vcodec_mem *bs,
+				struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_vp9_slice_instance *instance = h_vdec;
+
+	mtk_vcodec_debug(instance, "flush ...\n");
+
+	vdec_msg_queue_wait_lat_buf_full(&instance->ctx->msg_queue);
+	return vpu_dec_reset(&instance->vpu);
+}
+
+static void vdec_vp9_slice_get_pic_info(struct vdec_vp9_slice_instance *instance)
+{
+	struct mtk_vcodec_ctx *ctx = instance->ctx;
+	unsigned int data[3];
+
+	mtk_vcodec_debug(instance, "w %u h %u\n",
+			 ctx->picinfo.pic_w, ctx->picinfo.pic_h);
+
+	data[0] = ctx->picinfo.pic_w;
+	data[1] = ctx->picinfo.pic_h;
+	data[2] = ctx->capture_fourcc;
+	vpu_dec_get_param(&instance->vpu, data, 3, GET_PARAM_PIC_INFO);
+
+	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
+	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);
+	ctx->picinfo.fb_sz[0] = instance->vpu.fb_sz[0];
+	ctx->picinfo.fb_sz[1] = instance->vpu.fb_sz[1];
+}
+
+static void vdec_vp9_slice_get_dpb_size(struct vdec_vp9_slice_instance *instance,
+					unsigned int *dpb_sz)
+{
+	/* refer to the VP9 specification */
+	*dpb_sz = 9;
+}
+
+static void vdec_vp9_slice_get_crop_info(struct vdec_vp9_slice_instance *instance,
+					 struct v4l2_rect *cr)
+{
+	struct mtk_vcodec_ctx *ctx = instance->ctx;
+
+	cr->left = 0;
+	cr->top = 0;
+	cr->width = ctx->picinfo.pic_w;
+	cr->height = ctx->picinfo.pic_h;
+
+	mtk_vcodec_debug(instance, "l=%d, t=%d, w=%d, h=%d\n",
+			 cr->left, cr->top, cr->width, cr->height);
+}
+
+static int vdec_vp9_slice_get_param(void *h_vdec, enum vdec_get_param_type type, void *out)
+{
+	struct vdec_vp9_slice_instance *instance = h_vdec;
+
+	switch (type) {
+	case GET_PARAM_PIC_INFO:
+		vdec_vp9_slice_get_pic_info(instance);
+		break;
+	case GET_PARAM_DPB_SIZE:
+		vdec_vp9_slice_get_dpb_size(instance, out);
+		break;
+	case GET_PARAM_CROP_INFO:
+		vdec_vp9_slice_get_crop_info(instance, out);
+		break;
+	default:
+		mtk_vcodec_err(instance, "invalid get parameter type=%d\n",
+			       type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int vdec_vp9_slice_lat_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
+				     struct vdec_fb *fb, bool *res_chg)
+{
+	struct vdec_vp9_slice_instance *instance = h_vdec;
+	struct vdec_lat_buf *lat_buf;
+	struct vdec_vp9_slice_pfc *pfc;
+	struct vdec_vp9_slice_vsi *vsi;
+	struct mtk_vcodec_ctx *ctx;
+	int ret;
+
+	if (!instance || !instance->ctx)
+		return -EINVAL;
+	ctx = instance->ctx;
+
+	/* init msgQ for the first time */
+	if (vdec_msg_queue_init(&ctx->msg_queue, ctx,
+				vdec_vp9_slice_core_decode,
+				sizeof(*pfc)))
+		return -ENOMEM;
+
+	/* bs NULL means flush decoder */
+	if (!bs)
+		return vdec_vp9_slice_flush(h_vdec, bs, fb, res_chg);
+
+	lat_buf = vdec_msg_queue_dqbuf(&instance->ctx->msg_queue.lat_ctx);
+	if (!lat_buf) {
+		mtk_vcodec_err(instance, "Failed to get VP9 lat buf\n");
+		return -EBUSY;
+	}
+	pfc = (struct vdec_vp9_slice_pfc *)lat_buf->private_data;
+	if (!pfc)
+		return -EINVAL;
+	vsi = &pfc->vsi;
+
+	ret = vdec_vp9_slice_setup_lat(instance, bs, lat_buf, pfc);
+	if (ret) {
+		mtk_vcodec_err(instance, "Failed to setup VP9 lat ret %d\n", ret);
+		return ret;
+	}
+	vdec_vp9_slice_vsi_to_remote(vsi, instance->vsi);
+
+	ret = vpu_dec_start(&instance->vpu, 0, 0);
+	if (ret) {
+		mtk_vcodec_err(instance, "Failed to dec VP9 ret %d\n", ret);
+		return ret;
+	}
+
+	if (instance->irq) {
+		ret = mtk_vcodec_wait_for_done_ctx(ctx, MTK_INST_IRQ_RECEIVED,
+						   WAIT_INTR_TIMEOUT_MS, MTK_VDEC_LAT0);
+		/* update remote vsi if decode timeout */
+		if (ret) {
+			mtk_vcodec_err(instance, "VP9 decode timeout %d\n", ret);
+			writel(1, &instance->vsi->state.timeout);
+		}
+		vpu_dec_end(&instance->vpu);
+	}
+
+	vdec_vp9_slice_vsi_from_remote(vsi, instance->vsi, 0);
+	ret = vdec_vp9_slice_update_lat(instance, lat_buf, pfc);
+
+	/* LAT trans full, no more UBE or decode timeout */
+	if (ret) {
+		mtk_vcodec_err(instance, "VP9 decode error: %d\n", ret);
+		return ret;
+	}
+
+	mtk_vcodec_debug(instance, "lat dma 1 0x%llx 0x%llx\n",
+			 pfc->vsi.trans.dma_addr, pfc->vsi.trans.dma_addr_end);
+
+	vdec_msg_queue_update_ube_wptr(&ctx->msg_queue,
+				       vsi->trans.dma_addr_end +
+				       ctx->msg_queue.wdma_addr.dma_addr);
+	vdec_msg_queue_qbuf(&ctx->dev->msg_queue_core_ctx, lat_buf);
+
+	return 0;
+}
+
+static int vdec_vp9_slice_core_decode(struct vdec_lat_buf *lat_buf)
+{
+	struct vdec_vp9_slice_instance *instance;
+	struct vdec_vp9_slice_pfc *pfc;
+	struct mtk_vcodec_ctx *ctx = NULL;
+	struct vdec_fb *fb = NULL;
+	int ret = -EINVAL;
+
+	if (!lat_buf)
+		goto err;
+
+	pfc = lat_buf->private_data;
+	ctx = lat_buf->ctx;
+	if (!pfc || !ctx)
+		goto err;
+
+	instance = ctx->drv_handle;
+	if (!instance)
+		goto err;
+
+	fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
+	if (!fb) {
+		ret = -EBUSY;
+		goto err;
+	}
+
+	ret = vdec_vp9_slice_setup_core(instance, fb, lat_buf, pfc);
+	if (ret) {
+		mtk_vcodec_err(instance, "vdec_vp9_slice_setup_core\n");
+		goto err;
+	}
+	vdec_vp9_slice_vsi_to_remote(&pfc->vsi, instance->core_vsi);
+
+	ret = vpu_dec_core(&instance->vpu);
+	if (ret) {
+		mtk_vcodec_err(instance, "vpu_dec_core\n");
+		goto err;
+	}
+
+	if (instance->irq) {
+		ret = mtk_vcodec_wait_for_done_ctx(ctx, MTK_INST_IRQ_RECEIVED,
+						   WAIT_INTR_TIMEOUT_MS, MTK_VDEC_CORE);
+		/* update remote vsi if decode timeout */
+		if (ret) {
+			mtk_vcodec_err(instance, "VP9 core timeout\n");
+			writel(1, &instance->core_vsi->state.timeout);
+		}
+		vpu_dec_core_end(&instance->vpu);
+	}
+
+	vdec_vp9_slice_vsi_from_remote(&pfc->vsi, instance->core_vsi, 1);
+	ret = vdec_vp9_slice_update_core(instance, lat_buf, pfc);
+	if (ret) {
+		mtk_vcodec_err(instance, "vdec_vp9_slice_update_core\n");
+		goto err;
+	}
+
+	pfc->vsi.trans.dma_addr_end += ctx->msg_queue.wdma_addr.dma_addr;
+	mtk_vcodec_debug(instance, "core dma_addr_end 0x%llx\n", pfc->vsi.trans.dma_addr_end);
+	vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, pfc->vsi.trans.dma_addr_end);
+	ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, 0);
+
+	return 0;
+
+err:
+	if (ctx && pfc) {
+		/* always update read pointer */
+		vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, pfc->vsi.trans.dma_addr_end);
+
+		if (fb)
+			ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, 1);
+	}
+	return ret;
+}
+
+const struct vdec_common_if vdec_vp9_slice_lat_if = {
+	.init		= vdec_vp9_slice_init,
+	.decode		= vdec_vp9_slice_lat_decode,
+	.get_param	= vdec_vp9_slice_get_param,
+	.deinit		= vdec_vp9_slice_deinit,
+};
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
index 9db9a57da2c1..2d3a45781359 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
@@ -44,6 +44,10 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
 		ctx->dec_if = &vdec_vp9_if;
 		ctx->hw_id = MTK_VDEC_CORE;
 		break;
+	case V4L2_PIX_FMT_VP9_FRAME:
+		ctx->dec_if = &vdec_vp9_slice_lat_if;
+		ctx->hw_id = MTK_VDEC_LAT0;
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index e3adf8f36342..e383a04db7b8 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -60,6 +60,7 @@ extern const struct vdec_common_if vdec_h264_slice_lat_if;
 extern const struct vdec_common_if vdec_vp8_if;
 extern const struct vdec_common_if vdec_vp8_slice_if;
 extern const struct vdec_common_if vdec_vp9_if;
+extern const struct vdec_common_if vdec_vp9_slice_lat_if;
 
 /**
  * vdec_if_init() - initialize decode driver
-- 
2.25.1

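A note for readers checking the arithmetic in vdec_vp9_slice_merge_prob() above: the sketch below is a minimal user-space restatement of the merge prob process, following the formulas that the patch comments quote from section 8.4.1. The names clip3() and merge_prob_ref() are hypothetical and not part of the driver. count_sat and max_update_factor stand for the cs and uf parameters, and max_update_factor is assumed to be at most 256.

```c
/* Clip3(lo, hi, v) as used in the formulas quoted in the patch comments. */
static unsigned int clip3(unsigned int lo, unsigned int hi, unsigned int v)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Hypothetical reference helper (not driver code): blend the previous
 * probability `pre` with the probability estimated from the counts
 * ct0/ct1, weighted by how many samples were observed.
 */
static unsigned char merge_prob_ref(unsigned char pre, unsigned int ct0,
				    unsigned int ct1, unsigned int count_sat,
				    unsigned int max_update_factor)
{
	unsigned int den = ct0 + ct1;
	unsigned int prob, count, factor;

	/* No samples: count = 0, so factor = 0 and pre is kept unchanged. */
	if (!den)
		return pre;

	/* Estimated probability that the boolean decodes as 0. */
	prob = clip3(1, 255, (ct0 * 256 + (den >> 1)) / den);
	/* count = Min(ct0 + ct1, countSat) */
	count = den < count_sat ? den : count_sat;
	/* factor = maxUpdateFactor * count / countSat */
	factor = max_update_factor * count / count_sat;

	/* Round2(pre * (256 - factor) + prob * factor, 8) */
	return (unsigned char)((pre * (256 - factor) + prob * factor + 128) >> 8);
}
```

With count_sat = 20 and max_update_factor = 128, the values vdec_vp9_slice_adapt_prob() passes, merge_prob_ref(200, 1, 19, 20, 128) works out to 107: prob is 13, factor is 128, and (200 * 128 + 13 * 128 + 128) >> 8 = 107.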

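The tile walk in vdec_vp9_slice_setup_tile_buffer() can be sketched in isolation as follows. This is a simplified user-space illustration, not the driver code: parse_tile_sizes() is a hypothetical helper that treats the tiles as a flat list of `count` entries and only records their sizes, leaving out the DMA address and mi_rows/mi_cols packing. As in the patch, every tile except the last is preceded by a 4-byte big-endian size, and the last tile takes whatever remains.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: walk `count` tiles stored in buf[0..len).
 * Returns 0 on success, -1 if the data is truncated.
 */
static int parse_tile_sizes(const uint8_t *buf, size_t len,
			    unsigned int count, uint32_t *sizes)
{
	const uint8_t *pos = buf;
	const uint8_t *end = buf + len;
	unsigned int i;

	for (i = 0; i < count; i++) {
		uint32_t size;

		if (i == count - 1) {
			/* The last tile has no size prefix. */
			size = (uint32_t)(end - pos);
		} else {
			/* Need 4 bytes for the big-endian size field. */
			if ((size_t)(end - pos) < 4)
				return -1;
			size = ((uint32_t)pos[0] << 24) |
			       ((uint32_t)pos[1] << 16) |
			       ((uint32_t)pos[2] << 8) | pos[3];
			pos += 4;
			/* The tile payload must fit in the remaining data. */
			if ((size_t)(end - pos) < size)
				return -1;
		}
		sizes[i] = size;
		pos += size;
	}

	return 0;
}
```

The explicit uint32_t casts before the shifts are a choice made for this sketch, so that a leading byte of 0x80 or above is never shifted as a signed int. For a 9-byte buffer holding a size field of 2, two payload bytes and three trailing bytes, the helper reports sizes 2 and 3.
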

* Re: [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers
  2022-02-23  3:39 ` [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers Yunfei Dong
@ 2022-02-25  9:23   ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 36+ messages in thread
From: AngeloGioacchino Del Regno @ 2022-02-25  9:23 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	Benjamin Gaignard, Tiffany Lin, Andrew-CT Chen,
	Mauro Carvalho Chehab, Rob Herring, Matthias Brugger,
	Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On 23/02/22 04:39, Yunfei Dong wrote:
> Lock, power and clock are highly coupled operations. Add vdec
> enable/disable hardware helpers and use them.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> Reviewed-by: Tzung-Bi Shih <tzungbi@google.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>


* Re: [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-02-23  3:39 ` [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability Yunfei Dong
@ 2022-02-25  9:23   ` AngeloGioacchino Del Regno
  2022-02-28 21:29   ` Nicolas Dufresne
  1 sibling, 0 replies; 36+ messages in thread
From: AngeloGioacchino Del Regno @ 2022-02-25  9:23 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	Benjamin Gaignard, Tiffany Lin, Andrew-CT Chen,
	Mauro Carvalho Chehab, Rob Herring, Matthias Brugger,
	Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On 23/02/22 04:39, Yunfei Dong wrote:
> The maximum supported resolution differs between platforms (2K or
> 4K), so read it from dec_capability.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> Reviewed-by: Tzung-Bi Shih <tzungbi@google.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>


* Re: [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes
  2022-02-23  3:40 ` [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes Yunfei Dong
@ 2022-02-25  9:24   ` AngeloGioacchino Del Regno
  2022-03-01 14:34   ` Nicolas Dufresne
  1 sibling, 0 replies; 36+ messages in thread
From: AngeloGioacchino Del Regno @ 2022-02-25  9:24 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	Benjamin Gaignard, Tiffany Lin, Andrew-CT Chen,
	Mauro Carvalho Chehab, Rob Herring, Matthias Brugger,
	Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On 23/02/22 04:40, Yunfei Dong wrote:
> The supported output and capture format types for mt8192 differ from
> those for mt8183, so get the format types from the decoder capability.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>



* Re: [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C
  2022-02-23  3:40 ` [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C Yunfei Dong
@ 2022-02-25  9:24   ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 36+ messages in thread
From: AngeloGioacchino Del Regno @ 2022-02-25  9:24 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	Benjamin Gaignard, Tiffany Lin, Andrew-CT Chen,
	Mauro Carvalho Chehab, Rob Herring, Matthias Brugger,
	Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On 23/02/22 04:40, Yunfei Dong wrote:
> Needs to use mediatek compressed mode for mt8192 decoder.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>



* Re: [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type
  2022-02-23  3:40 ` [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type Yunfei Dong
@ 2022-02-25  9:24   ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 36+ messages in thread
From: AngeloGioacchino Del Regno @ 2022-02-25  9:24 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	Benjamin Gaignard, Tiffany Lin, Andrew-CT Chen,
	Mauro Carvalho Chehab, Rob Herring, Matthias Brugger,
	Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On 23/02/22 04:40, Yunfei Dong wrote:
> Capture queue format type is difference for different platform,
> need to calculate capture buffer size according to capture queue
> format type in scp.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>


This change is OK, but the commit message should state that this is
preparation for the new stateless H264 decoding driver.
Also, I suggest reordering the commits so that this one
goes between "Extract H264 common code" and
"support stateless H.264 decoding for mt8192", as the latter is
the actual user of this change.


Anyway, this is my commit message proposal:

The capture queue format type may differ depending on platform:
for stateless decoder drivers, we need to calculate the capture buffer
size according to the capture queue format type in SCP.

As a preparation for introducing drivers for stateless decoding, save
the current capture queue type on a per vcodec context basis.

After fixing,
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>


* Re: [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-02-23  3:39 ` [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability Yunfei Dong
  2022-02-25  9:23   ` AngeloGioacchino Del Regno
@ 2022-02-28 21:29   ` Nicolas Dufresne
  2022-03-02  1:47     ` yunfei.dong
  2022-06-17  6:46     ` Chen-Yu Tsai
  1 sibling, 2 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-02-28 21:29 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Hi Yunfei,

This patch does not work unless userland calls enum_framesizes, which is
completely optional. See the comment and suggestion below.

On Wednesday, 23 February 2022 at 11:39 +0800, Yunfei Dong wrote:
> Supported max resolution for different platforms are not the same: 2K
> or 4K, getting it according to dec_capability.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> Reviewed-by: Tzung-Bi Shih<tzungbi@google.com>
> ---
>  .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 29 +++++++++++--------
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 +++
>  2 files changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> index 130ecef2e766..304f5afbd419 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> @@ -152,13 +152,15 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
>  	q_data->coded_height = DFT_CFG_HEIGHT;
>  	q_data->fmt = ctx->dev->vdec_pdata->default_cap_fmt;
>  	q_data->field = V4L2_FIELD_NONE;
> +	ctx->max_width = MTK_VDEC_MAX_W;
> +	ctx->max_height = MTK_VDEC_MAX_H;
>  
>  	v4l_bound_align_image(&q_data->coded_width,
>  				MTK_VDEC_MIN_W,
> -				MTK_VDEC_MAX_W, 4,
> +				ctx->max_width, 4,
>  				&q_data->coded_height,
>  				MTK_VDEC_MIN_H,
> -				MTK_VDEC_MAX_H, 5, 6);
> +				ctx->max_height, 5, 6);
>  
>  	q_data->sizeimage[0] = q_data->coded_width * q_data->coded_height;
>  	q_data->bytesperline[0] = q_data->coded_width;
> @@ -217,7 +219,7 @@ static int vidioc_vdec_subscribe_evt(struct v4l2_fh *fh,
>  	}
>  }
>  
> -static int vidioc_try_fmt(struct v4l2_format *f,
> +static int vidioc_try_fmt(struct mtk_vcodec_ctx *ctx, struct v4l2_format *f,
>  			  const struct mtk_video_fmt *fmt)
>  {
>  	struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
> @@ -225,9 +227,9 @@ static int vidioc_try_fmt(struct v4l2_format *f,
>  	pix_fmt_mp->field = V4L2_FIELD_NONE;
>  
>  	pix_fmt_mp->width =
> -		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W, MTK_VDEC_MAX_W);
> +		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W, ctx->max_width);
>  	pix_fmt_mp->height =
> -		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H, MTK_VDEC_MAX_H);
> +		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H, ctx->max_height);
>  
>  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
>  		pix_fmt_mp->num_planes = 1;
> @@ -245,16 +247,16 @@ static int vidioc_try_fmt(struct v4l2_format *f,
>  		tmp_h = pix_fmt_mp->height;
>  		v4l_bound_align_image(&pix_fmt_mp->width,
>  					MTK_VDEC_MIN_W,
> -					MTK_VDEC_MAX_W, 6,
> +					ctx->max_width, 6,
>  					&pix_fmt_mp->height,
>  					MTK_VDEC_MIN_H,
> -					MTK_VDEC_MAX_H, 6, 9);
> +					ctx->max_height, 6, 9);
>  
>  		if (pix_fmt_mp->width < tmp_w &&
> -			(pix_fmt_mp->width + 64) <= MTK_VDEC_MAX_W)
> +			(pix_fmt_mp->width + 64) <= ctx->max_width)
>  			pix_fmt_mp->width += 64;
>  		if (pix_fmt_mp->height < tmp_h &&
> -			(pix_fmt_mp->height + 64) <= MTK_VDEC_MAX_H)
> +			(pix_fmt_mp->height + 64) <= ctx->max_height)
>  			pix_fmt_mp->height += 64;
>  
>  		mtk_v4l2_debug(0,
> @@ -294,7 +296,7 @@ static int vidioc_try_fmt_vid_cap_mplane(struct file *file, void *priv,
>  		fmt = mtk_vdec_find_format(f, dec_pdata);
>  	}
>  
> -	return vidioc_try_fmt(f, fmt);
> +	return vidioc_try_fmt(ctx, f, fmt);
>  }
>  
>  static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
> @@ -317,7 +319,7 @@ static int vidioc_try_fmt_vid_out_mplane(struct file *file, void *priv,
>  		return -EINVAL;
>  	}
>  
> -	return vidioc_try_fmt(f, fmt);
> +	return vidioc_try_fmt(ctx, f, fmt);
>  }
>  
>  static int vidioc_vdec_g_selection(struct file *file, void *priv,
> @@ -445,7 +447,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
>  		return -EINVAL;
>  
>  	q_data->fmt = fmt;
> -	vidioc_try_fmt(f, q_data->fmt);
> +	vidioc_try_fmt(ctx, f, q_data->fmt);
>  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
>  		q_data->sizeimage[0] = pix_mp->plane_fmt[0].sizeimage;
>  		q_data->coded_width = pix_mp->width;
> @@ -545,6 +547,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
>  				fsize->stepwise.min_height,
>  				fsize->stepwise.max_height,
>  				fsize->stepwise.step_height);
> +
> +		ctx->max_width = fsize->stepwise.max_width;
> +		ctx->max_height = fsize->stepwise.max_height;

The spec does not require calling enum_framesizes, so changing the maximum
here is incorrect (and fails with GStreamer). If userland never enumerates
the framesizes, the resolution gets limited to 1080p.

As this only depends on the OUTPUT format and the device being open()
(the conditions being that dec_capability is set and the OUTPUT format is
known and is not VP8), you could initialize the ctx max inside s_fmt(OUTPUT)
instead, which is a mandatory call. I have tested this change to verify this:


diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 044e3dfbdd8c..3e7c571526a4 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -484,6 +484,14 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
 	if (fmt == NULL)
 		return -EINVAL;
 
+	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE &&
+	    !(ctx->dev->dec_capability & VCODEC_CAPABILITY_4K_DISABLED) &&
+	    fmt->fourcc != V4L2_PIX_FMT_VP8_FRAME) {
+		mtk_v4l2_debug(3, "4K is enabled");
+		ctx->max_width = VCODEC_DEC_4K_CODED_WIDTH;
+		ctx->max_height = VCODEC_DEC_4K_CODED_HEIGHT;
+	}
+
 	q_data->fmt = fmt;
 	vidioc_try_fmt(ctx, f, q_data->fmt);
 	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
@@ -574,15 +582,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 
 		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
 		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
-		if (!(ctx->dev->dec_capability &
-				VCODEC_CAPABILITY_4K_DISABLED) &&
-				fsize->pixel_format != V4L2_PIX_FMT_VP8_FRAME) {
-			mtk_v4l2_debug(3, "4K is enabled");
-			fsize->stepwise.max_width =
-					VCODEC_DEC_4K_CODED_WIDTH;
-			fsize->stepwise.max_height =
-					VCODEC_DEC_4K_CODED_HEIGHT;
-		}
+		fsize->stepwise.max_width = ctx->max_width;
+		fsize->stepwise.max_height = ctx->max_height;
+
 		mtk_v4l2_debug(1, "%x, %d %d %d %d %d %d",
 				ctx->dev->dec_capability,
 				fsize->stepwise.min_width,
@@ -592,8 +594,6 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
 				fsize->stepwise.max_height,
 				fsize->stepwise.step_height);
 
-		ctx->max_width = fsize->stepwise.max_width;
-		ctx->max_height = fsize->stepwise.max_height;
 		return 0;
 	}
 

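To illustrate the ordering issue outside the driver, here is a minimal userspace sketch of the idea in the diff above: the per-context maximum is raised in the mandatory s_fmt(OUTPUT) step, so a client that never enumerates frame sizes still gets 4K. All sketch_* names are hypothetical, and the numeric limits and capability bit are illustrative stand-ins chosen for this example, not values taken from the driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins (assumptions); the real limits live in the driver. */
#define SKETCH_MIN_W 64u
#define SKETCH_MIN_H 64u
#define SKETCH_FHD_W 2048u
#define SKETCH_FHD_H 1088u
#define SKETCH_4K_W  4096u
#define SKETCH_4K_H  2304u
#define SKETCH_CAP_4K_DISABLED 0x10u

/* Hypothetical, reduced stand-in for the per-open decoder context. */
struct sketch_ctx {
	uint32_t dec_capability;
	uint32_t max_width;
	uint32_t max_height;
};

static uint32_t sketch_clamp(uint32_t v, uint32_t lo, uint32_t hi)
{
	assert(lo <= hi);
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Like the default-params step: every context starts at the FHD limit. */
static void sketch_open(struct sketch_ctx *ctx, uint32_t dec_capability)
{
	ctx->dec_capability = dec_capability;
	ctx->max_width = SKETCH_FHD_W;
	ctx->max_height = SKETCH_FHD_H;
}

/*
 * Mandatory step: raise the limit here instead of in the optional
 * frame size enumeration, then clamp the requested size against it.
 */
static void sketch_s_fmt_output(struct sketch_ctx *ctx, bool is_vp8,
				uint32_t *width, uint32_t *height)
{
	if (!(ctx->dec_capability & SKETCH_CAP_4K_DISABLED) && !is_vp8) {
		ctx->max_width = SKETCH_4K_W;
		ctx->max_height = SKETCH_4K_H;
	}
	*width = sketch_clamp(*width, SKETCH_MIN_W, ctx->max_width);
	*height = sketch_clamp(*height, SKETCH_MIN_H, ctx->max_height);
}
```

With this ordering, a 3840x2160 request on a 4K-capable context passes through unchanged without any enumeration call, while a context with the 4K-disabled bit set (or a VP8 OUTPUT format) stays clamped to the FHD limit.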

>  		return 0;
>  	}
>  
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index bb7b8e914d24..6d27e4d41ede 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -284,6 +284,8 @@ struct vdec_pic_info {
>   *	  mtk_video_dec_buf.
>   * @hw_id: hardware index used to identify different hardware.
>   *
> + * @max_width: hardware supported max width
> + * @max_height: hardware supported max height
>   * @msg_queue: msg queue used to store lat buffer information.
>   */
>  struct mtk_vcodec_ctx {
> @@ -329,6 +331,8 @@ struct mtk_vcodec_ctx {
>  	struct mutex lock;
>  	int hw_id;
>  
> +	unsigned int max_width;
> +	unsigned int max_height;
>  	struct vdec_msg_queue msg_queue;
>  };
>  



* Re: [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes
  2022-02-23  3:40 ` [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes Yunfei Dong
  2022-02-25  9:24   ` AngeloGioacchino Del Regno
@ 2022-03-01 14:34   ` Nicolas Dufresne
  2022-03-04  7:27     ` yunfei.dong
  1 sibling, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 14:34 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> Supported output and capture format types for mt8192 are different
> with mt8183. Needs to get format types according to decoder capability.

This patch is both refactoring and changing the behaviour. Can you please split
the non-functional changes from the functional ones? This ensures we can proceed
with a good review of the functional changes.

regards,
Nicolas

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |   8 +-
>  .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |  13 +-
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 117 +++++++++++++-----
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  13 +-
>  4 files changed, 107 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> index 304f5afbd419..bae43938ee37 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> @@ -26,7 +26,7 @@ mtk_vdec_find_format(struct v4l2_format *f,
>  	const struct mtk_video_fmt *fmt;
>  	unsigned int k;
>  
> -	for (k = 0; k < dec_pdata->num_formats; k++) {
> +	for (k = 0; k < *dec_pdata->num_formats; k++) {
>  		fmt = &dec_pdata->vdec_formats[k];
>  		if (fmt->fourcc == f->fmt.pix_mp.pixelformat)
>  			return fmt;
> @@ -525,7 +525,7 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
>  	if (fsize->index != 0)
>  		return -EINVAL;
>  
> -	for (i = 0; i < dec_pdata->num_framesizes; ++i) {
> +	for (i = 0; i < *dec_pdata->num_framesizes; ++i) {
>  		if (fsize->pixel_format != dec_pdata->vdec_framesizes[i].fourcc)
>  			continue;
>  
> @@ -564,7 +564,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
>  	const struct mtk_video_fmt *fmt;
>  	int i, j = 0;
>  
> -	for (i = 0; i < dec_pdata->num_formats; i++) {
> +	for (i = 0; i < *dec_pdata->num_formats; i++) {
>  		if (output_queue &&
>  		    dec_pdata->vdec_formats[i].type != MTK_FMT_DEC)
>  			continue;
> @@ -577,7 +577,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void *priv,
>  		++j;
>  	}
>  
> -	if (i == dec_pdata->num_formats)
> +	if (i == *dec_pdata->num_formats)
>  		return -EINVAL;
>  
>  	fmt = &dec_pdata->vdec_formats[i];
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
> index 7966c132be8f..3f33beb9c551 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
> @@ -37,7 +37,9 @@ static const struct mtk_video_fmt mtk_video_formats[] = {
>  	},
>  };
>  
> -#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> +static const unsigned int num_supported_formats =
> +	ARRAY_SIZE(mtk_video_formats);
> +
>  #define DEFAULT_OUT_FMT_IDX 0
>  #define DEFAULT_CAP_FMT_IDX 3
>  
> @@ -59,7 +61,8 @@ static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
>  	},
>  };
>  
> -#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> +static const unsigned int num_supported_framesize =
> +	ARRAY_SIZE(mtk_vdec_framesizes);
>  
>  /*
>   * This function tries to clean all display buffers, the buffers will return
> @@ -235,7 +238,7 @@ static void mtk_vdec_update_fmt(struct mtk_vcodec_ctx *ctx,
>  	unsigned int k;
>  
>  	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
> -	for (k = 0; k < NUM_FORMATS; k++) {
> +	for (k = 0; k < num_supported_formats; k++) {
>  		fmt = &mtk_video_formats[k];
>  		if (fmt->fourcc == pixelformat) {
>  			mtk_v4l2_debug(1, "Update cap fourcc(%d -> %d)",
> @@ -617,11 +620,11 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8173_pdata = {
>  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
>  	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
>  	.vdec_formats = mtk_video_formats,
> -	.num_formats = NUM_FORMATS,
> +	.num_formats = &num_supported_formats,
>  	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
>  	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
>  	.vdec_framesizes = mtk_vdec_framesizes,
> -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> +	.num_framesizes = &num_supported_framesize,
>  	.worker = mtk_vdec_worker,
>  	.flush_decoder = mtk_vdec_flush_decoder,
>  	.is_subdev_supported = false,
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> index 6d481410bf89..e51d935bd21d 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> @@ -81,33 +81,23 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
>  
>  #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
>  
> -static const struct mtk_video_fmt mtk_video_formats[] = {
> -	{
> -		.fourcc = V4L2_PIX_FMT_H264_SLICE,
> -		.type = MTK_FMT_DEC,
> -		.num_planes = 1,
> -	},
> -	{
> -		.fourcc = V4L2_PIX_FMT_MM21,
> -		.type = MTK_FMT_FRAME,
> -		.num_planes = 2,
> -	},
> +static struct mtk_video_fmt mtk_video_formats[2];
> +static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
> +
> +static struct mtk_video_fmt default_out_format;
> +static struct mtk_video_fmt default_cap_format;
> +static unsigned int num_formats;
> +static unsigned int num_framesizes;
> +
> +static struct v4l2_frmsize_stepwise stepwise_fhd = {
> +	.min_width = MTK_VDEC_MIN_W,
> +	.max_width = MTK_VDEC_MAX_W,
> +	.step_width = 16,
> +	.min_height = MTK_VDEC_MIN_H,
> +	.max_height = MTK_VDEC_MAX_H,
> +	.step_height = 16
>  };
>  
> -#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> -#define DEFAULT_OUT_FMT_IDX    0
> -#define DEFAULT_CAP_FMT_IDX    1
> -
> -static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> -	{
> -		.fourcc	= V4L2_PIX_FMT_H264_SLICE,
> -		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> -				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
> -	},
> -};
> -
> -#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> -
>  static void mtk_vdec_stateless_out_to_done(struct mtk_vcodec_ctx *ctx,
>  					   struct mtk_vcodec_mem *bs, int error)
>  {
> @@ -350,6 +340,62 @@ const struct media_device_ops mtk_vcodec_media_ops = {
>  	.req_queue	= v4l2_m2m_request_queue,
>  };
>  
> +static void mtk_vcodec_add_formats(unsigned int fourcc,
> +				   struct mtk_vcodec_ctx *ctx)
> +{
> +	struct mtk_vcodec_dev *dev = ctx->dev;
> +	const struct mtk_vcodec_dec_pdata *pdata = dev->vdec_pdata;
> +	int count_formats = *pdata->num_formats;
> +	int count_framesizes = *pdata->num_framesizes;
> +
> +	switch (fourcc) {
> +	case V4L2_PIX_FMT_H264_SLICE:
> +		mtk_video_formats[count_formats].fourcc = fourcc;
> +		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
> +		mtk_video_formats[count_formats].num_planes = 1;
> +
> +		mtk_vdec_framesizes[count_framesizes].fourcc = fourcc;
> +		mtk_vdec_framesizes[count_framesizes].stepwise = stepwise_fhd;
> +		num_framesizes++;
> +		break;
> +	case V4L2_PIX_FMT_MM21:
> +		mtk_video_formats[count_formats].fourcc = fourcc;
> +		mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
> +		mtk_video_formats[count_formats].num_planes = 2;
> +		break;
> +	default:
> +		mtk_v4l2_err("Can not add unsupported format type");
> +		return;
> +	}
> +
> +	num_formats++;
> +	mtk_v4l2_debug(3, "num_formats: %d num_frames:%d dec_capability: 0x%x",
> +		       count_formats, count_framesizes, ctx->dev->dec_capability);
> +}
> +
> +static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
> +{
> +	int cap_format_count = 0, out_format_count = 0;
> +
> +	if (num_formats && num_framesizes)
> +		return;
> +
> +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_MM21) {
> +		mtk_vcodec_add_formats(V4L2_PIX_FMT_MM21, ctx);
> +		cap_format_count++;
> +	}
> +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_H264_SLICE) {
> +		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
> +		out_format_count++;
> +	}
> +
> +	if (cap_format_count)
> +		default_cap_format = mtk_video_formats[cap_format_count - 1];
> +	if (out_format_count)
> +		default_out_format =
> +			mtk_video_formats[cap_format_count + out_format_count - 1];
> +}
> +
>  static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
>  {
>  	struct vb2_queue *src_vq;
> @@ -360,6 +406,11 @@ static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
>  	if (ctx->dev->vdec_pdata->hw_arch != MTK_VDEC_PURE_SINGLE_CORE)
>  		v4l2_m2m_set_dst_buffered(ctx->m2m_ctx, 1);
>  
> +	if (!ctx->dev->vdec_pdata->is_subdev_supported)
> +		ctx->dev->dec_capability |=
> +			MTK_VDEC_FORMAT_H264_SLICE | MTK_VDEC_FORMAT_MM21;
> +	mtk_vcodec_get_supported_formats(ctx);
> +
>  	/* Support request api for output plane */
>  	src_vq->supports_requests = true;
>  	src_vq->requires_requests = true;
> @@ -393,11 +444,11 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
>  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
>  	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
>  	.vdec_formats = mtk_video_formats,
> -	.num_formats = NUM_FORMATS,
> -	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
> -	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
> +	.num_formats = &num_formats,
> +	.default_out_fmt = &default_out_format,
> +	.default_cap_fmt = &default_cap_format,
>  	.vdec_framesizes = mtk_vdec_framesizes,
> -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> +	.num_framesizes = &num_framesizes,
>  	.uses_stateless_api = true,
>  	.worker = mtk_vdec_worker,
>  	.flush_decoder = mtk_vdec_flush_decoder,
> @@ -413,11 +464,11 @@ const struct mtk_vcodec_dec_pdata mtk_lat_sig_core_pdata = {
>  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
>  	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
>  	.vdec_formats = mtk_video_formats,
> -	.num_formats = NUM_FORMATS,
> -	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
> -	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
> +	.num_formats = &num_formats,
> +	.default_out_fmt = &default_out_format,
> +	.default_cap_fmt = &default_cap_format,
>  	.vdec_framesizes = mtk_vdec_framesizes,
> -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> +	.num_framesizes = &num_framesizes,
>  	.uses_stateless_api = true,
>  	.worker = mtk_vdec_worker,
>  	.flush_decoder = mtk_vdec_flush_decoder,
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index 9fcaf69549dd..270c73c05285 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -344,6 +344,15 @@ enum mtk_vdec_hw_arch {
>  	MTK_VDEC_LAT_SINGLE_CORE,
>  };
>  
> +/*
> + * struct mtk_vdec_format_types - Structure used to get supported
> + *		  format types according to decoder capability
> + */
> +enum mtk_vdec_format_types {
> +	MTK_VDEC_FORMAT_MM21 = 0x20,
> +	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
> +};
> +
>  /**
>   * struct mtk_vcodec_dec_pdata - compatible data for each IC
>   * @init_vdec_params: init vdec params
> @@ -379,12 +388,12 @@ struct mtk_vcodec_dec_pdata {
>  	struct vb2_ops *vdec_vb2_ops;
>  
>  	const struct mtk_video_fmt *vdec_formats;
> -	const int num_formats;
> +	const int *num_formats;
>  	const struct mtk_video_fmt *default_out_fmt;
>  	const struct mtk_video_fmt *default_cap_fmt;
>  
>  	const struct mtk_codec_framesizes *vdec_framesizes;
> -	const int num_framesizes;
> +	const int *num_framesizes;
>  
>  	enum mtk_vdec_hw_arch hw_arch;
>  



* Re: [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp
  2022-02-23  3:39 ` [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp Yunfei Dong
@ 2022-03-01 14:44   ` Nicolas Dufresne
  2022-03-02  2:26     ` yunfei.dong
  0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 14:44 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Thanks for your patch, though it could be improved; see the comment below.

On Wednesday, 23 February 2022 at 11:39 +0800, Yunfei Dong wrote:
> Different capture buffer format has different buffer size, need to get
> real buffer size according to buffer type from scp.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  .../media/platform/mtk-vcodec/vdec_ipi_msg.h  | 36 ++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 49 +++++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_vpu_if.h   | 15 ++++++
>  3 files changed, 100 insertions(+)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> index bf54d6d9a857..47070be2a991 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> +++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> @@ -20,6 +20,7 @@ enum vdec_ipi_msgid {
>  	AP_IPIMSG_DEC_RESET = 0xA004,
>  	AP_IPIMSG_DEC_CORE = 0xA005,
>  	AP_IPIMSG_DEC_CORE_END = 0xA006,
> +	AP_IPIMSG_DEC_GET_PARAM = 0xA007,
>  
>  	VPU_IPIMSG_DEC_INIT_ACK = 0xB000,
>  	VPU_IPIMSG_DEC_START_ACK = 0xB001,
> @@ -28,6 +29,7 @@ enum vdec_ipi_msgid {
>  	VPU_IPIMSG_DEC_RESET_ACK = 0xB004,
>  	VPU_IPIMSG_DEC_CORE_ACK = 0xB005,
>  	VPU_IPIMSG_DEC_CORE_END_ACK = 0xB006,
> +	VPU_IPIMSG_DEC_GET_PARAM_ACK = 0xB007,
>  };
>  
>  /**
> @@ -114,4 +116,38 @@ struct vdec_vpu_ipi_init_ack {
>  	uint32_t inst_id;
>  };
>  
> +/**
> + * struct vdec_ap_ipi_get_param - for AP_IPIMSG_DEC_GET_PARAM
> + * @msg_id	: AP_IPIMSG_DEC_GET_PARAM
> + * @inst_id     : instance ID. Used if the ABI version >= 2.
> + * @data	: picture information
> + * @param_type	: get param type
> + * @codec_type	: Codec fourcc
> + */
> +struct vdec_ap_ipi_get_param {
> +	u32 msg_id;
> +	u32 inst_id;
> +	u32 data[4];
> +	u32 param_type;
> +	u32 codec_type;
> +};
> +
> +/**
> + * struct vdec_vpu_ipi_get_param_ack - for VPU_IPIMSG_DEC_GET_PARAM_ACK
> + * @msg_id	: VPU_IPIMSG_DEC_GET_PARAM_ACK
> + * @status	: VPU execution result
> + * @ap_inst_addr	: AP vcodec_vpu_inst instance address
> + * @data     : picture information from SCP.
> + * @param_type	: get param type
> + * @reserved : reserved param
> + */
> +struct vdec_vpu_ipi_get_param_ack {
> +	u32 msg_id;
> +	s32 status;
> +	u64 ap_inst_addr;
> +	u32 data[4];
> +	u32 param_type;
> +	u32 reserved;
> +};
> +
>  #endif
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> index 7210061c772f..35f4d5583084 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> @@ -6,6 +6,7 @@
>  
>  #include "mtk_vcodec_drv.h"
>  #include "mtk_vcodec_util.h"
> +#include "vdec_drv_if.h"
>  #include "vdec_ipi_msg.h"
>  #include "vdec_vpu_if.h"
>  #include "mtk_vcodec_fw.h"
> @@ -54,6 +55,26 @@ static void handle_init_ack_msg(const struct vdec_vpu_ipi_init_ack *msg)
>  	}
>  }
>  
> +static void handle_get_param_msg_ack(const struct vdec_vpu_ipi_get_param_ack *msg)
> +{
> +	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
> +					(unsigned long)msg->ap_inst_addr;
> +
> +	mtk_vcodec_debug(vpu, "+ ap_inst_addr = 0x%llx", msg->ap_inst_addr);
> +
> +	/* param_type is enum vdec_get_param_type */
> +	switch (msg->param_type) {
> +	case GET_PARAM_PIC_INFO:
> +		vpu->fb_sz[0] = msg->data[0];
> +		vpu->fb_sz[1] = msg->data[1];
> +		break;
> +	default:
> +		mtk_vcodec_err(vpu, "invalid get param type=%d", msg->param_type);
> +		vpu->failure = 1;
> +		break;
> +	}
> +}
> +
>  /*
>   * vpu_dec_ipi_handler - Handler for VPU ipi message.
>   *
> @@ -89,6 +110,9 @@ static void vpu_dec_ipi_handler(void *data, unsigned int len, void *priv)
>  		case VPU_IPIMSG_DEC_CORE_END_ACK:
>  			break;
>  
> +		case VPU_IPIMSG_DEC_GET_PARAM_ACK:
> +			handle_get_param_msg_ack(data);
> +			break;
>  		default:
>  			mtk_vcodec_err(vpu, "invalid msg=%X", msg->msg_id);
>  			break;
> @@ -217,6 +241,31 @@ int vpu_dec_start(struct vdec_vpu_inst *vpu, uint32_t *data, unsigned int len)
>  	return err;
>  }
>  
> +int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
> +		      unsigned int len, unsigned int param_type)
> +{
> +	struct vdec_ap_ipi_get_param msg;
> +	int err;
> +
> +	mtk_vcodec_debug_enter(vpu);
> +
> +	if (len > ARRAY_SIZE(msg.data)) {
> +		mtk_vcodec_err(vpu, "invalid len = %d\n", len);
> +		return -EINVAL;
> +	}
> +
> +	memset(&msg, 0, sizeof(msg));
> +	msg.msg_id = AP_IPIMSG_DEC_GET_PARAM;
> +	msg.inst_id = vpu->inst_id;
> +	memcpy(msg.data, data, sizeof(unsigned int) * len);
> +	msg.param_type = param_type;
> +	msg.codec_type = vpu->codec_type;
> +
> +	err = vcodec_vpu_send_msg(vpu, (void *)&msg, sizeof(msg));
> +	mtk_vcodec_debug(vpu, "- ret=%d", err);
> +	return err;
> +}
> +
>  int vpu_dec_core(struct vdec_vpu_inst *vpu)
>  {
>  	return vcodec_send_ap_ipi(vpu, AP_IPIMSG_DEC_CORE);
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> index 4cb3c7f5a3ad..d1feba41dd39 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> +++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> @@ -28,6 +28,8 @@ struct mtk_vcodec_ctx;
>   * @wq          : wait queue to wait VPU message ack
>   * @handler     : ipi handler for each decoder
>   * @codec_type     : use codec type to separate different codecs
> + * @capture_type    : used capture type to separate different capture format
> + * @fb_sz  : frame buffer size of each plane
>   */
>  struct vdec_vpu_inst {
>  	int id;
> @@ -42,6 +44,8 @@ struct vdec_vpu_inst {
>  	wait_queue_head_t wq;
>  	mtk_vcodec_ipi_handler handler;
>  	unsigned int codec_type;
> +	unsigned int capture_type;

This structure member is added in this patch, but never set or used.

> +	unsigned int fb_sz[2];
>  };
>  
>  /**
> @@ -104,4 +108,15 @@ int vpu_dec_core(struct vdec_vpu_inst *vpu);
>   */
>  int vpu_dec_core_end(struct vdec_vpu_inst *vpu);
>  
> +/**
> + * vpu_dec_get_param - get param from scp
> + *
> + * @vpu : instance for vdec_vpu_inst
> + * @data: meta data to pass bitstream info to VPU decoder
> + * @len : meta data length
> + * @param_type : get param type
> + */
> +int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
> +		      unsigned int len, unsigned int param_type);
> +
>  #endif



* Re: [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered
  2022-02-23  3:39 ` [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered Yunfei Dong
@ 2022-03-01 18:50   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 18:50 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:39 +0800, Yunfei Dong wrote:
> lat thread: output queue      \
>                                -> lat hardware -> lat trans buffer
>             lat trans buffer  /
> 
> core thread: capture queue     \
>                                 ->core hardware -> capture queue
>              lat trans buffer  /
> 
> Lat and core work in different thread, setting capture buffer buffered.

... so that output queue buffers (bitstream) can be processed regardless of
whether capture buffers are available.

I have concerns about the usefulness of running a dedicated thread to drive the
lat and the core blocks. Having 3 threads (counting the m2m worker thread) here
increases the complexity. The hardware is asynchronous by definition. I think
this patch will go away after a proper rework of the driver thread model here.
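
To illustrate what marking the capture queue as buffered changes, here is a
minimal userspace sketch of the scheduling condition. It is a simplified model
and not the mem2mem framework code: struct queue_model and job_can_run() are
hypothetical names, and the assumption is that a job is only scheduled when
each queue either has a ready buffer or has been marked buffered.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for one m2m queue; not the kernel's structure. */
struct queue_model {
	unsigned int num_rdy;	/* buffers queued and ready for a job */
	bool buffered;		/* driver marked this queue as "buffered" */
};

/*
 * Assumed rule: a job may run when each queue either holds a ready
 * buffer or is marked buffered.
 */
static bool job_can_run(const struct queue_model *out,
			const struct queue_model *cap)
{
	assert(out && cap);

	if (!out->num_rdy && !out->buffered)
		return false;
	if (!cap->num_rdy && !cap->buffered)
		return false;
	return true;
}
```

With the capture queue marked buffered, a queued bitstream buffer is enough for
the lat stage to be scheduled, even while no capture buffer is queued yet.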

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> index 5aebf88f997b..23a154c4e321 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> @@ -314,6 +314,9 @@ static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
>  	src_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
>  				 V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
>  
> +	if (ctx->dev->vdec_pdata->hw_arch != MTK_VDEC_PURE_SINGLE_CORE)
> +		v4l2_m2m_set_dst_buffered(ctx->m2m_ctx, 1);
> +
>  	/* Support request api for output plane */
>  	src_vq->supports_requests = true;
>  	src_vq->requires_requests = true;



* Re: [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow
  2022-02-23  3:39 ` [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow Yunfei Dong
@ 2022-03-01 19:00   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 19:00 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:39 +0800, Yunfei Dong wrote:
> For lat and core decode in parallel, need to get capture buffer
> when core start to decode and put capture buffer to display
> list when core decode done.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 121 ++++++++++++------
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   5 +-
>  .../mtk-vcodec/vdec/vdec_h264_req_if.c        |  16 ++-
>  3 files changed, 102 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> index 23a154c4e321..6d481410bf89 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> @@ -108,37 +108,87 @@ static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
>  
>  #define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
>  
> -static void mtk_vdec_stateless_set_dst_payload(struct mtk_vcodec_ctx *ctx,
> -					       struct vdec_fb *fb)
> +static void mtk_vdec_stateless_out_to_done(struct mtk_vcodec_ctx *ctx,
> +					   struct mtk_vcodec_mem *bs, int error)
>  {
> -	struct mtk_video_dec_buf *vdec_frame_buf =
> -		container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> -	struct vb2_v4l2_buffer *vb = &vdec_frame_buf->m2m_buf.vb;
> -	unsigned int cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> +	struct mtk_video_dec_buf *out_buf;
> +	struct vb2_v4l2_buffer *vb;
>  
> -	vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> -	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> -		unsigned int cap_c_size =
> -			ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> +	if (!bs) {
> +		mtk_v4l2_err("Free bitstream buffer fail.");
> +		return;
> +	}
> +	out_buf = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +	vb = &out_buf->m2m_buf.vb;
>  
> -		vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> +	mtk_v4l2_debug(2, "Free bitsteam buffer id = %d to done_list",
> +		       vb->vb2_buf.index);
> +
> +	v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
> +	if (error) {
> +		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR);
> +		if (error == -EIO)
> +			out_buf->error = true;
> +	} else {
> +		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
>  	}
>  }
>  
> -static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx,
> -					   struct vb2_v4l2_buffer *vb2_v4l2)
> +static void mtk_vdec_stateless_cap_to_disp(struct mtk_vcodec_ctx *ctx,
> +					   struct vdec_fb *fb, int error)
>  {
> -	struct mtk_video_dec_buf *framebuf =
> -		container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> -	struct vdec_fb *pfb = &framebuf->frame_buffer;
> -	struct vb2_buffer *dst_buf = &vb2_v4l2->vb2_buf;
> +	struct mtk_video_dec_buf *vdec_frame_buf;
> +	struct vb2_v4l2_buffer *vb;
> +	unsigned int cap_y_size, cap_c_size;
> +
> +	if (!fb) {
> +		mtk_v4l2_err("Free frame buffer fail.");
> +		return;
> +	}
> +	vdec_frame_buf = container_of(fb, struct mtk_video_dec_buf,
> +				      frame_buffer);
> +	vb = &vdec_frame_buf->m2m_buf.vb;
> +
> +	cap_y_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
> +	cap_c_size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> +
> +	v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
>  
> -	pfb->base_y.va = NULL;
> +	vb2_set_plane_payload(&vb->vb2_buf, 0, cap_y_size);
> +	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
> +		vb2_set_plane_payload(&vb->vb2_buf, 1, cap_c_size);
> +
> +	mtk_v4l2_debug(2, "Free frame buffer id = %d to done_list",
> +		       vb->vb2_buf.index);
> +	if (error)
> +		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR);
> +	else
> +		v4l2_m2m_buf_done(vb, VB2_BUF_STATE_DONE);
> +}
> +
> +static struct vdec_fb *vdec_get_cap_buffer(struct mtk_vcodec_ctx *ctx)
> +{
> +	struct mtk_video_dec_buf *framebuf;
> +	struct vb2_v4l2_buffer *vb2_v4l2;
> +	struct vb2_buffer *dst_buf;
> +	struct vdec_fb *pfb;
> +
> +	vb2_v4l2 = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> +	if (!vb2_v4l2) {
> +		mtk_v4l2_debug(1, "[%d] dst_buf empty!!", ctx->id);
> +		return NULL;
> +	}
> +
> +	dst_buf = &vb2_v4l2->vb2_buf;
> +	framebuf = container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
> +
> +	pfb = &framebuf->frame_buffer;
> +	pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
>  	pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
>  	pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
>  
>  	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> -		pfb->base_c.va = NULL;
> +		pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
>  		pfb->base_c.dma_addr =
>  			vb2_dma_contig_plane_dma_addr(dst_buf, 1);
>  		pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
> @@ -162,12 +212,11 @@ static void mtk_vdec_worker(struct work_struct *work)
>  	struct mtk_vcodec_ctx *ctx =
>  		container_of(work, struct mtk_vcodec_ctx, decode_work);
>  	struct mtk_vcodec_dev *dev = ctx->dev;
> -	struct vb2_v4l2_buffer *vb2_v4l2_src, *vb2_v4l2_dst;
> +	struct vb2_v4l2_buffer *vb2_v4l2_src;
>  	struct vb2_buffer *vb2_src;
>  	struct mtk_vcodec_mem *bs_src;
>  	struct mtk_video_dec_buf *dec_buf_src;
>  	struct media_request *src_buf_req;
> -	struct vdec_fb *dst_buf;
>  	bool res_chg = false;
>  	int ret;
>  
> @@ -178,13 +227,6 @@ static void mtk_vdec_worker(struct work_struct *work)
>  		return;
>  	}
>  
> -	vb2_v4l2_dst = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> -	if (!vb2_v4l2_dst) {
> -		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> -		mtk_v4l2_debug(1, "[%d] no available destination buffer", ctx->id);
> -		return;
> -	}
> -
>  	vb2_src = &vb2_v4l2_src->vb2_buf;
>  	dec_buf_src = container_of(vb2_v4l2_src, struct mtk_video_dec_buf,
>  				   m2m_buf.vb);
> @@ -193,9 +235,15 @@ static void mtk_vdec_worker(struct work_struct *work)
>  	mtk_v4l2_debug(3, "[%d] (%d) id=%d, vb=%p", ctx->id,
>  		       vb2_src->vb2_queue->type, vb2_src->index, vb2_src);
>  
> -	bs_src->va = NULL;
> +	bs_src->va = vb2_plane_vaddr(vb2_src, 0);
>  	bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
>  	bs_src->size = (size_t)vb2_src->planes[0].bytesused;
> +	if (!bs_src->va) {
> +		v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
> +		mtk_v4l2_err("[%d] id=%d source buffer is NULL", ctx->id,
> +			     vb2_src->index);
> +		return;
> +	}
>  
>  	mtk_v4l2_debug(3, "[%d] Bitstream VA=%p DMA=%pad Size=%zx vb=%p",
>  		       ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size, vb2_src);
> @@ -206,9 +254,7 @@ static void mtk_vdec_worker(struct work_struct *work)
>  	else
>  		mtk_v4l2_err("vb2 buffer media request is NULL");
>  
> -	dst_buf = vdec_get_cap_buffer(ctx, vb2_v4l2_dst);
> -	v4l2_m2m_buf_copy_metadata(vb2_v4l2_src, vb2_v4l2_dst, true);
> -	ret = vdec_if_decode(ctx, bs_src, dst_buf, &res_chg);
> +	ret = vdec_if_decode(ctx, bs_src, NULL, &res_chg);
>  	if (ret) {
>  		mtk_v4l2_err(" <===[%d], src_buf[%d] sz=0x%zx pts=%llu vdec_if_decode() ret=%d res_chg=%d===>",
>  			     ctx->id, vb2_src->index, bs_src->size,
> @@ -220,12 +266,9 @@ static void mtk_vdec_worker(struct work_struct *work)
>  		}
>  	}
>  
> -	mtk_vdec_stateless_set_dst_payload(ctx, dst_buf);
> -
> -	v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx,
> -					 ret ? VB2_BUF_STATE_ERROR : VB2_BUF_STATE_DONE);
> -
> +	mtk_vdec_stateless_out_to_done(ctx, bs_src, ret);
>  	v4l2_ctrl_request_complete(src_buf_req, &ctx->ctrl_hdl);

This hasn't changed since the last version, so I am recording the problem with
this patch again. The request is being completed here as soon as the lat job is
done. This is too soon; here's what the spec says [1]:

   User-space can poll() a request file descriptor in order to wait until the
   request completes. A request is considered complete once all its associated
   buffers are available for dequeuing and all the associated controls have been
   updated with the values at the time of completion. Note that user-space does not
   need to wait for the request to complete to dequeue its buffers: buffers that
   are available halfway through a request can be dequeued independently of the
   request’s state.

In short, the request can't be completed until the core has finished and the
related capture buffers have been marked done. As a side effect, you need to
handle completing the request in all the possible error cases (you might want to
refactor this). Please sync with Benjamin; he's currently trying to find a way
to simplify the threading model and the driver while at it. Otherwise this will
tend to be racy and hard to maintain.

[1] https://www.kernel.org/doc/html/latest/userspace-api/media/mediactl/request-api.html#request-submission
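
A minimal sketch of the ordering described above, written as a plain userspace
model rather than driver code. struct request_model, enum decode_stage and
stage_finished() are hypothetical names used only for illustration; in the
driver the equivalent steps would be returning the capture buffer and then
completing the request.

```c
#include <assert.h>
#include <stdbool.h>

enum decode_stage { STAGE_LAT, STAGE_CORE };

/* Hypothetical model of one request and its capture buffer. */
struct request_model {
	bool capture_buf_done;	/* capture buffer returned to user space */
	bool error;		/* buffer was returned in error state */
	bool completed;		/* request signalled as complete */
};

/* Called when a hardware stage finishes; err is 0 on success. */
static void stage_finished(struct request_model *req,
			   enum decode_stage stage, int err)
{
	assert(req && !req->completed);

	/* Lat succeeded: the core still needs the capture buffer. */
	if (!err && stage == STAGE_LAT)
		return;

	/* Core finished, or a stage failed: return the buffer first. */
	req->capture_buf_done = true;
	req->error = err != 0;

	/* Only now is it safe to signal completion. */
	req->completed = true;
}
```

The point of the single exit path is that every error case ends up both
returning the capture buffer and completing the request, so user space polling
the request file descriptor is never woken before the buffer can be dequeued.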

> +	v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
>  }
>  
>  static void vb2ops_vdec_stateless_buf_queue(struct vb2_buffer *vb)
> @@ -358,6 +401,8 @@ const struct mtk_vcodec_dec_pdata mtk_vdec_8183_pdata = {
>  	.uses_stateless_api = true,
>  	.worker = mtk_vdec_worker,
>  	.flush_decoder = mtk_vdec_flush_decoder,
> +	.cap_to_disp = mtk_vdec_stateless_cap_to_disp,
> +	.get_cap_buffer = vdec_get_cap_buffer,
>  	.is_subdev_supported = false,
>  	.hw_arch = MTK_VDEC_PURE_SINGLE_CORE,
>  };
> @@ -376,6 +421,8 @@ const struct mtk_vcodec_dec_pdata mtk_lat_sig_core_pdata = {
>  	.uses_stateless_api = true,
>  	.worker = mtk_vdec_worker,
>  	.flush_decoder = mtk_vdec_flush_decoder,
> +	.cap_to_disp = mtk_vdec_stateless_cap_to_disp,
> +	.get_cap_buffer = vdec_get_cap_buffer,
>  	.is_subdev_supported = true,
>  	.hw_arch = MTK_VDEC_LAT_SINGLE_CORE,
>  };
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index 6d27e4d41ede..9fcaf69549dd 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -350,7 +350,8 @@ enum mtk_vdec_hw_arch {
>   * @ctrls_setup: init vcodec dec ctrls
>   * @worker: worker to start a decode job
>   * @flush_decoder: function that flushes the decoder
> - *
> + * @get_cap_buffer: get capture buffer from capture queue
> + * @cap_to_disp: put capture buffer to disp list
>   * @vdec_vb2_ops: struct vb2_ops
>   *
>   * @vdec_formats: supported video decoder formats
> @@ -372,6 +373,8 @@ struct mtk_vcodec_dec_pdata {
>  	int (*ctrls_setup)(struct mtk_vcodec_ctx *ctx);
>  	void (*worker)(struct work_struct *work);
>  	int (*flush_decoder)(struct mtk_vcodec_ctx *ctx);
> +	struct vdec_fb *(*get_cap_buffer)(struct mtk_vcodec_ctx *ctx);
> +	void (*cap_to_disp)(struct mtk_vcodec_ctx *ctx, struct vdec_fb *fb, int error);
>  
>  	struct vb2_ops *vdec_vb2_ops;
>  
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> index 43542de11e9c..36f3dc1fbe3b 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> @@ -670,32 +670,42 @@ static void vdec_h264_slice_deinit(void *h_vdec)
>  }
>  
>  static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> -				  struct vdec_fb *fb, bool *res_chg)
> +				  struct vdec_fb *unused, bool *res_chg)
>  {
>  	struct vdec_h264_slice_inst *inst = h_vdec;
>  	const struct v4l2_ctrl_h264_decode_params *dec_params =
>  		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
>  	struct vdec_vpu_inst *vpu = &inst->vpu;
> +	struct mtk_video_dec_buf *src_buf_info;
> +	struct mtk_video_dec_buf *dst_buf_info;
> +	struct vdec_fb *fb;
>  	u32 data[2];
>  	u64 y_fb_dma;
>  	u64 c_fb_dma;
>  	int err;
>  
> +	inst->num_nalu++;
>  	/* bs NULL means flush decoder */
>  	if (!bs)
>  		return vpu_dec_reset(vpu);
>  
> +	fb = inst->ctx->dev->vdec_pdata->get_cap_buffer(inst->ctx);
> +	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +	dst_buf_info = container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> +
>  	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
>  	c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
>  
>  	mtk_vcodec_debug(inst, "+ [%d] FB y_dma=%llx c_dma=%llx va=%p",
> -			 ++inst->num_nalu, y_fb_dma, c_fb_dma, fb);
> +			 inst->num_nalu, y_fb_dma, c_fb_dma, fb);
>  
>  	inst->vsi_ctx.dec.bs_dma = (uint64_t)bs->dma_addr;
>  	inst->vsi_ctx.dec.y_fb_dma = y_fb_dma;
>  	inst->vsi_ctx.dec.c_fb_dma = c_fb_dma;
>  	inst->vsi_ctx.dec.vdec_fb_va = (u64)(uintptr_t)fb;
>  
> +	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
> +				   &dst_buf_info->m2m_buf.vb, true);
>  	get_vdec_decode_parameters(inst);
>  	data[0] = bs->size;
>  	/*
> @@ -734,6 +744,8 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
>  
>  	memcpy(&inst->vsi_ctx, inst->vpu.vsi, sizeof(inst->vsi_ctx));
>  	mtk_vcodec_debug(inst, "\n - NALU[%d]", inst->num_nalu);
> +
> +	inst->ctx->dev->vdec_pdata->cap_to_disp(inst->ctx, fb, 0);
>  	return 0;
>  
>  err_free_fb_out:



* Re: [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability
  2022-02-23  3:40 ` [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability Yunfei Dong
@ 2022-03-01 19:02   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 19:02 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> For vp8 not support 4K, need to disable it.

This patch will need to be changed after you have moved this code into the
proper ioctl.

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> index bae43938ee37..ba188d16f0fb 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> @@ -532,7 +532,8 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
>  		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
>  		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
>  		if (!(ctx->dev->dec_capability &
> -				VCODEC_CAPABILITY_4K_DISABLED)) {
> +				VCODEC_CAPABILITY_4K_DISABLED) &&
> +				fsize->pixel_format != V4L2_PIX_FMT_VP8_FRAME) {
>  			mtk_v4l2_debug(3, "4K is enabled");
>  			fsize->stepwise.max_width =
>  					VCODEC_DEC_4K_CODED_WIDTH;



* Re: [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code
  2022-02-23  3:40 ` [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code Yunfei Dong
@ 2022-03-01 21:30   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 21:30 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Daniel Vetter, dri-devel, Irui Wang, Steve Cho, linux-media,
	devicetree, linux-kernel, linux-arm-kernel, srv_heupstream,
	linux-mediatek, Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> Mt8192 can use some of common code with mt8183. Moves them to
> a new file in order to reuse.

With the documentation fixed as per my comments below, you can add:

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../mtk-vcodec/vdec/vdec_h264_req_common.c    | 310 +++++++++++++
>  .../mtk-vcodec/vdec/vdec_h264_req_common.h    | 253 +++++++++++
>  .../mtk-vcodec/vdec/vdec_h264_req_if.c        | 424 ++----------------
>  4 files changed, 606 insertions(+), 382 deletions(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
> 
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index 359619653a0e..3f41d748eee5 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -9,6 +9,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>  		vdec/vdec_vp8_if.o \
>  		vdec/vdec_vp9_if.o \
>  		vdec/vdec_h264_req_if.o \
> +		vdec/vdec_h264_req_common.o \
>  		mtk_vcodec_dec_drv.o \
>  		vdec_drv_if.o \
>  		vdec_vpu_if.o \
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
> new file mode 100644
> index 000000000000..6c68bee632d6
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.c
> @@ -0,0 +1,310 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + * Author: Yunfei Dong <yunfei.dong@mediatek.com>
> + */
> +
> +#include "vdec_h264_req_common.h"
> +
> +/* get used parameters for sps/pps */
> +#define GET_MTK_VDEC_FLAG(cond, flag) \
> +	{ dst_param->cond = ((src_param->flags & flag) ? (1) : (0)); }
> +#define GET_MTK_VDEC_PARAM(param) \
> +	{ dst_param->param = src_param->param; }
> +
> +/*
> + * The firmware expects unused reflist entries to have the value 0x20.
> + */
> +void mtk_vdec_h264_fixup_ref_list(u8 *ref_list, size_t num_valid)
> +{
> +	memset_io(&ref_list[num_valid], 0x20, 32 - num_valid);
> +}
> +
> +void *mtk_vdec_h264_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
> +{
> +	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> +
> +	if (!ctrl)
> +		return ERR_PTR(-EINVAL);
> +
> +	return ctrl->p_cur.p;
> +}
> +
> +void mtk_vdec_h264_fill_dpb_info(struct mtk_vcodec_ctx *ctx,
> +				 struct slice_api_h264_decode_param *decode_params,
> +				 struct mtk_h264_dpb_info *h264_dpb_info)
> +{
> +	const struct slice_h264_dpb_entry *dpb;
> +	struct vb2_queue *vq;
> +	struct vb2_buffer *vb;
> +	struct vb2_v4l2_buffer *vb2_v4l2;
> +	int index, vb2_index;
> +
> +	vq = v4l2_m2m_get_vq(ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> +
> +	for (index = 0; index < V4L2_H264_NUM_DPB_ENTRIES; index++) {
> +		dpb = &decode_params->dpb[index];
> +		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> +			h264_dpb_info[index].reference_flag = 0;
> +			continue;
> +		}
> +
> +		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> +		if (vb2_index < 0) {
> +			dev_err(&ctx->dev->plat_dev->dev,
> +				"Reference invalid: dpb_index(%d) reference_ts(%lld)",
> +				index, dpb->reference_ts);
> +			continue;
> +		}
> +
> +		/* 1 for short term reference, 2 for long term reference */
> +		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> +			h264_dpb_info[index].reference_flag = 1;
> +		else
> +			h264_dpb_info[index].reference_flag = 2;
> +
> +		vb = vq->bufs[vb2_index];
> +		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> +		h264_dpb_info[index].field = vb2_v4l2->field;
> +
> +		h264_dpb_info[index].y_dma_addr =
> +			vb2_dma_contig_plane_dma_addr(vb, 0);
> +		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
> +			h264_dpb_info[index].c_dma_addr =
> +				vb2_dma_contig_plane_dma_addr(vb, 1);
> +		else
> +			h264_dpb_info[index].c_dma_addr =
> +				h264_dpb_info[index].y_dma_addr +
> +				ctx->picinfo.fb_sz[0];
> +	}
> +}
> +
> +void mtk_vdec_h264_copy_sps_params(struct mtk_h264_sps_param *dst_param,
> +				   const struct v4l2_ctrl_h264_sps *src_param)
> +{
> +	GET_MTK_VDEC_PARAM(chroma_format_idc);
> +	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> +	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> +	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> +	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> +	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> +	GET_MTK_VDEC_PARAM(max_num_ref_frames);
> +	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> +	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> +
> +	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> +			  V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> +	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> +			  V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> +	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> +			  V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> +	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> +			  V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> +	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> +			  V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> +	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> +			  V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> +}
> +
> +void mtk_vdec_h264_copy_pps_params(struct mtk_h264_pps_param *dst_param,
> +				   const struct v4l2_ctrl_h264_pps *src_param)
> +{
> +	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> +	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> +	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> +	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> +	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> +	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> +
> +	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> +			  V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> +	GET_MTK_VDEC_FLAG(pic_order_present_flag,
> +			  V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> +	GET_MTK_VDEC_FLAG(weighted_pred_flag,
> +			  V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> +	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> +			  V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> +	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> +			  V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> +	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> +			  V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> +	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> +			  V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> +	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> +			  V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> +}
> +
> +void mtk_vdec_h264_copy_slice_hd_params(struct mtk_h264_slice_hd_param *dst_param,
> +					const struct v4l2_ctrl_h264_slice_params *src_param,
> +					const struct v4l2_ctrl_h264_decode_params *dec_param)
> +{
> +	int temp;
> +
> +	GET_MTK_VDEC_PARAM(first_mb_in_slice);
> +	GET_MTK_VDEC_PARAM(slice_type);
> +	GET_MTK_VDEC_PARAM(cabac_init_idc);
> +	GET_MTK_VDEC_PARAM(slice_qp_delta);
> +	GET_MTK_VDEC_PARAM(disable_deblocking_filter_idc);
> +	GET_MTK_VDEC_PARAM(slice_alpha_c0_offset_div2);
> +	GET_MTK_VDEC_PARAM(slice_beta_offset_div2);
> +	GET_MTK_VDEC_PARAM(num_ref_idx_l0_active_minus1);
> +	GET_MTK_VDEC_PARAM(num_ref_idx_l1_active_minus1);
> +
> +	dst_param->frame_num = dec_param->frame_num;
> +	dst_param->pic_order_cnt_lsb = dec_param->pic_order_cnt_lsb;
> +
> +	dst_param->delta_pic_order_cnt_bottom =
> +		dec_param->delta_pic_order_cnt_bottom;
> +	dst_param->delta_pic_order_cnt0 =
> +		dec_param->delta_pic_order_cnt0;
> +	dst_param->delta_pic_order_cnt1 =
> +		dec_param->delta_pic_order_cnt1;
> +
> +	temp = dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC;
> +	dst_param->field_pic_flag = temp ? 1 : 0;
> +
> +	temp = dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD;
> +	dst_param->bottom_field_flag = temp ? 1 : 0;
> +
> +	GET_MTK_VDEC_FLAG(direct_spatial_mv_pred_flag,
> +			  V4L2_H264_SLICE_FLAG_DIRECT_SPATIAL_MV_PRED);
> +}
> +
> +void mtk_vdec_h264_copy_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> +				       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> +{
> +	memcpy_toio(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> +		    sizeof(dst_matrix->scaling_list_4x4));
> +
> +	memcpy_toio(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> +		    sizeof(dst_matrix->scaling_list_8x8));
> +}
> +
> +void
> +mtk_vdec_h264_copy_decode_params(struct slice_api_h264_decode_param *dst_params,
> +				 const struct v4l2_ctrl_h264_decode_params *src_params,
> +				 const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> +{
> +	struct slice_h264_dpb_entry *dst_entry;
> +	const struct v4l2_h264_dpb_entry *src_entry;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> +		dst_entry = &dst_params->dpb[i];
> +		src_entry = &dpb[i];
> +
> +		dst_entry->reference_ts = src_entry->reference_ts;
> +		dst_entry->frame_num = src_entry->frame_num;
> +		dst_entry->pic_num = src_entry->pic_num;
> +		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> +		dst_entry->bottom_field_order_cnt =
> +			src_entry->bottom_field_order_cnt;
> +		dst_entry->flags = src_entry->flags;
> +	}
> +
> +	/* num_slices is a leftover from the old H.264 support and is ignored
> +	 * by the firmware.
> +	 */
> +	dst_params->num_slices = 0;
> +	dst_params->nal_ref_idc = src_params->nal_ref_idc;
> +	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> +	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> +	dst_params->flags = src_params->flags;
> +}
> +
> +static bool mtk_vdec_h264_dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> +					  const struct v4l2_h264_dpb_entry *b)
> +{
> +	return a->top_field_order_cnt == b->top_field_order_cnt &&
> +	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> +}
> +
> +/*
> + * Move DPB entries of dec_param that refer to a frame already existing in dpb
> + * into the already existing slot in dpb, and move other entries into new slots.
> + *
> + * This function is an adaptation of the similarly-named function in
> + * hantro_h264.c.
> + */
> +void mtk_vdec_h264_update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> +			      struct v4l2_h264_dpb_entry *dpb)
> +{
> +	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> +	unsigned int i, j;
> +
> +	/* Disable all entries by default, and mark the ones in use. */
> +	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> +			set_bit(i, in_use);
> +		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> +	}
> +
> +	/* Try to match new DPB entries with existing ones by their POCs. */
> +	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> +		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +
> +		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> +			continue;
> +
> +		/*
> +		 * To cut off some comparisons, iterate only on target DPB
> +		 * entries were already used.
> +		 */
> +		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> +			struct v4l2_h264_dpb_entry *cdpb;
> +
> +			cdpb = &dpb[j];
> +			if (!mtk_vdec_h264_dpb_entry_match(cdpb, ndpb))
> +				continue;
> +
> +			*cdpb = *ndpb;
> +			set_bit(j, used);
> +			/* Don't reiterate on this one. */
> +			clear_bit(j, in_use);
> +			break;
> +		}
> +
> +		if (j == ARRAY_SIZE(dec_param->dpb))
> +			set_bit(i, new);
> +	}
> +
> +	/* For entries that could not be matched, use remaining free slots. */
> +	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> +		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> +		struct v4l2_h264_dpb_entry *cdpb;
> +
> +		/*
> +		 * Both arrays are of the same sizes, so there is no way
> +		 * we can end up with no space in target array, unless
> +		 * something is buggy.
> +		 */
> +		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> +		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> +			return;
> +
> +		cdpb = &dpb[j];
> +		*cdpb = *ndpb;
> +		set_bit(j, used);
> +	}
> +}
> +
> +unsigned int mtk_vdec_h264_get_mv_buf_size(unsigned int width, unsigned int height)
> +{
> +	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> +
> +	return HW_MB_STORE_SZ * unit_size;
> +}
> +
> +int mtk_vdec_h264_find_start_code(unsigned char *data, unsigned int data_sz)
> +{
> +	if (data_sz > 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
> +		return 3;
> +
> +	if (data_sz > 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 &&
> +	    data[3] == 1)
> +		return 4;
> +
> +	return -1;
> +}
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
> new file mode 100644
> index 000000000000..2d731bc777ca
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_common.h
> @@ -0,0 +1,253 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + * Author: Yunfei Dong <yunfei.dong@mediatek.com>
> + */
> +
> +#ifndef _VDEC_H264_REQ_COMMON_H_
> +#define _VDEC_H264_REQ_COMMON_H_
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <media/v4l2-h264.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include "../mtk_vcodec_drv.h"
> +
> +#define NAL_NON_IDR_SLICE			0x01
> +#define NAL_IDR_SLICE				0x05
> +#define NAL_TYPE(value)				((value) & 0x1F)
> +
> +#define BUF_PREDICTION_SZ			(64 * 4096)
> +#define MB_UNIT_LEN				16
> +
> +/* motion vector size (bytes) for every macro block */
> +#define HW_MB_STORE_SZ				64
> +
> +#define H264_MAX_MV_NUM				32
> +
> +/**
> + * struct mtk_h264_dpb_info  - h264 dpb information

All the other doc comments I have seen add an empty line between this and the
first parameter.

> + * @y_dma_addr: Y bitstream physical address
> + * @c_dma_addr: CbCr bitstream physical address
> + * @reference_flag: reference picture flag (short/long term reference picture)
> + * @field: field picture flag
> + */
> +struct mtk_h264_dpb_info {
> +	dma_addr_t y_dma_addr;
> +	dma_addr_t c_dma_addr;
> +	int reference_flag;
> +	int field;
> +};
> +
> +/**
> + * struct mtk_h264_sps_param  - parameters for sps

Each member is documented in the previous structure, so why not in this one?
The same applies in many places.

> + */
> +struct mtk_h264_sps_param {
> +	unsigned char chroma_format_idc;
> +	unsigned char bit_depth_luma_minus8;
> +	unsigned char bit_depth_chroma_minus8;
> +	unsigned char log2_max_frame_num_minus4;
> +	unsigned char pic_order_cnt_type;
> +	unsigned char log2_max_pic_order_cnt_lsb_minus4;
> +	unsigned char max_num_ref_frames;
> +	unsigned char separate_colour_plane_flag;
> +	unsigned short pic_width_in_mbs_minus1;
> +	unsigned short pic_height_in_map_units_minus1;
> +	unsigned int max_frame_nums;
> +	unsigned char qpprime_y_zero_transform_bypass_flag;
> +	unsigned char delta_pic_order_always_zero_flag;
> +	unsigned char frame_mbs_only_flag;
> +	unsigned char mb_adaptive_frame_field_flag;
> +	unsigned char direct_8x8_inference_flag;
> +	unsigned char reserved[3];
> +};
> +
> +/**
> + * struct mtk_h264_pps_param  - parameters for pps
> + */
> +struct mtk_h264_pps_param {
> +	unsigned char num_ref_idx_l0_default_active_minus1;
> +	unsigned char num_ref_idx_l1_default_active_minus1;
> +	unsigned char weighted_bipred_idc;
> +	char pic_init_qp_minus26;
> +	char chroma_qp_index_offset;
> +	char second_chroma_qp_index_offset;
> +	unsigned char entropy_coding_mode_flag;
> +	unsigned char pic_order_present_flag;
> +	unsigned char deblocking_filter_control_present_flag;
> +	unsigned char constrained_intra_pred_flag;
> +	unsigned char weighted_pred_flag;
> +	unsigned char redundant_pic_cnt_present_flag;
> +	unsigned char transform_8x8_mode_flag;
> +	unsigned char scaling_matrix_present_flag;
> +	unsigned char reserved[2];
> +};
> +
> +/**
> + * struct mtk_h264_slice_hd_param  - parameters for slice header
> + */
> +struct mtk_h264_slice_hd_param {
> +	unsigned int first_mb_in_slice;
> +	unsigned int field_pic_flag;
> +	unsigned int slice_type;
> +	unsigned int frame_num;
> +	int pic_order_cnt_lsb;
> +	int delta_pic_order_cnt_bottom;
> +	unsigned int bottom_field_flag;
> +	unsigned int direct_spatial_mv_pred_flag;
> +	int delta_pic_order_cnt0;
> +	int delta_pic_order_cnt1;
> +	unsigned int cabac_init_idc;
> +	int slice_qp_delta;
> +	unsigned int disable_deblocking_filter_idc;
> +	int slice_alpha_c0_offset_div2;
> +	int slice_beta_offset_div2;
> +	unsigned int num_ref_idx_l0_active_minus1;
> +	unsigned int num_ref_idx_l1_active_minus1;
> +	unsigned int reserved;
> +};
> +

And why do these two have no doc comment at all?

> +struct slice_api_h264_scaling_matrix {
> +	unsigned char scaling_list_4x4[6][16];
> +	unsigned char scaling_list_8x8[6][64];
> +};
> +
> +struct slice_h264_dpb_entry {
> +	unsigned long long reference_ts;
> +	unsigned short frame_num;
> +	unsigned short pic_num;
> +	/* Note that field is indicated by v4l2_buffer.field */
> +	int top_field_order_cnt;
> +	int bottom_field_order_cnt;
> +	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */

Could you move that into a doc block?

> +};
> +
> +/**
> + * struct slice_api_h264_decode_param - parameters for decode.
> + */
> +struct slice_api_h264_decode_param {
> +	struct slice_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES];
> +	unsigned short num_slices;
> +	unsigned short nal_ref_idc;
> +	unsigned char ref_pic_list_p0[32];
> +	unsigned char ref_pic_list_b0[32];
> +	unsigned char ref_pic_list_b1[32];
> +	int top_field_order_cnt;
> +	int bottom_field_order_cnt;
> +	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> +};
> +
> +/**
> + * struct h264_fb - h264 decode frame buffer information
> + * @vdec_fb_va  : virtual address of struct vdec_fb
> + * @y_fb_dma    : dma address of Y frame buffer (luma)
> + * @c_fb_dma    : dma address of C frame buffer (chroma)
> + * @poc         : picture order count of frame buffer
> + * @reserved    : for 8 bytes alignment

This style does not match what came before.

> + */
> +struct h264_fb {
> +	u64 vdec_fb_va;
> +	u64 y_fb_dma;
> +	u64 c_fb_dma;
> +	s32 poc;
> +	u32 reserved;
> +};
> +
> +/**
> + * mtk_vdec_h264_fixup_ref_list - fixup unused reference to 0x20.
> + * @ref_list: reference picture list
> + * @num_valid: used reference number
> + */
> +void mtk_vdec_h264_fixup_ref_list(u8 *ref_list, size_t num_valid);
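
The fixup documented here pads the tail of a 32-entry reference list with 0x20, the value the firmware expects for unused entries (see the removed `fixup_ref_list()` in vdec_h264_req_if.c further down). A minimal user-space sketch, with `REF_LIST_LEN` and `REF_UNUSED` as hypothetical names for the two literals and an added bounds assertion that the original does not have:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REF_LIST_LEN 32   /* ref_pic_list_* arrays hold 32 entries */
#define REF_UNUSED   0x20 /* marker the firmware expects for unused entries */

/* Keep the first num_valid entries, mark everything after them unused. */
static void fixup_ref_list(unsigned char *ref_list, size_t num_valid)
{
	assert(num_valid <= REF_LIST_LEN); /* extra guard, not in the patch */
	memset(&ref_list[num_valid], REF_UNUSED, REF_LIST_LEN - num_valid);
}
```

With `num_valid` equal to 2, entries 0 and 1 keep their values and entries 2 through 31 become 0x20.
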
> +
> +/**
> + * mtk_vdec_h264_get_ctrl_ptr - get each CID contrl address.
> + * @ctx: v4l2 ctx
> + * @id: CID control ID
> + */
> +void *mtk_vdec_h264_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id);
> +
> +/**
> + * mtk_vdec_h264_fill_dpb_info - get each CID contrl address.
> + * @ctx: v4l2 ctx
> + * @decode_params: slice decode params
> + * @h264_dpb_info: dpb buffer information
> + */
> +void mtk_vdec_h264_fill_dpb_info(struct mtk_vcodec_ctx *ctx,
> +				 struct slice_api_h264_decode_param *decode_params,
> +				 struct mtk_h264_dpb_info *h264_dpb_info);
> +
> +/**
> + * mtk_vdec_h264_copy_sps_params - get sps params.
> + * @dst_params: sps params for hw decoder
> + * @src_params: sps params from user driver
> + */
> +void mtk_vdec_h264_copy_sps_params(struct mtk_h264_sps_param *dst_param,
> +				   const struct v4l2_ctrl_h264_sps *src_param);
> +
> +/**
> + * mtk_vdec_h264_copy_pps_params - get pps params.
> + * @dst_params: pps params for hw decoder
> + * @src_params: pps params from user driver
> + */
> +void mtk_vdec_h264_copy_pps_params(struct mtk_h264_pps_param *dst_param,
> +				   const struct v4l2_ctrl_h264_pps *src_param);
> +
> +/**
> + * mtk_vdec_h264_copy_slice_hd_params - get slice header params.
> + * @dst_params: slice params for hw decoder
> + * @src_params: slice params from user driver
> + * @dec_param: decode params from user driver
> + */
> +void mtk_vdec_h264_copy_slice_hd_params(struct mtk_h264_slice_hd_param *dst_param,
> +					const struct v4l2_ctrl_h264_slice_params *src_param,
> +					const struct v4l2_ctrl_h264_decode_params *dec_param);
> +
> +/**
> + * mtk_vdec_h264_copy_scaling_matrix - get each CID contrl address.
> + * @dst_matrix: scaling list params for hw decoder
> + * @src_matrix: scaling list params from user driver
> + */
> +void mtk_vdec_h264_copy_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> +				       const struct v4l2_ctrl_h264_scaling_matrix *src_matrix);
> +
> +/**
> + * mtk_vdec_h264_copy_decode_params - get decode params.
> + * @dst_params: dst params for hw decoder
> + * @src_params: decode params from user driver
> + * @dpb: dpb information
> + */
> +void
> +mtk_vdec_h264_copy_decode_params(struct slice_api_h264_decode_param *dst_params,
> +				 const struct v4l2_ctrl_h264_decode_params *src_params,
> +				 const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES]);
> +
> +/**
> + * mtk_vdec_h264_update_dpb - updata dpb list.
> + * @dec_param: v4l2 control decode params
> + * @dpb: dpb entry informaton
> + */
> +void mtk_vdec_h264_update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> +			      struct v4l2_h264_dpb_entry *dpb);
> +
> +/**
> + * mtk_vdec_h264_find_start_code - find h264 start code using sofeware.
> + * @data: input buffer address
> + * @data_sz: input buffer size
> + *
> + * Return: returns start code position.
> + */
> +int mtk_vdec_h264_find_start_code(unsigned char *data, unsigned int data_sz);
> +
> +/**
> + * mtk_vdec_h264_get_mv_buf_size - get mv buffer size.
> + * @width: picture width
> + * @height: picture height
> + *
> + * Return: returns mv buffer size.
> + */
> +unsigned int mtk_vdec_h264_get_mv_buf_size(unsigned int width, unsigned int height);
> +
> +#endif
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> index 36f3dc1fbe3b..87e0b2f95572 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_if.c
> @@ -12,109 +12,7 @@
>  #include "../vdec_drv_base.h"
>  #include "../vdec_drv_if.h"
>  #include "../vdec_vpu_if.h"
> -
> -#define BUF_PREDICTION_SZ			(64 * 4096)
> -#define MB_UNIT_LEN				16
> -
> -/* get used parameters for sps/pps */
> -#define GET_MTK_VDEC_FLAG(cond, flag) \
> -	{ dst_param->cond = ((src_param->flags & (flag)) ? (1) : (0)); }
> -#define GET_MTK_VDEC_PARAM(param) \
> -	{ dst_param->param = src_param->param; }
> -/* motion vector size (bytes) for every macro block */
> -#define HW_MB_STORE_SZ				64
> -
> -#define H264_MAX_FB_NUM				17
> -#define H264_MAX_MV_NUM				32
> -#define HDR_PARSING_BUF_SZ			1024
> -
> -/**
> - * struct mtk_h264_dpb_info  - h264 dpb information
> - * @y_dma_addr: Y bitstream physical address
> - * @c_dma_addr: CbCr bitstream physical address
> - * @reference_flag: reference picture flag (short/long term reference picture)
> - * @field: field picture flag
> - */
> -struct mtk_h264_dpb_info {
> -	dma_addr_t y_dma_addr;
> -	dma_addr_t c_dma_addr;
> -	int reference_flag;
> -	int field;
> -};
> -
> -/*
> - * struct mtk_h264_sps_param  - parameters for sps
> - */
> -struct mtk_h264_sps_param {
> -	unsigned char chroma_format_idc;
> -	unsigned char bit_depth_luma_minus8;
> -	unsigned char bit_depth_chroma_minus8;
> -	unsigned char log2_max_frame_num_minus4;
> -	unsigned char pic_order_cnt_type;
> -	unsigned char log2_max_pic_order_cnt_lsb_minus4;
> -	unsigned char max_num_ref_frames;
> -	unsigned char separate_colour_plane_flag;
> -	unsigned short pic_width_in_mbs_minus1;
> -	unsigned short pic_height_in_map_units_minus1;
> -	unsigned int max_frame_nums;
> -	unsigned char qpprime_y_zero_transform_bypass_flag;
> -	unsigned char delta_pic_order_always_zero_flag;
> -	unsigned char frame_mbs_only_flag;
> -	unsigned char mb_adaptive_frame_field_flag;
> -	unsigned char direct_8x8_inference_flag;
> -	unsigned char reserved[3];
> -};
> -
> -/*
> - * struct mtk_h264_pps_param  - parameters for pps
> - */
> -struct mtk_h264_pps_param {
> -	unsigned char num_ref_idx_l0_default_active_minus1;
> -	unsigned char num_ref_idx_l1_default_active_minus1;
> -	unsigned char weighted_bipred_idc;
> -	char pic_init_qp_minus26;
> -	char chroma_qp_index_offset;
> -	char second_chroma_qp_index_offset;
> -	unsigned char entropy_coding_mode_flag;
> -	unsigned char pic_order_present_flag;
> -	unsigned char deblocking_filter_control_present_flag;
> -	unsigned char constrained_intra_pred_flag;
> -	unsigned char weighted_pred_flag;
> -	unsigned char redundant_pic_cnt_present_flag;
> -	unsigned char transform_8x8_mode_flag;
> -	unsigned char scaling_matrix_present_flag;
> -	unsigned char reserved[2];
> -};
> -
> -struct slice_api_h264_scaling_matrix {
> -	unsigned char scaling_list_4x4[6][16];
> -	unsigned char scaling_list_8x8[6][64];
> -};
> -
> -struct slice_h264_dpb_entry {
> -	unsigned long long reference_ts;
> -	unsigned short frame_num;
> -	unsigned short pic_num;
> -	/* Note that field is indicated by v4l2_buffer.field */
> -	int top_field_order_cnt;
> -	int bottom_field_order_cnt;
> -	unsigned int flags; /* V4L2_H264_DPB_ENTRY_FLAG_* */
> -};
> -
> -/*
> - * struct slice_api_h264_decode_param - parameters for decode.
> - */
> -struct slice_api_h264_decode_param {
> -	struct slice_h264_dpb_entry dpb[16];
> -	unsigned short num_slices;
> -	unsigned short nal_ref_idc;
> -	unsigned char ref_pic_list_p0[32];
> -	unsigned char ref_pic_list_b0[32];
> -	unsigned char ref_pic_list_b1[32];
> -	int top_field_order_cnt;
> -	int bottom_field_order_cnt;
> -	unsigned int flags; /* V4L2_H264_DECODE_PARAM_FLAG_* */
> -};
> +#include "vdec_h264_req_common.h"
>  
>  /*
>   * struct mtk_h264_dec_slice_param  - parameters for decode current frame
> @@ -127,22 +25,6 @@ struct mtk_h264_dec_slice_param {
>  	struct mtk_h264_dpb_info h264_dpb_info[16];
>  };
>  
> -/**
> - * struct h264_fb - h264 decode frame buffer information
> - * @vdec_fb_va  : virtual address of struct vdec_fb
> - * @y_fb_dma    : dma address of Y frame buffer (luma)
> - * @c_fb_dma    : dma address of C frame buffer (chroma)
> - * @poc         : picture order count of frame buffer
> - * @reserved    : for 8 bytes alignment
> - */
> -struct h264_fb {
> -	u64 vdec_fb_va;
> -	u64 y_fb_dma;
> -	u64 c_fb_dma;
> -	s32 poc;
> -	u32 reserved;
> -};
> -
>  /**
>   * struct vdec_h264_dec_info - decode information
>   * @dpb_sz		: decoding picture buffer size
> @@ -212,265 +94,45 @@ struct vdec_h264_slice_inst {
>  	struct v4l2_h264_dpb_entry dpb[16];
>  };
>  
> -static void *get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
> -{
> -	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> -
> -	return ctrl->p_cur.p;
> -}
> -
> -static void get_h264_dpb_list(struct vdec_h264_slice_inst *inst,
> -			      struct mtk_h264_dec_slice_param *slice_param)
> -{
> -	struct vb2_queue *vq;
> -	struct vb2_buffer *vb;
> -	struct vb2_v4l2_buffer *vb2_v4l2;
> -	u64 index;
> -
> -	vq = v4l2_m2m_get_vq(inst->ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> -
> -	for (index = 0; index < ARRAY_SIZE(slice_param->decode_params.dpb); index++) {
> -		const struct slice_h264_dpb_entry *dpb;
> -		int vb2_index;
> -
> -		dpb = &slice_param->decode_params.dpb[index];
> -		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) {
> -			slice_param->h264_dpb_info[index].reference_flag = 0;
> -			continue;
> -		}
> -
> -		vb2_index = vb2_find_timestamp(vq, dpb->reference_ts, 0);
> -		if (vb2_index < 0) {
> -			mtk_vcodec_err(inst, "Reference invalid: dpb_index(%lld) reference_ts(%lld)",
> -				       index, dpb->reference_ts);
> -			continue;
> -		}
> -		/* 1 for short term reference, 2 for long term reference */
> -		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> -			slice_param->h264_dpb_info[index].reference_flag = 1;
> -		else
> -			slice_param->h264_dpb_info[index].reference_flag = 2;
> -
> -		vb = vq->bufs[vb2_index];
> -		vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
> -		slice_param->h264_dpb_info[index].field = vb2_v4l2->field;
> -
> -		slice_param->h264_dpb_info[index].y_dma_addr =
> -			vb2_dma_contig_plane_dma_addr(vb, 0);
> -		if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
> -			slice_param->h264_dpb_info[index].c_dma_addr =
> -				vb2_dma_contig_plane_dma_addr(vb, 1);
> -		}
> -	}
> -}
> -
> -static void get_h264_sps_parameters(struct mtk_h264_sps_param *dst_param,
> -				    const struct v4l2_ctrl_h264_sps *src_param)
> -{
> -	GET_MTK_VDEC_PARAM(chroma_format_idc);
> -	GET_MTK_VDEC_PARAM(bit_depth_luma_minus8);
> -	GET_MTK_VDEC_PARAM(bit_depth_chroma_minus8);
> -	GET_MTK_VDEC_PARAM(log2_max_frame_num_minus4);
> -	GET_MTK_VDEC_PARAM(pic_order_cnt_type);
> -	GET_MTK_VDEC_PARAM(log2_max_pic_order_cnt_lsb_minus4);
> -	GET_MTK_VDEC_PARAM(max_num_ref_frames);
> -	GET_MTK_VDEC_PARAM(pic_width_in_mbs_minus1);
> -	GET_MTK_VDEC_PARAM(pic_height_in_map_units_minus1);
> -
> -	GET_MTK_VDEC_FLAG(separate_colour_plane_flag,
> -			  V4L2_H264_SPS_FLAG_SEPARATE_COLOUR_PLANE);
> -	GET_MTK_VDEC_FLAG(qpprime_y_zero_transform_bypass_flag,
> -			  V4L2_H264_SPS_FLAG_QPPRIME_Y_ZERO_TRANSFORM_BYPASS);
> -	GET_MTK_VDEC_FLAG(delta_pic_order_always_zero_flag,
> -			  V4L2_H264_SPS_FLAG_DELTA_PIC_ORDER_ALWAYS_ZERO);
> -	GET_MTK_VDEC_FLAG(frame_mbs_only_flag,
> -			  V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY);
> -	GET_MTK_VDEC_FLAG(mb_adaptive_frame_field_flag,
> -			  V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD);
> -	GET_MTK_VDEC_FLAG(direct_8x8_inference_flag,
> -			  V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE);
> -}
> -
> -static void get_h264_pps_parameters(struct mtk_h264_pps_param *dst_param,
> -				    const struct v4l2_ctrl_h264_pps *src_param)
> +static int get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
>  {
> -	GET_MTK_VDEC_PARAM(num_ref_idx_l0_default_active_minus1);
> -	GET_MTK_VDEC_PARAM(num_ref_idx_l1_default_active_minus1);
> -	GET_MTK_VDEC_PARAM(weighted_bipred_idc);
> -	GET_MTK_VDEC_PARAM(pic_init_qp_minus26);
> -	GET_MTK_VDEC_PARAM(chroma_qp_index_offset);
> -	GET_MTK_VDEC_PARAM(second_chroma_qp_index_offset);
> -
> -	GET_MTK_VDEC_FLAG(entropy_coding_mode_flag,
> -			  V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE);
> -	GET_MTK_VDEC_FLAG(pic_order_present_flag,
> -			  V4L2_H264_PPS_FLAG_BOTTOM_FIELD_PIC_ORDER_IN_FRAME_PRESENT);
> -	GET_MTK_VDEC_FLAG(weighted_pred_flag,
> -			  V4L2_H264_PPS_FLAG_WEIGHTED_PRED);
> -	GET_MTK_VDEC_FLAG(deblocking_filter_control_present_flag,
> -			  V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
> -	GET_MTK_VDEC_FLAG(constrained_intra_pred_flag,
> -			  V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED);
> -	GET_MTK_VDEC_FLAG(redundant_pic_cnt_present_flag,
> -			  V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT);
> -	GET_MTK_VDEC_FLAG(transform_8x8_mode_flag,
> -			  V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE);
> -	GET_MTK_VDEC_FLAG(scaling_matrix_present_flag,
> -			  V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT);
> -}
> -
> -static void
> -get_h264_scaling_matrix(struct slice_api_h264_scaling_matrix *dst_matrix,
> -			const struct v4l2_ctrl_h264_scaling_matrix *src_matrix)
> -{
> -	memcpy(dst_matrix->scaling_list_4x4, src_matrix->scaling_list_4x4,
> -	       sizeof(dst_matrix->scaling_list_4x4));
> -
> -	memcpy(dst_matrix->scaling_list_8x8, src_matrix->scaling_list_8x8,
> -	       sizeof(dst_matrix->scaling_list_8x8));
> -}
> -
> -static void
> -get_h264_decode_parameters(struct slice_api_h264_decode_param *dst_params,
> -			   const struct v4l2_ctrl_h264_decode_params *src_params,
> -			   const struct v4l2_h264_dpb_entry dpb[V4L2_H264_NUM_DPB_ENTRIES])
> -{
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(dst_params->dpb); i++) {
> -		struct slice_h264_dpb_entry *dst_entry = &dst_params->dpb[i];
> -		const struct v4l2_h264_dpb_entry *src_entry = &dpb[i];
> -
> -		dst_entry->reference_ts = src_entry->reference_ts;
> -		dst_entry->frame_num = src_entry->frame_num;
> -		dst_entry->pic_num = src_entry->pic_num;
> -		dst_entry->top_field_order_cnt = src_entry->top_field_order_cnt;
> -		dst_entry->bottom_field_order_cnt =
> -			src_entry->bottom_field_order_cnt;
> -		dst_entry->flags = src_entry->flags;
> -	}
> -
> -	/*
> -	 * num_slices is a leftover from the old H.264 support and is ignored
> -	 * by the firmware.
> -	 */
> -	dst_params->num_slices = 0;
> -	dst_params->nal_ref_idc = src_params->nal_ref_idc;
> -	dst_params->top_field_order_cnt = src_params->top_field_order_cnt;
> -	dst_params->bottom_field_order_cnt = src_params->bottom_field_order_cnt;
> -	dst_params->flags = src_params->flags;
> -}
> -
> -static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
> -			    const struct v4l2_h264_dpb_entry *b)
> -{
> -	return a->top_field_order_cnt == b->top_field_order_cnt &&
> -	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
> -}
> -
> -/*
> - * Move DPB entries of dec_param that refer to a frame already existing in dpb
> - * into the already existing slot in dpb, and move other entries into new slots.
> - *
> - * This function is an adaptation of the similarly-named function in
> - * hantro_h264.c.
> - */
> -static void update_dpb(const struct v4l2_ctrl_h264_decode_params *dec_param,
> -		       struct v4l2_h264_dpb_entry *dpb)
> -{
> -	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> -	DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> -	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
> -	unsigned int i, j;
> -
> -	/* Disable all entries by default, and mark the ones in use. */
> -	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> -		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
> -			set_bit(i, in_use);
> -		dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
> -	}
> -
> -	/* Try to match new DPB entries with existing ones by their POCs. */
> -	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
> -		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> -
> -		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> -			continue;
> -
> -		/*
> -		 * To cut off some comparisons, iterate only on target DPB
> -		 * entries were already used.
> -		 */
> -		for_each_set_bit(j, in_use, ARRAY_SIZE(dec_param->dpb)) {
> -			struct v4l2_h264_dpb_entry *cdpb;
> -
> -			cdpb = &dpb[j];
> -			if (!dpb_entry_match(cdpb, ndpb))
> -				continue;
> -
> -			*cdpb = *ndpb;
> -			set_bit(j, used);
> -			/* Don't reiterate on this one. */
> -			clear_bit(j, in_use);
> -			break;
> -		}
> -
> -		if (j == ARRAY_SIZE(dec_param->dpb))
> -			set_bit(i, new);
> -	}
> -
> -	/* For entries that could not be matched, use remaining free slots. */
> -	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
> -		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
> -		struct v4l2_h264_dpb_entry *cdpb;
> -
> -		/*
> -		 * Both arrays are of the same sizes, so there is no way
> -		 * we can end up with no space in target array, unless
> -		 * something is buggy.
> -		 */
> -		j = find_first_zero_bit(used, ARRAY_SIZE(dec_param->dpb));
> -		if (WARN_ON(j >= ARRAY_SIZE(dec_param->dpb)))
> -			return;
> -
> -		cdpb = &dpb[j];
> -		*cdpb = *ndpb;
> -		set_bit(j, used);
> -	}
> -}
> -
> -/*
> - * The firmware expects unused reflist entries to have the value 0x20.
> - */
> -static void fixup_ref_list(u8 *ref_list, size_t num_valid)
> -{
> -	memset(&ref_list[num_valid], 0x20, 32 - num_valid);
> -}
> -
> -static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
> -{
> -	const struct v4l2_ctrl_h264_decode_params *dec_params =
> -		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> -	const struct v4l2_ctrl_h264_sps *sps =
> -		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> -	const struct v4l2_ctrl_h264_pps *pps =
> -		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> -	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix =
> -		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> +	const struct v4l2_ctrl_h264_decode_params *dec_params;
> +	const struct v4l2_ctrl_h264_sps *sps;
> +	const struct v4l2_ctrl_h264_pps *pps;
> +	const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix;
>  	struct mtk_h264_dec_slice_param *slice_param = &inst->h264_slice_param;
>  	struct v4l2_h264_reflist_builder reflist_builder;
>  	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
>  	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
>  	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
>  
> -	update_dpb(dec_params, inst->dpb);
> +	dec_params =
> +		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> +	if (IS_ERR(dec_params))
> +		return PTR_ERR(dec_params);
> +
> +	sps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> +	if (IS_ERR(sps))
> +		return PTR_ERR(sps);
> +
> +	pps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> +	if (IS_ERR(pps))
> +		return PTR_ERR(pps);
>  
> -	get_h264_sps_parameters(&slice_param->sps, sps);
> -	get_h264_pps_parameters(&slice_param->pps, pps);
> -	get_h264_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> -	get_h264_decode_parameters(&slice_param->decode_params, dec_params,
> -				   inst->dpb);
> -	get_h264_dpb_list(inst, slice_param);
> +	scaling_matrix =
> +		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> +	if (IS_ERR(scaling_matrix))
> +		return PTR_ERR(scaling_matrix);
> +
> +	mtk_vdec_h264_update_dpb(dec_params, inst->dpb);
> +
> +	mtk_vdec_h264_copy_sps_params(&slice_param->sps, sps);
> +	mtk_vdec_h264_copy_pps_params(&slice_param->pps, pps);
> +	mtk_vdec_h264_copy_scaling_matrix(&slice_param->scaling_matrix, scaling_matrix);
> +	mtk_vdec_h264_copy_decode_params(&slice_param->decode_params,
> +					 dec_params, inst->dpb);
> +	mtk_vdec_h264_fill_dpb_info(inst->ctx, &slice_param->decode_params,
> +				    slice_param->h264_dpb_info);
>  
>  	/* Build the reference lists */
>  	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> @@ -478,19 +140,14 @@ static void get_vdec_decode_parameters(struct vdec_h264_slice_inst *inst)
>  	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
>  	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
>  	/* Adapt the built lists to the firmware's expectations */
> -	fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> -	fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> -	fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> +	mtk_vdec_h264_fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> +	mtk_vdec_h264_fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> +	mtk_vdec_h264_fixup_ref_list(b1_reflist, reflist_builder.num_valid);
>  
>  	memcpy(&inst->vsi_ctx.h264_slice_params, slice_param,
>  	       sizeof(inst->vsi_ctx.h264_slice_params));
> -}
>  
> -static unsigned int get_mv_buf_size(unsigned int width, unsigned int height)
> -{
> -	int unit_size = (width / MB_UNIT_LEN) * (height / MB_UNIT_LEN) + 8;
> -
> -	return HW_MB_STORE_SZ * unit_size;
> +	return 0;
>  }
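
The new error path relies on the ERR_PTR convention: the lookup returns either a valid pointer or an errno value encoded in the pointer, and the caller propagates it with PTR_ERR(). The sketch below models that in user space. It assumes the common helper returns an ERR_PTR when a control is missing (its body is not part of this excerpt); the three macros are simplified re-implementations, and `demo_ctrl`, `demo_get_ctrl_ptr` and `demo_collect` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space stand-ins for the kernel's <linux/err.h> helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical control table entry. */
struct demo_ctrl {
	int id;
	void *payload;
};

/* Return the payload for @id, or an encoded -EINVAL if it is missing. */
static void *demo_get_ctrl_ptr(struct demo_ctrl *ctrls, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (ctrls[i].id == id)
			return ctrls[i].payload;

	return ERR_PTR(-EINVAL);
}

/* Fetch two controls and bail out on the first failure. */
static int demo_collect(struct demo_ctrl *ctrls, size_t n, int id_a, int id_b)
{
	void *a, *b;

	a = demo_get_ctrl_ptr(ctrls, n, id_a);
	if (IS_ERR(a))
		return (int)PTR_ERR(a);

	b = demo_get_ctrl_ptr(ctrls, n, id_b);
	if (IS_ERR(b))
		return (int)PTR_ERR(b);

	return 0;
}
```

This is why the function's return type changes from void to int in the hunk above: the caller can now stop before any parameters are copied to the firmware.
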
>  
>  static int allocate_predication_buf(struct vdec_h264_slice_inst *inst)
> @@ -525,7 +182,7 @@ static int alloc_mv_buf(struct vdec_h264_slice_inst *inst,
>  	int i;
>  	int err;
>  	struct mtk_vcodec_mem *mem = NULL;
> -	unsigned int buf_sz = get_mv_buf_size(pic->buf_w, pic->buf_h);
> +	unsigned int buf_sz = mtk_vdec_h264_get_mv_buf_size(pic->buf_w, pic->buf_h);
>  
>  	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
>  	for (i = 0; i < H264_MAX_MV_NUM; i++) {
> @@ -674,7 +331,7 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
>  {
>  	struct vdec_h264_slice_inst *inst = h_vdec;
>  	const struct v4l2_ctrl_h264_decode_params *dec_params =
> -		get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> +		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
>  	struct vdec_vpu_inst *vpu = &inst->vpu;
>  	struct mtk_video_dec_buf *src_buf_info;
>  	struct mtk_video_dec_buf *dst_buf_info;
> @@ -706,7 +363,10 @@ static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
>  
>  	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
>  				   &dst_buf_info->m2m_buf.vb, true);
> -	get_vdec_decode_parameters(inst);
> +	err = get_vdec_decode_parameters(inst);
> +	if (err)
> +		goto err_free_fb_out;
> +
>  	data[0] = bs->size;
>  	/*
>  	 * Reconstruct the first byte of the NAL unit, as the firmware requests



* Re: [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192
  2022-02-23  3:40 ` [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192 Yunfei Dong
@ 2022-03-01 22:01   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 22:01 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> Adds h264 lat and core architecture driver for mt8192,
> and the decode mode is frame based for stateless decoder.
> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  | 621 ++++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   8 +-
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
>  include/linux/remoteproc/mtk_scp.h            |   2 +
>  5 files changed, 632 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
> 
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index 3f41d748eee5..22edb1c86598 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -10,6 +10,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>  		vdec/vdec_vp9_if.o \
>  		vdec/vdec_h264_req_if.o \
>  		vdec/vdec_h264_req_common.o \
> +		vdec/vdec_h264_req_multi_if.o \
>  		mtk_vcodec_dec_drv.o \
>  		vdec_drv_if.o \
>  		vdec_vpu_if.o \
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
> new file mode 100644
> index 000000000000..82a279f327c4
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_h264_req_multi_if.c
> @@ -0,0 +1,621 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + * Author: Yunfei Dong <yunfei.dong@mediatek.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <media/v4l2-h264.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include "../mtk_vcodec_util.h"
> +#include "../mtk_vcodec_dec.h"
> +#include "../mtk_vcodec_intr.h"
> +#include "../vdec_drv_base.h"
> +#include "../vdec_drv_if.h"
> +#include "../vdec_vpu_if.h"
> +#include "vdec_h264_req_common.h"
> +
> +/**
> + * enum vdec_h264_core_dec_err_type  - core decode error type

Similar to my comment on the other patch: the other doc comments add an empty
line here. This should be applied everywhere, of course.

> + * @TRANS_BUFFER_FULL : trans buffer is full
> + * @SLICE_HEADER_FULL : slice header buffer is full
> + */
> +enum vdec_h264_core_dec_err_type {
> +	TRANS_BUFFER_FULL = 1,
> +	SLICE_HEADER_FULL,
> +};
> +
> +/**
> + * struct vdec_h264_slice_lat_dec_param  - parameters for decode current frame
> + * @sps : h264 sps syntax parameters
> + * @pps : h264 pps syntax parameters
> + * @slice_header: h264 slice header syntax parameters
> + * @scaling_matrix : h264 scaling list parameters
> + * @decode_params : decoder parameters of each frame used for hardware decode
> + * @h264_dpb_info : dpb reference list
> + */
> +struct vdec_h264_slice_lat_dec_param {
> +	struct mtk_h264_sps_param sps;
> +	struct mtk_h264_pps_param pps;
> +	struct mtk_h264_slice_hd_param slice_header;
> +	struct slice_api_h264_scaling_matrix scaling_matrix;
> +	struct slice_api_h264_decode_param decode_params;
> +	struct mtk_h264_dpb_info h264_dpb_info[V4L2_H264_NUM_DPB_ENTRIES];
> +};
> +
> +/**
> + * struct vdec_h264_slice_info - decode information
> + * @nal_info    : nal info of current picture
> + * @timeout     : Decode timeout: 1 timeout, 0 no timeount
> + * @bs_buf_size : bitstream size
> + * @bs_buf_addr : bitstream buffer dma address
> + * @y_fb_dma    : Y frame buffer dma address
> + * @c_fb_dma    : C frame buffer dma address
> + * @vdec_fb_va  : VDEC frame buffer struct virtual address
> + * @crc         : Used to check whether hardware's status is right
> + */
> +struct vdec_h264_slice_info {
> +	u16 nal_info;
> +	u16 timeout;
> +	u32 bs_buf_size;
> +	u64 bs_buf_addr;
> +	u64 y_fb_dma;
> +	u64 c_fb_dma;
> +	u64 vdec_fb_va;
> +	u32 crc[8];
> +};
> +
> +/**
> + * struct vdec_h264_slice_vsi - shared memory for decode information exchange
> + *        between VPU and Host. The memory is allocated by VPU then mapping to
> + *        Host in vdec_h264_slice_init() and freed in vdec_h264_slice_deinit()
> + *        by VPU. AP-W/R : AP is writer/reader on this item. VPU-W/R: VPU is
> + *        write/reader on this item.

Long description goes below the member list.
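
For reference, a minimal sketch of the layout I have in mind, with the members
first and the long description after them. The struct and its fields are made
up for illustration (plain C types so it builds outside the kernel), they are
not taken from the patch:

```c
#include <stdint.h>

/**
 * struct example_shared_info - shared memory exchanged between host and firmware
 * @start_addr: start DMA address of the buffer
 * @end_addr:   end DMA address of the buffer
 * @flags:      status flags written by the firmware
 *
 * The long description goes here, after the member list and separated
 * from it by an empty comment line. This is where remarks about who
 * allocates the memory, or who reads and writes each field, belong.
 */
struct example_shared_info {
	uint64_t start_addr;
	uint64_t end_addr;
	uint32_t flags;
};
```

The same layout would apply to the AP-W/R and VPU-W/R notes quoted above.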

> + * @wdma_err_addr       : wdma error dma address
> + * @wdma_start_addr     : wdma start dma address
> + * @wdma_end_addr       : wdma end dma address
> + * @slice_bc_start_addr : slice bc start dma address
> + * @slice_bc_end_addr   : slice bc end dma address
> + * @row_info_start_addr : row info start dma address
> + * @row_info_end_addr   : row info end dma address
> + * @trans_start         : trans start dma address
> + * @trans_end           : trans end dma address
> + * @wdma_end_addr_offset: wdma end address offset
> + *
> + * @mv_buf_dma          : HW working motion vector buffer
> + *                        dma address (AP-W, VPU-R)
> + * @dec                 : decode information (AP-R, VPU-W)
> + * @h264_slice_params   : decode parameters for hw used

Please use a consistent style. In general, the ':' has no space before it in
the other doc comments I see. Please apply this across the code.

> + */
> +struct vdec_h264_slice_vsi {
> +	/* LAT dec addr */
> +	u64 wdma_err_addr;
> +	u64 wdma_start_addr;
> +	u64 wdma_end_addr;
> +	u64 slice_bc_start_addr;
> +	u64 slice_bc_end_addr;
> +	u64 row_info_start_addr;
> +	u64 row_info_end_addr;
> +	u64 trans_start;
> +	u64 trans_end;
> +	u64 wdma_end_addr_offset;
> +
> +	u64 mv_buf_dma[H264_MAX_MV_NUM];
> +	struct vdec_h264_slice_info dec;
> +	struct vdec_h264_slice_lat_dec_param h264_slice_params;
> +};
> +
> +/**
> + * struct vdec_h264_slice_share_info - shared information used to exchange
> + *                                     message between lat and core
> + * @sps	              : sequence header information from user space
> + * @dec_params        : decoder params from user space
> + * @h264_slice_params : decoder params used for hardware
> + * @trans_start       : trans start dma address
> + * @trans_end         : trans end dma address
> + * @nal_info          : nal info of current picture
> + */
> +struct vdec_h264_slice_share_info {
> +	struct v4l2_ctrl_h264_sps sps;
> +	struct v4l2_ctrl_h264_decode_params dec_params;
> +	struct vdec_h264_slice_lat_dec_param h264_slice_params;
> +	u64 trans_start;
> +	u64 trans_end;
> +	u16 nal_info;
> +};
> +
> +/**
> + * struct vdec_h264_slice_inst - h264 decoder instance
> + * @slice_dec_num        : how many picture be decoded
> + * @ctx                 : point to mtk_vcodec_ctx
> + * @pred_buf            : HW working predication buffer
> + * @mv_buf              : HW working motion vector buffer
> + * @vpu                 : VPU instance
> + * @vsi                 : vsi used for lat
> + * @vsi_core            : vsi used for core
> + *
> + * @resolution_changed  : resolution changed
> + * @realloc_mv_buf      : reallocate mv buffer
> + * @cap_num_planes      : number of capture queue plane
> + *
> + * @dpb : decoded picture buffer used to store reference buffer information
> + */
> +struct vdec_h264_slice_inst {
> +	unsigned int slice_dec_num;
> +	struct mtk_vcodec_ctx *ctx;
> +	struct mtk_vcodec_mem pred_buf;
> +	struct mtk_vcodec_mem mv_buf[H264_MAX_MV_NUM];
> +	struct vdec_vpu_inst vpu;
> +	struct vdec_h264_slice_vsi *vsi;
> +	struct vdec_h264_slice_vsi *vsi_core;
> +
> +	unsigned int resolution_changed;
> +	unsigned int realloc_mv_buf;
> +	unsigned int cap_num_planes;
> +
> +	struct v4l2_h264_dpb_entry dpb[16];
> +};
> +
> +static int vdec_h264_slice_fill_decode_parameters(struct vdec_h264_slice_inst *inst,
> +						  struct vdec_h264_slice_share_info *share_info)
> +{
> +	struct vdec_h264_slice_lat_dec_param *slice_param = &inst->vsi->h264_slice_params;
> +	const struct v4l2_ctrl_h264_decode_params *dec_params;
> +	const struct v4l2_ctrl_h264_scaling_matrix *src_matrix;
> +	const struct v4l2_ctrl_h264_sps *sps;
> +	const struct v4l2_ctrl_h264_pps *pps;
> +
> +	dec_params =
> +		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_DECODE_PARAMS);
> +	if (IS_ERR(dec_params))
> +		return PTR_ERR(dec_params);
> +
> +	src_matrix =
> +		mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SCALING_MATRIX);
> +	if (IS_ERR(src_matrix))
> +		return PTR_ERR(src_matrix);
> +
> +	sps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_SPS);
> +	if (IS_ERR(sps))
> +		return PTR_ERR(sps);
> +
> +	pps = mtk_vdec_h264_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_H264_PPS);
> +	if (IS_ERR(pps))
> +		return PTR_ERR(pps);
> +
> +	if (dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC) {
> +		mtk_vcodec_err(inst, "h264 no support field bitstream.");

Perhaps rephrase to:

"No support for H.264 field decoding."

> +		return -EINVAL;
> +	}
> +
> +	mtk_vdec_h264_copy_sps_params(&slice_param->sps, sps);
> +	mtk_vdec_h264_copy_pps_params(&slice_param->pps, pps);
> +	mtk_vdec_h264_copy_scaling_matrix(&slice_param->scaling_matrix, src_matrix);
> +
> +	memcpy(&share_info->sps, sps, sizeof(*sps));
> +	memcpy(&share_info->dec_params, dec_params, sizeof(*dec_params));
> +
> +	return 0;
> +}
> +
> +static void vdec_h264_slice_fill_decode_reflist(struct vdec_h264_slice_inst *inst,
> +						struct vdec_h264_slice_lat_dec_param *slice_param,
> +						struct vdec_h264_slice_share_info *share_info)
> +{
> +	struct v4l2_ctrl_h264_decode_params *dec_params = &share_info->dec_params;
> +	struct v4l2_ctrl_h264_sps *sps = &share_info->sps;
> +	struct v4l2_h264_reflist_builder reflist_builder;
> +	u8 *p0_reflist = slice_param->decode_params.ref_pic_list_p0;
> +	u8 *b0_reflist = slice_param->decode_params.ref_pic_list_b0;
> +	u8 *b1_reflist = slice_param->decode_params.ref_pic_list_b1;
> +
> +	mtk_vdec_h264_update_dpb(dec_params, inst->dpb);
> +
> +	mtk_vdec_h264_copy_decode_params(&slice_param->decode_params, dec_params,
> +					 inst->dpb);
> +	mtk_vdec_h264_fill_dpb_info(inst->ctx, &slice_param->decode_params,
> +				    slice_param->h264_dpb_info);
> +
> +	mtk_v4l2_debug(3, "cur poc = %d\n", dec_params->bottom_field_order_cnt);
> +	/* Build the reference lists */
> +	v4l2_h264_init_reflist_builder(&reflist_builder, dec_params, sps,
> +				       inst->dpb);
> +	v4l2_h264_build_p_ref_list(&reflist_builder, p0_reflist);
> +	v4l2_h264_build_b_ref_lists(&reflist_builder, b0_reflist, b1_reflist);
> +
> +	/* Adapt the built lists to the firmware's expectations */
> +	mtk_vdec_h264_fixup_ref_list(p0_reflist, reflist_builder.num_valid);
> +	mtk_vdec_h264_fixup_ref_list(b0_reflist, reflist_builder.num_valid);
> +	mtk_vdec_h264_fixup_ref_list(b1_reflist, reflist_builder.num_valid);
> +}
> +
> +static int vdec_h264_slice_alloc_mv_buf(struct vdec_h264_slice_inst *inst,
> +					struct vdec_pic_info *pic)
> +{
> +	unsigned int buf_sz = mtk_vdec_h264_get_mv_buf_size(pic->buf_w, pic->buf_h);
> +	struct mtk_vcodec_mem *mem;
> +	int i, err;
> +
> +	mtk_v4l2_debug(3, "size = 0x%x", buf_sz);
> +	for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +		mem = &inst->mv_buf[i];

nit: Perhaps you could skip the reallocation (or just clear the buffer) if
mem->size == buf_sz?
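
Something along these lines, as a rough user-space sketch. The sketch_mem
struct is a hypothetical stand-in for struct mtk_vcodec_mem, and malloc-style
calls replace mtk_vcodec_mem_alloc()/mtk_vcodec_mem_free(), so this only shows
the control flow, not the real DMA allocation:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct mtk_vcodec_mem, reduced to what matters here. */
struct sketch_mem {
	void *va;
	size_t size;
};

/*
 * (Re)allocate @mem so that it holds @buf_sz bytes.
 * When the buffer already has the right size it is only cleared,
 * which avoids a free/alloc round trip when the size did not change.
 * Returns 0 on success, -1 on allocation failure.
 */
static int sketch_alloc_mv_buf(struct sketch_mem *mem, size_t buf_sz)
{
	if (mem->va && mem->size == buf_sz) {
		memset(mem->va, 0, buf_sz);
		return 0;
	}

	free(mem->va);
	mem->va = NULL;
	mem->size = 0;

	mem->va = calloc(1, buf_sz);
	if (!mem->va)
		return -1;

	mem->size = buf_sz;
	return 0;
}
```

Whether clearing is needed at all depends on what the hardware expects from the
buffer content, which I can't tell from the patch.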

> +		if (mem->va)
> +			mtk_vcodec_mem_free(inst->ctx, mem);
> +		mem->size = buf_sz;
> +		err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +		if (err) {
> +			mtk_vcodec_err(inst, "failed to allocate mv buf");
> +			return err;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void vdec_h264_slice_free_mv_buf(struct vdec_h264_slice_inst *inst)
> +{
> +	int i;
> +	struct mtk_vcodec_mem *mem;
> +
> +	for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +		mem = &inst->mv_buf[i];
> +		if (mem->va)
> +			mtk_vcodec_mem_free(inst->ctx, mem);
> +	}
> +}
> +
> +static void vdec_h264_slice_get_pic_info(struct vdec_h264_slice_inst *inst)
> +{
> +	struct mtk_vcodec_ctx *ctx = inst->ctx;
> +	unsigned int data[3];

nit: use u32 for clarity?

> +
> +	data[0] = ctx->picinfo.pic_w;
> +	data[1] = ctx->picinfo.pic_h;
> +	data[2] = ctx->capture_fourcc;
> +	vpu_dec_get_param(&inst->vpu, data, 3, GET_PARAM_PIC_INFO);
> +
> +	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
> +	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);

I notice that this alignment is hard-coded in many places; there should at
least be a constant for it somewhere.
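
As a sketch of what I mean (the macro name and the helper are invented here,
and sketch_align() only stands in for the kernel's ALIGN() so the snippet
builds in user space):

```c
#include <stdint.h>

/* Hypothetical name; the value 64 is the one hard-coded in the patch. */
#define SKETCH_VDEC_PIC_ALIGN 64u

/* Round @x up to the next multiple of @a, where @a is a power of two. */
static inline uint32_t sketch_align(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

struct sketch_picinfo {
	uint32_t pic_w, pic_h;	/* visible picture size */
	uint32_t buf_w, buf_h;	/* aligned buffer size */
};

/* Single place where the alignment is applied to the picture size. */
static void sketch_update_buf_size(struct sketch_picinfo *pic)
{
	pic->buf_w = sketch_align(pic->pic_w, SKETCH_VDEC_PIC_ALIGN);
	pic->buf_h = sketch_align(pic->pic_h, SKETCH_VDEC_PIC_ALIGN);
}
```

With a single definition, a later change of the alignment would touch one line
instead of every codec file.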

> +	ctx->picinfo.fb_sz[0] = inst->vpu.fb_sz[0];
> +	ctx->picinfo.fb_sz[1] = inst->vpu.fb_sz[1];
> +	inst->cap_num_planes =
> +		ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> +
> +	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> +			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> +			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> +	mtk_vcodec_debug(inst, "Y/C(%d, %d)", ctx->picinfo.fb_sz[0],
> +			 ctx->picinfo.fb_sz[1]);
> +
> +	if (ctx->last_decoded_picinfo.pic_w != ctx->picinfo.pic_w ||
> +	    ctx->last_decoded_picinfo.pic_h != ctx->picinfo.pic_h) {
> +		inst->resolution_changed = true;
> +		if (ctx->last_decoded_picinfo.buf_w != ctx->picinfo.buf_w ||
> +		    ctx->last_decoded_picinfo.buf_h != ctx->picinfo.buf_h)
> +			inst->realloc_mv_buf = true;
> +
> +		mtk_v4l2_debug(1, "resChg: (%d %d) : old(%d, %d) -> new(%d, %d)",
> +			       inst->resolution_changed,
> +			       inst->realloc_mv_buf,
> +			       ctx->last_decoded_picinfo.pic_w,
> +			       ctx->last_decoded_picinfo.pic_h,
> +			       ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> +	}
> +}
> +
> +static void vdec_h264_slice_get_crop_info(struct vdec_h264_slice_inst *inst,
> +					  struct v4l2_rect *cr)
> +{
> +	cr->left = 0;
> +	cr->top = 0;
> +	cr->width = inst->ctx->picinfo.pic_w;
> +	cr->height = inst->ctx->picinfo.pic_h;
> +
> +	mtk_vcodec_debug(inst, "l=%d, t=%d, w=%d, h=%d",
> +			 cr->left, cr->top, cr->width, cr->height);
> +}
> +
> +static int vdec_h264_slice_init(struct mtk_vcodec_ctx *ctx)
> +{
> +	struct vdec_h264_slice_inst *inst;
> +	int err, vsi_size;
> +
> +	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> +	if (!inst)
> +		return -ENOMEM;
> +
> +	inst->ctx = ctx;
> +
> +	inst->vpu.id = SCP_IPI_VDEC_LAT;
> +	inst->vpu.core_id = SCP_IPI_VDEC_CORE;
> +	inst->vpu.ctx = ctx;
> +	inst->vpu.codec_type = ctx->current_codec;
> +	inst->vpu.capture_type = ctx->capture_fourcc;
> +
> +	err = vpu_dec_init(&inst->vpu);
> +	if (err) {
> +		mtk_vcodec_err(inst, "vdec_h264 init err=%d", err);
> +		goto error_free_inst;
> +	}
> +
> +	vsi_size = round_up(sizeof(struct vdec_h264_slice_vsi), 64);
> +	inst->vsi = inst->vpu.vsi;
> +	inst->vsi_core =
> +		(struct vdec_h264_slice_vsi *)(((char *)inst->vpu.vsi) + vsi_size);
> +	inst->resolution_changed = true;
> +	inst->realloc_mv_buf = true;
> +
> +	mtk_vcodec_debug(inst, "lat struct size = %d,%d,%d,%d vsi: %d\n",
> +			 (int)sizeof(struct mtk_h264_sps_param),
> +			 (int)sizeof(struct mtk_h264_pps_param),
> +			 (int)sizeof(struct vdec_h264_slice_lat_dec_param),
> +			 (int)sizeof(struct mtk_h264_dpb_info),
> +			 vsi_size);
> +	mtk_vcodec_debug(inst, "lat H264 instance >> %p, codec_type = 0x%x",
> +			 inst, inst->vpu.codec_type);
> +
> +	ctx->drv_handle = inst;
> +	return 0;
> +
> +error_free_inst:
> +	kfree(inst);
> +	return err;
> +}
> +
> +static void vdec_h264_slice_deinit(void *h_vdec)
> +{
> +	struct vdec_h264_slice_inst *inst = h_vdec;
> +
> +	mtk_vcodec_debug_enter(inst);
> +
> +	vpu_dec_deinit(&inst->vpu);
> +	vdec_h264_slice_free_mv_buf(inst);
> +	vdec_msg_queue_deinit(&inst->ctx->msg_queue, inst->ctx);
> +
> +	kfree(inst);
> +}
> +
> +static int vdec_h264_slice_core_decode(struct vdec_lat_buf *lat_buf)
> +{
> +	struct vdec_fb *fb;
> +	u64 vdec_fb_va;
> +	u64 y_fb_dma, c_fb_dma;
> +	int err, timeout, i;
> +	struct mtk_vcodec_ctx *ctx = lat_buf->ctx;
> +	struct vdec_h264_slice_inst *inst = ctx->drv_handle;
> +	struct vb2_v4l2_buffer *vb2_v4l2;
> +	struct vdec_h264_slice_share_info *share_info = lat_buf->private_data;
> +	struct mtk_vcodec_mem *mem;
> +	struct vdec_vpu_inst *vpu = &inst->vpu;
> +
> +	mtk_vcodec_debug(inst, "[h264-core] vdec_h264 core decode");
> +	memcpy_toio(&inst->vsi_core->h264_slice_params, &share_info->h264_slice_params,
> +		    sizeof(share_info->h264_slice_params));
> +
> +	fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
> +	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> +	vdec_fb_va = (unsigned long)fb;
> +
> +	if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
> +		c_fb_dma =
> +			y_fb_dma + inst->ctx->picinfo.buf_w * inst->ctx->picinfo.buf_h;

This should use the stride (bytesperline) instead, which will allow
un-hardcoding the width alignment. Normally, the alignment should also be
found/set in the fmt, so maybe you could use only that?

Though, I'm not sure I understand why single plane is supported here. MM21 seems
to be defined with 2 planes, and there are no other formats.
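
For the stride suggestion, a minimal sketch of the computation, assuming a
single-plane layout where chroma directly follows luma. The function name and
parameters are hypothetical; in the driver the stride would come from the
negotiated format rather than from picinfo.buf_w:

```c
#include <stdint.h>

/*
 * Hypothetical single-plane layout: the chroma data follows the luma
 * data in the same buffer. The luma size is derived from the stride
 * (bytesperline) of the negotiated format, not from a re-computed
 * aligned width.
 */
static uint64_t sketch_chroma_addr(uint64_t y_dma, uint32_t bytesperline,
				   uint32_t aligned_height)
{
	/* Widen before multiplying so large frames cannot overflow 32 bits. */
	return y_dma + (uint64_t)bytesperline * aligned_height;
}
```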


> +	else
> +		c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> +
> +	mtk_vcodec_debug(inst, "[h264-core] y/c addr = 0x%llx 0x%llx", y_fb_dma,
> +			 c_fb_dma);
> +
> +	inst->vsi_core->dec.y_fb_dma = y_fb_dma;
> +	inst->vsi_core->dec.c_fb_dma = c_fb_dma;
> +	inst->vsi_core->dec.vdec_fb_va = vdec_fb_va;
> +	inst->vsi_core->dec.nal_info = share_info->nal_info;
> +	inst->vsi_core->wdma_start_addr =
> +		lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
> +	inst->vsi_core->wdma_end_addr =
> +		lat_buf->ctx->msg_queue.wdma_addr.dma_addr +
> +		lat_buf->ctx->msg_queue.wdma_addr.size;
> +	inst->vsi_core->wdma_err_addr = lat_buf->wdma_err_addr.dma_addr;
> +	inst->vsi_core->slice_bc_start_addr = lat_buf->slice_bc_addr.dma_addr;
> +	inst->vsi_core->slice_bc_end_addr = lat_buf->slice_bc_addr.dma_addr +
> +		lat_buf->slice_bc_addr.size;
> +	inst->vsi_core->trans_start = share_info->trans_start;
> +	inst->vsi_core->trans_end = share_info->trans_end;
> +	for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +		mem = &inst->mv_buf[i];
> +		inst->vsi_core->mv_buf_dma[i] = mem->dma_addr;
> +	}
> +
> +	vb2_v4l2 = v4l2_m2m_next_dst_buf(ctx->m2m_ctx);
> +	vb2_v4l2->vb2_buf.timestamp = lat_buf->ts_info.vb2_buf.timestamp;
> +	vb2_v4l2->timecode = lat_buf->ts_info.timecode;
> +	vb2_v4l2->field = lat_buf->ts_info.field;
> +	vb2_v4l2->flags = lat_buf->ts_info.flags;

Not quite: not all src buffer flags need to be copied. Please use
v4l2_m2m_buf_copy_metadata() instead.
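
To illustrate why a plain flags assignment is not right, here is a simplified
user-space model of a masked metadata copy. The struct, the flag names and
their values are invented for this sketch; the real helper is
v4l2_m2m_buf_copy_metadata(), and its exact behaviour should be checked in
v4l2-mem2mem.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits for this sketch only, not the uapi values. */
#define SK_FLAG_QUEUED		(1u << 0)	/* buffer state, must not be copied */
#define SK_FLAG_KEYFRAME	(1u << 1)
#define SK_FLAG_TIMECODE	(1u << 2)
#define SK_FLAG_TSTAMP_SRC	(1u << 3)

struct sk_buf {
	uint64_t timestamp;
	uint32_t field;
	uint32_t flags;
};

/*
 * Copy only the metadata that describes the frame from the source
 * (bitstream) buffer to the destination (capture) buffer. Flags outside
 * the mask, such as the destination's own state flags, are preserved.
 */
static void sk_copy_metadata(const struct sk_buf *src, struct sk_buf *dst,
			     bool copy_frame_flags)
{
	uint32_t mask = SK_FLAG_TIMECODE | SK_FLAG_TSTAMP_SRC;

	if (copy_frame_flags)
		mask |= SK_FLAG_KEYFRAME;

	dst->timestamp = src->timestamp;
	dst->field = src->field;
	dst->flags &= ~mask;
	dst->flags |= src->flags & mask;
}
```

In this model a state flag set on the source buffer never leaks into the
destination, which is what the direct assignment above gets wrong.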

> +	vb2_v4l2->vb2_buf.copied_timestamp =
> +		lat_buf->ts_info.vb2_buf.copied_timestamp;
> +
> +	vdec_h264_slice_fill_decode_reflist(inst, &inst->vsi_core->h264_slice_params,
> +					    share_info);
> +
> +	err = vpu_dec_core(vpu);
> +	if (err) {
> +		mtk_vcodec_err(inst, "core decode err=%d", err);
> +		goto vdec_dec_end;
> +	}
> +
> +	/* wait decoder done interrupt */
> +	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
> +					       WAIT_INTR_TIMEOUT_MS, MTK_VDEC_CORE);
> +	if (timeout)
> +		mtk_vcodec_err(inst, "core decode timeout: pic_%d",
> +			       ctx->decoded_frame_cnt);
> +	inst->vsi_core->dec.timeout = !!timeout;
> +
> +	vpu_dec_core_end(vpu);
> +	mtk_vcodec_debug(inst, "pic[%d] crc: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x",
> +			 ctx->decoded_frame_cnt,
> +			 inst->vsi_core->dec.crc[0], inst->vsi_core->dec.crc[1],
> +			 inst->vsi_core->dec.crc[2], inst->vsi_core->dec.crc[3],
> +			 inst->vsi_core->dec.crc[4], inst->vsi_core->dec.crc[5],
> +			 inst->vsi_core->dec.crc[6], inst->vsi_core->dec.crc[7]);
> +
> +vdec_dec_end:
> +	vdec_msg_queue_update_ube_rptr(&lat_buf->ctx->msg_queue,
> +				       share_info->trans_end);
> +	ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, !!err);
> +	mtk_vcodec_debug(inst, "core decode done err=%d", err);
> +	ctx->decoded_frame_cnt++;
> +	return 0;
> +}
> +
> +static int vdec_h264_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> +				  struct vdec_fb *fb, bool *res_chg)
> +{
> +	struct vdec_h264_slice_inst *inst = h_vdec;
> +	struct vdec_vpu_inst *vpu = &inst->vpu;
> +	struct mtk_video_dec_buf *src_buf_info;
> +	int nal_start_idx, err, timeout = 0, i;
> +	unsigned int data[2];
> +	struct vdec_lat_buf *lat_buf;
> +	struct vdec_h264_slice_share_info *share_info;
> +	unsigned char *buf;
> +	struct mtk_vcodec_mem *mem;
> +
> +	if (vdec_msg_queue_init(&inst->ctx->msg_queue, inst->ctx,
> +				vdec_h264_slice_core_decode,
> +				sizeof(*share_info)))
> +		return -ENOMEM;
> +
> +	/* bs NULL means flush decoder */
> +	if (!bs) {
> +		vdec_msg_queue_wait_lat_buf_full(&inst->ctx->msg_queue);
> +		return vpu_dec_reset(vpu);
> +	}
> +
> +	lat_buf = vdec_msg_queue_dqbuf(&inst->ctx->msg_queue.lat_ctx);
> +	if (!lat_buf) {
> +		mtk_vcodec_err(inst, "failed to get lat buffer");
> +		return -EINVAL;
> +	}
> +	share_info = lat_buf->private_data;
> +	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +
> +	buf = (unsigned char *)bs->va;
> +	nal_start_idx = mtk_vdec_h264_find_start_code(buf, bs->size);
> +	if (nal_start_idx < 0) {
> +		err = -EINVAL;
> +		goto err_free_fb_out;
> +	}
> +
> +	inst->vsi->dec.nal_info = buf[nal_start_idx];
> +	inst->vsi->dec.bs_buf_addr = (u64)bs->dma_addr;
> +	inst->vsi->dec.bs_buf_size = bs->size;
> +
> +	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
> +				   &lat_buf->ts_info, true);
> +
> +	err = vdec_h264_slice_fill_decode_parameters(inst, share_info);
> +	if (err)
> +		goto err_free_fb_out;
> +
> +	*res_chg = inst->resolution_changed;
> +	if (inst->resolution_changed) {
> +		mtk_vcodec_debug(inst, "- resolution changed -");
> +		if (inst->realloc_mv_buf) {
> +			err = vdec_h264_slice_alloc_mv_buf(inst, &inst->ctx->picinfo);
> +			inst->realloc_mv_buf = false;
> +			if (err)
> +				goto err_free_fb_out;
> +		}
> +		inst->resolution_changed = false;
> +	}
> +	for (i = 0; i < H264_MAX_MV_NUM; i++) {
> +		mem = &inst->mv_buf[i];
> +		inst->vsi->mv_buf_dma[i] = mem->dma_addr;
> +	}
> +	inst->vsi->wdma_start_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
> +	inst->vsi->wdma_end_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr +
> +		lat_buf->ctx->msg_queue.wdma_addr.size;
> +	inst->vsi->wdma_err_addr = lat_buf->wdma_err_addr.dma_addr;
> +	inst->vsi->slice_bc_start_addr = lat_buf->slice_bc_addr.dma_addr;
> +	inst->vsi->slice_bc_end_addr = lat_buf->slice_bc_addr.dma_addr +
> +		lat_buf->slice_bc_addr.size;
> +
> +	inst->vsi->trans_end = inst->ctx->msg_queue.wdma_rptr_addr;
> +	inst->vsi->trans_start = inst->ctx->msg_queue.wdma_wptr_addr;
> +	mtk_vcodec_debug(inst, "lat:trans(0x%llx 0x%llx)err:0x%llx",
> +			 inst->vsi->wdma_start_addr,
> +			 inst->vsi->wdma_end_addr,
> +			 inst->vsi->wdma_err_addr);
> +
> +	mtk_vcodec_debug(inst, "slice(0x%llx 0x%llx) rprt((0x%llx 0x%llx))",
> +			 inst->vsi->slice_bc_start_addr,
> +			 inst->vsi->slice_bc_end_addr,
> +			 inst->vsi->trans_start,
> +			 inst->vsi->trans_end);
> +	err = vpu_dec_start(vpu, data, 2);
> +	if (err) {
> +		mtk_vcodec_debug(inst, "lat decode err: %d", err);
> +		goto err_free_fb_out;
> +	}
> +
> +	/* wait decoder done interrupt */
> +	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
> +					       WAIT_INTR_TIMEOUT_MS, MTK_VDEC_LAT0);
> +	inst->vsi->dec.timeout = !!timeout;
> +
> +	err = vpu_dec_end(vpu);
> +	if (err == SLICE_HEADER_FULL || timeout || err == TRANS_BUFFER_FULL) {
> +		err = -EINVAL;
> +		goto err_free_fb_out;
> +	}
> +
> +	share_info->trans_end = inst->ctx->msg_queue.wdma_addr.dma_addr +
> +		inst->vsi->wdma_end_addr_offset;
> +	share_info->trans_start = inst->ctx->msg_queue.wdma_wptr_addr;
> +	share_info->nal_info = inst->vsi->dec.nal_info;
> +	vdec_msg_queue_update_ube_wptr(&lat_buf->ctx->msg_queue,
> +				       share_info->trans_end);
> +
> +	memcpy_fromio(&share_info->h264_slice_params, &inst->vsi->h264_slice_params,
> +		      sizeof(share_info->h264_slice_params));
> +	vdec_msg_queue_qbuf(&inst->ctx->dev->msg_queue_core_ctx, lat_buf);
> +
> +	inst->slice_dec_num++;
> +	return 0;
> +
> +err_free_fb_out:
> +	mtk_vcodec_err(inst, "slice dec number: %d err: %d", inst->slice_dec_num, err);
> +	return err;
> +}
> +
> +static int vdec_h264_slice_get_param(void *h_vdec, enum vdec_get_param_type type,
> +				     void *out)
> +{
> +	struct vdec_h264_slice_inst *inst = h_vdec;
> +
> +	switch (type) {
> +	case GET_PARAM_PIC_INFO:
> +		vdec_h264_slice_get_pic_info(inst);
> +		break;
> +	case GET_PARAM_DPB_SIZE:
> +		*(unsigned int *)out = 6;
> +		break;
> +	case GET_PARAM_CROP_INFO:
> +		vdec_h264_slice_get_crop_info(inst, out);
> +		break;
> +	default:
> +		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +const struct vdec_common_if vdec_h264_slice_lat_if = {
> +	.init		= vdec_h264_slice_init,
> +	.decode		= vdec_h264_slice_decode,
> +	.get_param	= vdec_h264_slice_get_param,
> +	.deinit		= vdec_h264_slice_deinit,
> +};
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> index c93dd0ea3537..c17a7815e1bb 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> @@ -20,7 +20,13 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
>  
>  	switch (fourcc) {
>  	case V4L2_PIX_FMT_H264_SLICE:
> -		ctx->dec_if = &vdec_h264_slice_if;
> +		if (ctx->dev->vdec_pdata->hw_arch == MTK_VDEC_PURE_SINGLE_CORE) {
> +			ctx->dec_if = &vdec_h264_slice_if;
> +			ctx->hw_id = MTK_VDEC_CORE;
> +		} else {
> +			ctx->dec_if = &vdec_h264_slice_lat_if;
> +			ctx->hw_id = MTK_VDEC_LAT0;
> +		}
>  		break;
>  	case V4L2_PIX_FMT_H264:
>  		ctx->dec_if = &vdec_h264_if;
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> index d467e8af4a84..6ce848e74167 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> @@ -56,6 +56,7 @@ struct vdec_fb_node {
>  
>  extern const struct vdec_common_if vdec_h264_if;
>  extern const struct vdec_common_if vdec_h264_slice_if;
> +extern const struct vdec_common_if vdec_h264_slice_lat_if;
>  extern const struct vdec_common_if vdec_vp8_if;
>  extern const struct vdec_common_if vdec_vp9_if;
>  
> diff --git a/include/linux/remoteproc/mtk_scp.h b/include/linux/remoteproc/mtk_scp.h
> index b47416f7aeb8..7c2b7cc9fe6c 100644
> --- a/include/linux/remoteproc/mtk_scp.h
> +++ b/include/linux/remoteproc/mtk_scp.h
> @@ -41,6 +41,8 @@ enum scp_ipi_id {
>  	SCP_IPI_ISP_FRAME,
>  	SCP_IPI_FD_CMD,
>  	SCP_IPI_CROS_HOST_CMD,
> +	SCP_IPI_VDEC_LAT,
> +	SCP_IPI_VDEC_CORE,
>  	SCP_IPI_NS_SERVICE = 0xFF,
>  	SCP_IPI_MAX = 0x100,
>  };



* Re: [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding
  2022-02-23  3:40 ` [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding Yunfei Dong
@ 2022-03-01 22:15   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 22:15 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Thanks for this work.

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> Add support for VP8 decoding using the stateless API,
> as supported by MT8192.

Once the struct member naming is made consistent, and even though I would like
your patch better if it did not duplicate so much code, I'll give you my:

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> ---
>  drivers/media/platform/mtk-vcodec/Makefile    |   1 +
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |  24 +-
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |   1 +
>  .../mtk-vcodec/vdec/vdec_vp8_req_if.c         | 445 ++++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |   4 +
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |   1 +
>  6 files changed, 474 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
> 
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index 22edb1c86598..b457daf2d196 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -7,6 +7,7 @@ obj-$(CONFIG_VIDEO_MEDIATEK_VCODEC) += mtk-vcodec-dec.o \
>  
>  mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>  		vdec/vdec_vp8_if.o \
> +		vdec/vdec_vp8_req_if.o \
>  		vdec/vdec_vp9_if.o \
>  		vdec/vdec_h264_req_if.o \
>  		vdec/vdec_h264_req_common.o \
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> index 9333e3418b98..2a0164ddc708 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> @@ -76,13 +76,28 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
>  			.max = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
>  		},
>  		.codec_type = V4L2_PIX_FMT_H264_SLICE,
> +	},
> +	{
> +		.cfg = {
> +			.id = V4L2_CID_STATELESS_VP8_FRAME,
> +		},
> +		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
> +	},
> +	{
> +		.cfg = {
> +			.id = V4L2_CID_MPEG_VIDEO_VP8_PROFILE,
> +			.min = V4L2_MPEG_VIDEO_VP8_PROFILE_0,
> +			.def = V4L2_MPEG_VIDEO_VP8_PROFILE_0,
> +			.max = V4L2_MPEG_VIDEO_VP8_PROFILE_3,
> +		},
> +		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
>  	}
>  };
>  
>  #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
>  
> -static struct mtk_video_fmt mtk_video_formats[3];
> -static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
> +static struct mtk_video_fmt mtk_video_formats[4];
> +static struct mtk_codec_framesizes mtk_vdec_framesizes[2];
>  
>  static struct mtk_video_fmt default_out_format;
>  static struct mtk_video_fmt default_cap_format;
> @@ -350,6 +365,7 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
>  
>  	switch (fourcc) {
>  	case V4L2_PIX_FMT_H264_SLICE:
> +	case V4L2_PIX_FMT_VP8_FRAME:
>  		mtk_video_formats[count_formats].fourcc = fourcc;
>  		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
>  		mtk_video_formats[count_formats].num_planes = 1;
> @@ -393,6 +409,10 @@ static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
>  		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
>  		out_format_count++;
>  	}
> +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_VP8_FRAME) {
> +		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP8_FRAME, ctx);
> +		out_format_count++;
> +	}
>  
>  	if (cap_format_count)
>  		default_cap_format = mtk_video_formats[cap_format_count - 1];
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index d60561065656..c68297db225e 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -354,6 +354,7 @@ enum mtk_vdec_format_types {
>  	MTK_VDEC_FORMAT_MM21 = 0x20,
>  	MTK_VDEC_FORMAT_MT21C = 0x40,
>  	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
> +	MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
>  };
>  
>  /**
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
> new file mode 100644
> index 000000000000..6bd4f2365826
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp8_req_if.c
> @@ -0,0 +1,445 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + * Author: Yunfei Dong <yunfei.dong@mediatek.com>
> + */
> +
> +#include <linux/slab.h>
> +#include <media/v4l2-mem2mem.h>
> +#include <media/videobuf2-dma-contig.h>
> +#include <uapi/linux/v4l2-controls.h>
> +
> +#include "../mtk_vcodec_util.h"
> +#include "../mtk_vcodec_dec.h"
> +#include "../mtk_vcodec_intr.h"
> +#include "../vdec_drv_base.h"
> +#include "../vdec_drv_if.h"
> +#include "../vdec_vpu_if.h"
> +
> +/* Decoding picture buffer size (3 reference frames plus current frame) */
> +#define VP8_DPB_SIZE 4
> +
> +/* HW working buffer size (bytes) */
> +#define VP8_SEG_ID_SZ   SZ_256K
> +#define VP8_PP_WRAPY_SZ SZ_64K
> +#define VP8_PP_WRAPC_SZ SZ_64K
> +#define VP8_VLD_PRED_SZ SZ_64K
> +
> +/**
> + * struct vdec_vp8_slice_info - decode misc information

Same missing empty line here.

> + * @vld_wrapper_dma   : vld wrapper dma address
> + * @seg_id_buf_dma    : seg id dma address
> + * @wrap_y_dma        : wrap y dma address
> + * @wrap_c_dma        : wrap y dma address
> + * @cur_y_fb_dma      : current plane Y frame buffer dma address
> + * @cur_c_fb_dma      : current plane C frame buffer dma address
> + * @bs_dma            : bitstream dma address
> + * @bs_sz             : bitstream size
> + * @resolution_changed: resolution change flag 1 - changed,  0 - not change
> + * @frame_header_type : current frame header type
> + * @wait_key_frame    : wait key frame coming
> + * @crc               : used to check whether hardware's status is right
> + * @reserved:         : reserved, currently unused
> + */
> +struct vdec_vp8_slice_info {
> +	u64 vld_wrapper_dma;
> +	u64 seg_id_buf_dma;
> +	u64 wrap_y_dma;
> +	u64 wrap_c_dma;
> +	u64 cur_y_fb_dma;
> +	u64 cur_c_fb_dma;
> +	u64 bs_dma;
> +	u32 bs_sz;
> +	u32 resolution_changed;
> +	u32 frame_header_type;
> +	u32 crc[8];
> +	u32 reserved;
> +};
> +
> +/**
> + * struct vdec_vp8_slice_dpb_info  - vp8 reference information
> + * @y_dma_addr    : Y bitstream physical address
> + * @c_dma_addr    : CbCr bitstream physical address
> + * @reference_flag: reference picture flag
> + * @reserved      : 64bit align
> + */
> +struct vdec_vp8_slice_dpb_info {
> +	dma_addr_t y_dma_addr;
> +	dma_addr_t c_dma_addr;
> +	int reference_flag;
> +	int reserved;
> +};
> +
> +/**
> + * struct vdec_vp8_slice_vsi - VPU shared information
> + * @dec          : decoding information
> + * @pic          : picture information
> + * @vp8_dpb_info : reference buffer information
> + */
> +struct vdec_vp8_slice_vsi {
> +	struct vdec_vp8_slice_info dec;
> +	struct vdec_pic_info pic;

This is not consistent: this is called picinfo in the H.264 implementation.

> +	struct vdec_vp8_slice_dpb_info vp8_dpb_info[3];
> +};
> +
> +/**
> + * struct vdec_vp8_slice_inst - VP8 decoder instance
> + * @seg_id_buf     : seg buffer
> + * @wrap_y_buf     : wrapper y buffer
> + * @wrap_c_buf     : wrapper c buffer
> + * @vld_wrapper_buf: vld wrapper buffer
> + * @ctx            : V4L2 context
> + * @vpu            : VPU instance for decoder
> + * @vsi            : VPU share information
> + */
> +struct vdec_vp8_slice_inst {
> +	struct mtk_vcodec_mem seg_id_buf;
> +	struct mtk_vcodec_mem wrap_y_buf;
> +	struct mtk_vcodec_mem wrap_c_buf;
> +	struct mtk_vcodec_mem vld_wrapper_buf;
> +	struct mtk_vcodec_ctx *ctx;
> +	struct vdec_vpu_inst vpu;
> +	struct vdec_vp8_slice_vsi *vsi;
> +};
> +
> +static void *vdec_vp8_slice_get_ctrl_ptr(struct mtk_vcodec_ctx *ctx, int id)
> +{
> +	struct v4l2_ctrl *ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, id);
> +
> +	if (!ctrl)
> +		return ERR_PTR(-EINVAL);
> +
> +	return ctrl->p_cur.p;
> +}
> +
> +static void vdec_vp8_slice_get_crop_info(struct vdec_vp8_slice_inst *inst,
> +					 struct v4l2_rect *cr)
> +{
> +	cr->left = 0;
> +	cr->top = 0;
> +	cr->width = inst->vsi->pic.pic_w;
> +	cr->height = inst->vsi->pic.pic_h;
> +	mtk_vcodec_debug(inst, "get crop info l=%d, t=%d, w=%d, h=%d",
> +			 cr->left, cr->top, cr->width, cr->height);
> +}

There is clearly room for improvement: this is line-by-line identical to the
H.264 code. A lot in this file looks like copy-paste with minor edits. It's
probably not a roadblock, but it would be really nice to clean this up by
making the common code shared, to avoid loads of copy-paste, which may lead to
having to fix bugs in multiple places.
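
As a rough idea of the direction, here is a sketch of a codec-independent crop
helper. The types are hypothetical, reduced versions of struct vdec_pic_info
and struct v4l2_rect so that the snippet builds on its own, and the function
name is made up:

```c
#include <stdint.h>

/* Hypothetical reduced versions of vdec_pic_info and v4l2_rect. */
struct sk_pic_info {
	uint32_t pic_w;
	uint32_t pic_h;
};

struct sk_rect {
	int32_t left, top;
	uint32_t width, height;
};

/*
 * Codec-independent crop helper: every decoder instance passes its own
 * picture info, so the H.264 and VP8 files no longer need a private copy.
 */
static void sk_vdec_get_crop_info(const struct sk_pic_info *pic,
				  struct sk_rect *cr)
{
	cr->left = 0;
	cr->top = 0;
	cr->width = pic->pic_w;
	cr->height = pic->pic_h;
}
```

Each codec would then call the helper with whichever picture info it keeps,
ctx->picinfo or vsi->pic.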

> +
> +static void vdec_vp8_slice_get_pic_info(struct vdec_vp8_slice_inst *inst)
> +{
> +	struct mtk_vcodec_ctx *ctx = inst->ctx;
> +	unsigned int data[3];
> +
> +	data[0] = ctx->picinfo.pic_w;
> +	data[1] = ctx->picinfo.pic_h;
> +	data[2] = ctx->capture_fourcc;
> +	vpu_dec_get_param(&inst->vpu, data, 3, GET_PARAM_PIC_INFO);
> +
> +	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
> +	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);
> +	ctx->picinfo.fb_sz[0] = inst->vpu.fb_sz[0];
> +	ctx->picinfo.fb_sz[1] = inst->vpu.fb_sz[1];
> +
> +	inst->vsi->pic.pic_w = ctx->picinfo.pic_w;
> +	inst->vsi->pic.pic_h = ctx->picinfo.pic_h;
> +	inst->vsi->pic.buf_w = ctx->picinfo.buf_w;
> +	inst->vsi->pic.buf_h = ctx->picinfo.buf_h;
> +	inst->vsi->pic.fb_sz[0] = ctx->picinfo.fb_sz[0];
> +	inst->vsi->pic.fb_sz[1] = ctx->picinfo.fb_sz[1];
> +	mtk_vcodec_debug(inst, "pic(%d, %d), buf(%d, %d)",
> +			 ctx->picinfo.pic_w, ctx->picinfo.pic_h,
> +			 ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> +	mtk_vcodec_debug(inst, "fb size: Y(%d), C(%d)",
> +			 ctx->picinfo.fb_sz[0], ctx->picinfo.fb_sz[1]);
> +}
> +
> +static int vdec_vp8_slice_alloc_working_buf(struct vdec_vp8_slice_inst *inst)
> +{
> +	int err;
> +	struct mtk_vcodec_mem *mem;
> +
> +	mem = &inst->seg_id_buf;
> +	mem->size = VP8_SEG_ID_SZ;
> +	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +	if (err) {
> +		mtk_vcodec_err(inst, "Cannot allocate working buffer");
> +		return err;
> +	}
> +	inst->vsi->dec.seg_id_buf_dma = (u64)mem->dma_addr;
> +
> +	mem = &inst->wrap_y_buf;
> +	mem->size = VP8_PP_WRAPY_SZ;
> +	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +	if (err) {
> +		mtk_vcodec_err(inst, "cannot allocate WRAP Y buffer");
> +		return err;
> +	}
> +	inst->vsi->dec.wrap_y_dma = (u64)mem->dma_addr;
> +
> +	mem = &inst->wrap_c_buf;
> +	mem->size = VP8_PP_WRAPC_SZ;
> +	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +	if (err) {
> +		mtk_vcodec_err(inst, "cannot allocate WRAP C buffer");
> +		return err;
> +	}
> +	inst->vsi->dec.wrap_c_dma = (u64)mem->dma_addr;
> +
> +	mem = &inst->vld_wrapper_buf;
> +	mem->size = VP8_VLD_PRED_SZ;
> +	err = mtk_vcodec_mem_alloc(inst->ctx, mem);
> +	if (err) {
> +		mtk_vcodec_err(inst, "cannot allocate vld wrapper buffer");
> +		return err;
> +	}
> +	inst->vsi->dec.vld_wrapper_dma = (u64)mem->dma_addr;
> +
> +	return 0;
> +}
> +
> +static void vdec_vp8_slice_free_working_buf(struct vdec_vp8_slice_inst *inst)
> +{
> +	struct mtk_vcodec_mem *mem;
> +
> +	mem = &inst->seg_id_buf;
> +	if (mem->va)
> +		mtk_vcodec_mem_free(inst->ctx, mem);
> +	inst->vsi->dec.seg_id_buf_dma = 0;
> +
> +	mem = &inst->wrap_y_buf;
> +	if (mem->va)
> +		mtk_vcodec_mem_free(inst->ctx, mem);
> +	inst->vsi->dec.wrap_y_dma = 0;
> +
> +	mem = &inst->wrap_c_buf;
> +	if (mem->va)
> +		mtk_vcodec_mem_free(inst->ctx, mem);
> +	inst->vsi->dec.wrap_c_dma = 0;
> +
> +	mem = &inst->vld_wrapper_buf;
> +	if (mem->va)
> +		mtk_vcodec_mem_free(inst->ctx, mem);
> +	inst->vsi->dec.vld_wrapper_dma = 0;
> +}
> +
> +static u64 vdec_vp8_slice_get_ref_by_ts(const struct v4l2_ctrl_vp8_frame *frame_header,
> +					int index)
> +{
> +	switch (index) {
> +	case 0:
> +		return frame_header->last_frame_ts;
> +	case 1:
> +		return frame_header->golden_frame_ts;
> +	case 2:
> +		return frame_header->alt_frame_ts;
> +	default:
> +		break;
> +	}
> +
> +	return -1;
> +}
> +
> +static int vdec_vp8_slice_get_decode_parameters(struct vdec_vp8_slice_inst *inst)
> +{
> +	const struct v4l2_ctrl_vp8_frame *frame_header;
> +	struct mtk_vcodec_ctx *ctx = inst->ctx;
> +	struct vb2_queue *vq;
> +	struct vb2_buffer *vb;
> +	u64 referenct_ts;
> +	int index, vb2_index;
> +
> +	frame_header = vdec_vp8_slice_get_ctrl_ptr(inst->ctx, V4L2_CID_STATELESS_VP8_FRAME);
> +	if (IS_ERR(frame_header))
> +		return PTR_ERR(frame_header);
> +
> +	vq = v4l2_m2m_get_vq(ctx->m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> +	for (index = 0; index < 3; index++) {
> +		referenct_ts = vdec_vp8_slice_get_ref_by_ts(frame_header, index);
> +		vb2_index = vb2_find_timestamp(vq, referenct_ts, 0);
> +		if (vb2_index < 0) {
> +			if (!V4L2_VP8_FRAME_IS_KEY_FRAME(frame_header))
> +				mtk_vcodec_err(inst, "reference invalid: index(%d) ts(%lld)",
> +					       index, referenct_ts);
> +			inst->vsi->vp8_dpb_info[index].reference_flag = 0;
> +			continue;
> +		}
> +		inst->vsi->vp8_dpb_info[index].reference_flag = 1;
> +
> +		vb = vq->bufs[vb2_index];
> +		inst->vsi->vp8_dpb_info[index].y_dma_addr =
> +			vb2_dma_contig_plane_dma_addr(vb, 0);
> +		if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
> +			inst->vsi->vp8_dpb_info[index].c_dma_addr =
> +				vb2_dma_contig_plane_dma_addr(vb, 1);
> +		else
> +			inst->vsi->vp8_dpb_info[index].c_dma_addr =
> +				inst->vsi->vp8_dpb_info[index].y_dma_addr +
> +				ctx->picinfo.fb_sz[0];
> +	}
> +
> +	inst->vsi->dec.frame_header_type = frame_header->flags >> 1;
> +
> +	return 0;
> +}
> +
> +static int vdec_vp8_slice_init(struct mtk_vcodec_ctx *ctx)
> +{
> +	struct vdec_vp8_slice_inst *inst;
> +	int err;
> +
> +	inst = kzalloc(sizeof(*inst), GFP_KERNEL);
> +	if (!inst)
> +		return -ENOMEM;
> +
> +	inst->ctx = ctx;
> +
> +	inst->vpu.id = SCP_IPI_VDEC_LAT;
> +	inst->vpu.core_id = SCP_IPI_VDEC_CORE;
> +	inst->vpu.ctx = ctx;
> +	inst->vpu.codec_type = ctx->current_codec;
> +	inst->vpu.capture_type = ctx->capture_fourcc;
> +
> +	err = vpu_dec_init(&inst->vpu);
> +	if (err) {
> +		mtk_vcodec_err(inst, "vdec_vp8 init err=%d", err);
> +		goto error_free_inst;
> +	}
> +
> +	inst->vsi = inst->vpu.vsi;
> +	err = vdec_vp8_slice_alloc_working_buf(inst);
> +	if (err)
> +		goto error_deinit;
> +
> +	mtk_vcodec_debug(inst, "vp8 struct size = %d vsi: %d\n",
> +			 (int)sizeof(struct v4l2_ctrl_vp8_frame),
> +			 (int)sizeof(struct vdec_vp8_slice_vsi));
> +	mtk_vcodec_debug(inst, "vp8:%p, codec_type = 0x%x vsi: 0x%p",
> +			 inst, inst->vpu.codec_type, inst->vpu.vsi);
> +
> +	ctx->drv_handle = inst;
> +	return 0;
> +
> +error_deinit:
> +	vpu_dec_deinit(&inst->vpu);
> +error_free_inst:
> +	kfree(inst);
> +	return err;
> +}
> +
> +static int vdec_vp8_slice_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> +				 struct vdec_fb *fb, bool *res_chg)
> +{
> +	struct vdec_vp8_slice_inst *inst = h_vdec;
> +	struct vdec_vpu_inst *vpu = &inst->vpu;
> +	struct mtk_video_dec_buf *src_buf_info, *dst_buf_info;
> +	unsigned int data;
> +	u64 y_fb_dma, c_fb_dma;
> +	int err, timeout;
> +
> +	/* Resolution changes are never initiated by us */
> +	*res_chg = false;
> +
> +	/* bs NULL means flush decoder */
> +	if (!bs)
> +		return vpu_dec_reset(vpu);
> +
> +	src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
> +
> +	fb = inst->ctx->dev->vdec_pdata->get_cap_buffer(inst->ctx);
> +	dst_buf_info = container_of(fb, struct mtk_video_dec_buf, frame_buffer);
> +
> +	y_fb_dma = fb ? (u64)fb->base_y.dma_addr : 0;
> +	if (inst->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
> +		c_fb_dma = y_fb_dma +
> +			inst->ctx->picinfo.buf_w * inst->ctx->picinfo.buf_h;
> +	else
> +		c_fb_dma = fb ? (u64)fb->base_c.dma_addr : 0;
> +
> +	inst->vsi->dec.bs_dma = (u64)bs->dma_addr;
> +	inst->vsi->dec.bs_sz = bs->size;
> +	inst->vsi->dec.cur_y_fb_dma = y_fb_dma;
> +	inst->vsi->dec.cur_c_fb_dma = c_fb_dma;
> +
> +	mtk_vcodec_debug(inst, "frame[%d] bs(%zu 0x%llx) y/c(0x%llx 0x%llx)",
> +			 inst->ctx->decoded_frame_cnt,
> +			 bs->size, (u64)bs->dma_addr,
> +			 y_fb_dma, c_fb_dma);
> +
> +	v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb,
> +				   &dst_buf_info->m2m_buf.vb, true);
> +
> +	err = vdec_vp8_slice_get_decode_parameters(inst);
> +	if (err)
> +		goto error;
> +
> +	err = vpu_dec_start(vpu, &data, 1);
> +	if (err) {
> +		mtk_vcodec_debug(inst, "vp8 dec start err!");
> +		goto error;
> +	}
> +
> +	if (inst->vsi->dec.resolution_changed) {
> +		mtk_vcodec_debug(inst, "- resolution_changed -");
> +		*res_chg = true;
> +		return 0;
> +	}
> +
> +	/* wait decode done interrupt */
> +	timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
> +					       50, MTK_VDEC_CORE);
> +
> +	err = vpu_dec_end(vpu);
> +	if (err || timeout)
> +		mtk_vcodec_debug(inst, "vp8 dec error timeout:%d err: %d pic_%d",
> +				 timeout, err, inst->ctx->decoded_frame_cnt);
> +
> +	mtk_vcodec_debug(inst, "pic[%d] crc: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x",
> +			 inst->ctx->decoded_frame_cnt,
> +			 inst->vsi->dec.crc[0], inst->vsi->dec.crc[1],
> +			 inst->vsi->dec.crc[2], inst->vsi->dec.crc[3],
> +			 inst->vsi->dec.crc[4], inst->vsi->dec.crc[5],
> +			 inst->vsi->dec.crc[6], inst->vsi->dec.crc[7]);
> +
> +	inst->ctx->decoded_frame_cnt++;
> +error:
> +	inst->ctx->dev->vdec_pdata->cap_to_disp(inst->ctx, fb, !!err);
> +	return err;
> +}
> +
> +static int vdec_vp8_slice_get_param(void *h_vdec, enum vdec_get_param_type type, void *out)
> +{
> +	struct vdec_vp8_slice_inst *inst = h_vdec;
> +
> +	switch (type) {
> +	case GET_PARAM_PIC_INFO:
> +		vdec_vp8_slice_get_pic_info(inst);
> +		break;
> +	case GET_PARAM_CROP_INFO:
> +		vdec_vp8_slice_get_crop_info(inst, out);
> +		break;
> +	case GET_PARAM_DPB_SIZE:
> +		*((unsigned int *)out) = VP8_DPB_SIZE;
> +		break;
> +	default:
> +		mtk_vcodec_err(inst, "invalid get parameter type=%d", type);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static void vdec_vp8_slice_deinit(void *h_vdec)
> +{
> +	struct vdec_vp8_slice_inst *inst = h_vdec;
> +
> +	mtk_vcodec_debug_enter(inst);
> +
> +	vpu_dec_deinit(&inst->vpu);
> +	vdec_vp8_slice_free_working_buf(inst);
> +	kfree(inst);
> +}
> +
> +const struct vdec_common_if vdec_vp8_slice_if = {
> +	.init		= vdec_vp8_slice_init,
> +	.decode		= vdec_vp8_slice_decode,
> +	.get_param	= vdec_vp8_slice_get_param,
> +	.deinit		= vdec_vp8_slice_deinit,
> +};
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> index c17a7815e1bb..9db9a57da2c1 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> @@ -32,6 +32,10 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
>  		ctx->dec_if = &vdec_h264_if;
>  		ctx->hw_id = MTK_VDEC_CORE;
>  		break;
> +	case V4L2_PIX_FMT_VP8_FRAME:
> +		ctx->dec_if = &vdec_vp8_slice_if;
> +		ctx->hw_id = MTK_VDEC_CORE;
> +		break;
>  	case V4L2_PIX_FMT_VP8:
>  		ctx->dec_if = &vdec_vp8_if;
>  		ctx->hw_id = MTK_VDEC_CORE;
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> index 6ce848e74167..e3adf8f36342 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> @@ -58,6 +58,7 @@ extern const struct vdec_common_if vdec_h264_if;
>  extern const struct vdec_common_if vdec_h264_slice_if;
>  extern const struct vdec_common_if vdec_h264_slice_lat_if;
>  extern const struct vdec_common_if vdec_vp8_if;
> +extern const struct vdec_common_if vdec_vp8_slice_if;
>  extern const struct vdec_common_if vdec_vp9_if;
>  
>  /**


* Re: [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding
  2022-02-23  3:40 ` [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding Yunfei Dong
@ 2022-03-01 22:22   ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-03-01 22:22 UTC (permalink / raw)
  To: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

On Wednesday, 23 February 2022 at 11:40 +0800, Yunfei Dong wrote:
> Add support for VP9 decoding using the stateless API,
> as supported by MT8192. And the drivers is lat and core architecture.

You already have a Reviewed-by tag, but I'm under the impression that there is
a fair amount of duplication with the v4l2-vp9 helper library:

  include/media/v4l2-vp9.h
  drivers/media/v4l2-core/v4l2-vp9.c

Can you at least take a look and comment on why you can't use or adapt it for
this driver?
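
To illustrate the kind of overlap in question: the helper library covers the
backward probability adaptation (v4l2_vp9_adapt_coef_probs() and
v4l2_vp9_adapt_noncoef_probs()), and the tree tables in this patch suggest the
driver carries its own version of that step. This is an impression from the
patch, not something verified line by line. Below is a userspace sketch of the
merge_prob() step as described in the VP9 specification; the function names
are illustrative and it is not code from either the driver or the library:

```c
#include <stdint.h>

/* Clamp a probability into the valid VP9 range [1, 255]. */
static uint8_t vp9_clip_prob(uint64_t p)
{
	if (p < 1)
		return 1;
	if (p > 255)
		return 255;
	return (uint8_t)p;
}

/*
 * Blend the previous probability with the one implied by the observed
 * counts. ct0/ct1 are the number of times each branch was taken;
 * count_sat caps how much weight the counts get, and max_update_factor
 * (out of 256) is the largest share the new estimate may receive.
 */
static uint8_t vp9_merge_prob(uint8_t pre_prob, uint32_t ct0, uint32_t ct1,
			      uint32_t count_sat, uint32_t max_update_factor)
{
	uint64_t den = (uint64_t)ct0 + ct1;
	uint64_t prob, count, factor;

	/* With no observations the estimate is neutral and gets zero weight. */
	if (den == 0)
		prob = 128;
	else
		prob = vp9_clip_prob(((uint64_t)ct0 * 256 + (den >> 1)) / den);

	count = den < count_sat ? den : count_sat;
	factor = max_update_factor * count / count_sat;

	/* Weighted average, rounded to nearest (Round2(x, 8) in the spec). */
	return (uint8_t)((pre_prob * (256 - factor) + prob * factor + 128) >> 8);
}
```

For example, with count_sat = 20 and max_update_factor = 128 as example
values, vp9_merge_prob(128, 20, 0, 20, 128) moves the probability halfway
towards 255 and yields 192, while zero counts leave pre_prob unchanged.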

> 
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> Signed-off-by: George Sun <george.sun@mediatek.com>
> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>  drivers/media/platform/mtk-vcodec/Makefile    |    1 +
>  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |   26 +-
>  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |    1 +
>  .../mtk-vcodec/vdec/vdec_vp9_req_lat_if.c     | 1971 +++++++++++++++++
>  .../media/platform/mtk-vcodec/vdec_drv_if.c   |    4 +
>  .../media/platform/mtk-vcodec/vdec_drv_if.h   |    1 +
>  6 files changed, 2001 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c
> 
> diff --git a/drivers/media/platform/mtk-vcodec/Makefile b/drivers/media/platform/mtk-vcodec/Makefile
> index b457daf2d196..93e7a343b5b0 100644
> --- a/drivers/media/platform/mtk-vcodec/Makefile
> +++ b/drivers/media/platform/mtk-vcodec/Makefile
> @@ -9,6 +9,7 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
>  		vdec/vdec_vp8_if.o \
>  		vdec/vdec_vp8_req_if.o \
>  		vdec/vdec_vp9_if.o \
> +		vdec/vdec_vp9_req_lat_if.o \
>  		vdec/vdec_h264_req_if.o \
>  		vdec/vdec_h264_req_common.o \
>  		vdec/vdec_h264_req_multi_if.o \
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> index 2a0164ddc708..3770e8117488 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> @@ -91,13 +91,28 @@ static const struct mtk_stateless_control mtk_stateless_controls[] = {
>  			.max = V4L2_MPEG_VIDEO_VP8_PROFILE_3,
>  		},
>  		.codec_type = V4L2_PIX_FMT_VP8_FRAME,
> -	}
> +	},
> +	{
> +		.cfg = {
> +			.id = V4L2_CID_STATELESS_VP9_FRAME,
> +		},
> +		.codec_type = V4L2_PIX_FMT_VP9_FRAME,
> +	},
> +	{
> +		.cfg = {
> +			.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE,
> +			.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
> +			.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0,
> +			.max = V4L2_MPEG_VIDEO_VP9_PROFILE_3,
> +		},
> +		.codec_type = V4L2_PIX_FMT_VP9_FRAME,
> +	},
>  };
>  
>  #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
>  
> -static struct mtk_video_fmt mtk_video_formats[4];
> -static struct mtk_codec_framesizes mtk_vdec_framesizes[2];
> +static struct mtk_video_fmt mtk_video_formats[5];
> +static struct mtk_codec_framesizes mtk_vdec_framesizes[3];
>  
>  static struct mtk_video_fmt default_out_format;
>  static struct mtk_video_fmt default_cap_format;
> @@ -366,6 +381,7 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
>  	switch (fourcc) {
>  	case V4L2_PIX_FMT_H264_SLICE:
>  	case V4L2_PIX_FMT_VP8_FRAME:
> +	case V4L2_PIX_FMT_VP9_FRAME:
>  		mtk_video_formats[count_formats].fourcc = fourcc;
>  		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
>  		mtk_video_formats[count_formats].num_planes = 1;
> @@ -413,6 +429,10 @@ static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx *ctx)
>  		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP8_FRAME, ctx);
>  		out_format_count++;
>  	}
> +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_VP9_FRAME) {
> +		mtk_vcodec_add_formats(V4L2_PIX_FMT_VP9_FRAME, ctx);
> +		out_format_count++;
> +	}
>  
>  	if (cap_format_count)
>  		default_cap_format = mtk_video_formats[cap_format_count - 1];
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> index c68297db225e..ea58f11e7659 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> @@ -355,6 +355,7 @@ enum mtk_vdec_format_types {
>  	MTK_VDEC_FORMAT_MT21C = 0x40,
>  	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
>  	MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
> +	MTK_VDEC_FORMAT_VP9_FRAME = 0x400,
>  };
>  
>  /**
> diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c
> new file mode 100644
> index 000000000000..c678170c7ca3
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_req_lat_if.c
> @@ -0,0 +1,1971 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + * Author: George Sun <george.sun@mediatek.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <media/videobuf2-dma-contig.h>
> +
> +#include "../mtk_vcodec_util.h"
> +#include "../mtk_vcodec_dec.h"
> +#include "../mtk_vcodec_intr.h"
> +#include "../vdec_drv_base.h"
> +#include "../vdec_drv_if.h"
> +#include "../vdec_vpu_if.h"
> +
> +/* reset_frame_context defined in VP9 spec */
> +#define VP9_RESET_FRAME_CONTEXT_NONE0 0
> +#define VP9_RESET_FRAME_CONTEXT_NONE1 1
> +#define VP9_RESET_FRAME_CONTEXT_SPEC 2
> +#define VP9_RESET_FRAME_CONTEXT_ALL 3
> +
> +#define VP9_TILE_BUF_SIZE 4096
> +#define VP9_PROB_BUF_SIZE 2560
> +#define VP9_COUNTS_BUF_SIZE 16384
> +
> +#define HDR_FLAG(x) (!!((hdr)->flags & V4L2_VP9_FRAME_FLAG_##x))
> +#define LF_FLAG(x) (!!((lf)->flags & V4L2_VP9_LOOP_FILTER_FLAG_##x))
> +#define SEG_FLAG(x) (!!((seg)->flags & V4L2_VP9_SEGMENTATION_FLAG_##x))
> +
> +/*
> + * struct vdec_vp9_slice_frame_ctx - vp9 prob tables footprint
> + */
> +struct vdec_vp9_slice_frame_ctx {
> +	struct {
> +		u8 probs[6][3];
> +		u8 padding[2];
> +	} coef_probs[4][2][2][6];
> +
> +	u8 y_mode_prob[4][16];
> +	u8 switch_interp_prob[4][16];
> +	u8 seg[32];  /* ignore */
> +	u8 comp_inter_prob[16];
> +	u8 comp_ref_prob[16];
> +	u8 single_ref_prob[5][2];
> +	u8 single_ref_prob_padding[6];
> +
> +	u8 joint[3];
> +	u8 joint_padding[13];
> +	struct {
> +		u8 sign;
> +		u8 classes[10];
> +		u8 padding[5];
> +	} sign_classes[2];
> +	struct {
> +		u8 class0[1];
> +		u8 bits[10];
> +		u8 padding[5];
> +	} class0_bits[2];
> +	struct {
> +		u8 class0_fp[2][3];
> +		u8 fp[3];
> +		u8 class0_hp;
> +		u8 hp;
> +		u8 padding[5];
> +	} class0_fp_hp[2];
> +
> +	u8 uv_mode_prob[10][16];
> +	u8 uv_mode_prob_padding[2][16];
> +
> +	u8 partition_prob[16][4];
> +
> +	u8 inter_mode_probs[7][4];
> +	u8 skip_probs[4];
> +
> +	u8 tx_p8x8[2][4];
> +	u8 tx_p16x16[2][4];
> +	u8 tx_p32x32[2][4];
> +	u8 intra_inter_prob[8];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_frame_counts - vp9 counts tables footprint
> + */
> +struct vdec_vp9_slice_frame_counts {
> +	union {
> +		struct {
> +			u32 band_0[3];
> +			u32 padding0[1];
> +			u32 band_1_5[5][6];
> +			u32 padding1[2];
> +		} eob_branch[4][2][2];
> +		u32 eob_branch_space[256 * 4];
> +	};
> +
> +	struct {
> +		u32 band_0[3][4];
> +		u32 band_1_5[5][6][4];
> +	} coef_probs[4][2][2];
> +
> +	u32 intra_inter[4][2];
> +	u32 comp_inter[5][2];
> +	u32 comp_inter_padding[2];
> +	u32 comp_ref[5][2];
> +	u32 comp_ref_padding[2];
> +	u32 single_ref[5][2][2];
> +	u32 inter_mode[7][4];
> +	u32 y_mode[4][12];
> +	u32 uv_mode[10][10];
> +	u32 partition[16][4];
> +	u32 switchable_interp[4][4];
> +
> +	u32 tx_p8x8[2][2];
> +	u32 tx_p16x16[2][4];
> +	u32 tx_p32x32[2][4];
> +
> +	u32 skip[3][4];
> +
> +	u32 joint[4];
> +
> +	struct {
> +		u32 sign[2];
> +		u32 class0[2];
> +		u32 classes[12];
> +		u32 bits[10][2];
> +		u32 padding[4];
> +		u32 class0_fp[2][4];
> +		u32 fp[4];
> +		u32 class0_hp[2];
> +		u32 hp[2];
> +	} mvcomp[2];
> +
> +	u32 reserved[126][4];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_uncompressed_header - vp9 uncompressed header syntax
> + *                                             used for decoding
> + */
> +struct vdec_vp9_slice_uncompressed_header {
> +	u8 profile;
> +	u8 last_frame_type;
> +	u8 frame_type;
> +
> +	u8 last_show_frame;
> +	u8 show_frame;
> +	u8 error_resilient_mode;
> +
> +	u8 bit_depth;
> +	u8 padding0[1];
> +	u16 last_frame_width;
> +	u16 last_frame_height;
> +	u16 frame_width;
> +	u16 frame_height;
> +
> +	u8 intra_only;
> +	u8 reset_frame_context;
> +	u8 ref_frame_sign_bias[4];
> +	u8 allow_high_precision_mv;
> +	u8 interpolation_filter;
> +
> +	u8 refresh_frame_context;
> +	u8 frame_parallel_decoding_mode;
> +	u8 frame_context_idx;
> +
> +	/* loop_filter_params */
> +	u8 loop_filter_level;
> +	u8 loop_filter_sharpness;
> +	u8 loop_filter_delta_enabled;
> +	s8 loop_filter_ref_deltas[4];
> +	s8 loop_filter_mode_deltas[2];
> +
> +	/* quantization_params */
> +	u8 base_q_idx;
> +	s8 delta_q_y_dc;
> +	s8 delta_q_uv_dc;
> +	s8 delta_q_uv_ac;
> +
> +	/* segmentation_params */
> +	u8 segmentation_enabled;
> +	u8 segmentation_update_map;
> +	u8 segmentation_tree_probs[7];
> +	u8 padding1[1];
> +	u8 segmentation_temporal_udpate;
> +	u8 segmentation_pred_prob[3];
> +	u8 segmentation_update_data;
> +	u8 segmentation_abs_or_delta_update;
> +	u8 feature_enabled[8];
> +	s16 feature_value[8][4];
> +
> +	/* tile_info */
> +	u8 tile_cols_log2;
> +	u8 tile_rows_log2;
> +	u8 padding2[2];
> +
> +	u16 uncompressed_header_size;
> +	u16 header_size_in_bytes;
> +
> +	/* LAT OUT, CORE IN */
> +	u32 dequant[8][4];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_compressed_header - vp9 compressed header syntax
> + *                                           used for decoding.
> + */
> +struct vdec_vp9_slice_compressed_header {
> +	u8 tx_mode;
> +	u8 ref_mode;
> +	u8 comp_fixed_ref;
> +	u8 comp_var_ref[2];
> +	u8 padding[3];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_tiles - vp9 tile syntax
> + */
> +struct vdec_vp9_slice_tiles {
> +	u32 size[4][64];
> +	u32 mi_rows[4];
> +	u32 mi_cols[64];
> +	u8 actual_rows;
> +	u8 padding[7];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_reference - vp9 reference frame information
> + */
> +struct vdec_vp9_slice_reference {
> +	u16 frame_width;
> +	u16 frame_height;
> +	u8 bit_depth;
> +	u8 subsampling_x;
> +	u8 subsampling_y;
> +	u8 padding;
> +};
> +
> +/*
> + * struct vdec_vp9_slice_frame - vp9 syntax used for decoding
> + */
> +struct vdec_vp9_slice_frame {
> +	struct vdec_vp9_slice_uncompressed_header uh;
> +	struct vdec_vp9_slice_compressed_header ch;
> +	struct vdec_vp9_slice_tiles tiles;
> +	struct vdec_vp9_slice_reference ref[3];
> +};
> +
> +/*
> + * struct vdec_vp9_slice_init_vsi - VSI used to initialize instance
> + */
> +struct vdec_vp9_slice_init_vsi {
> +	unsigned int architecture;
> +	unsigned int reserved;
> +	u64 core_vsi;
> +	/* default frame context's position in MicroP */
> +	u64 default_frame_ctx;
> +};
> +
> +/*
> + * struct vdec_vp9_slice_mem - memory address and size
> + */
> +struct vdec_vp9_slice_mem {
> +	union {
> +		u64 buf;
> +		dma_addr_t dma_addr;
> +	};
> +	union {
> +		size_t size;
> +		dma_addr_t dma_addr_end;
> +		u64 padding;
> +	};
> +};
> +
> +/*
> + * struct vdec_vp9_slice_bs - input buffer for decoding
> + */
> +struct vdec_vp9_slice_bs {
> +	struct vdec_vp9_slice_mem buf;
> +	struct vdec_vp9_slice_mem frame;
> +};
> +
> +/*
> + * struct vdec_vp9_slice_fb - frame buffer for decoding
> + */
> +struct vdec_vp9_slice_fb {
> +	struct vdec_vp9_slice_mem y;
> +	struct vdec_vp9_slice_mem c;
> +};
> +
> +/*
> + * struct vdec_vp9_slice_state - decoding state
> + */
> +struct vdec_vp9_slice_state {
> +	int err;
> +	unsigned int full;
> +	unsigned int timeout;
> +	unsigned int perf;
> +
> +	unsigned int crc[12];
> +};
> +
> +/**
> + * struct vdec_vp9_slice_vsi - exchange decoding information
> + *                             between Main CPU and MicroP
> + * @bs          : input buffer
> + * @fb          : output buffer
> + * @ref         : 3 reference buffers
> + * @mv          : mv working buffer
> + * @seg         : segmentation working buffer
> + * @tile        : tile buffer
> + * @prob        : prob table buffer, used to set/update prob table
> + * @counts      : counts table buffer, used to update prob table
> + * @ube         : general buffer
> + * @trans       : trans buffer position in general buffer
> + * @err_map     : error buffer
> + * @row_info    : row info buffer
> + * @frame       : decoding syntax
> + * @state       : decoding state
> + */
> +struct vdec_vp9_slice_vsi {
> +	/* used in LAT stage */
> +	struct vdec_vp9_slice_bs bs;
> +	/* used in Core stage */
> +	struct vdec_vp9_slice_fb fb;
> +	struct vdec_vp9_slice_fb ref[3];
> +
> +	struct vdec_vp9_slice_mem mv[2];
> +	struct vdec_vp9_slice_mem seg[2];
> +	struct vdec_vp9_slice_mem tile;
> +	struct vdec_vp9_slice_mem prob;
> +	struct vdec_vp9_slice_mem counts;
> +
> +	/* LAT stage's output, Core stage's input */
> +	struct vdec_vp9_slice_mem ube;
> +	struct vdec_vp9_slice_mem trans;
> +	struct vdec_vp9_slice_mem err_map;
> +	struct vdec_vp9_slice_mem row_info;
> +
> +	/* decoding parameters */
> +	struct vdec_vp9_slice_frame frame;
> +
> +	struct vdec_vp9_slice_state state;
> +};
> +
> +/**
> + * struct vdec_vp9_slice_pfc - per-frame context that contains a local vsi.
> + *                             pass it from lat to core
> + * @vsi         : local vsi. copy to/from remote vsi before/after decoding
> + * @ref_idx     : reference buffer index
> + * @seq         : picture sequence
> + * @state       : decoding state
> + */
> +struct vdec_vp9_slice_pfc {
> +	struct vdec_vp9_slice_vsi vsi;
> +
> +	u64 ref_idx[3];
> +
> +	int seq;
> +
> +	/* LAT/Core CRC */
> +	struct vdec_vp9_slice_state state[2];
> +};
> +
> +/*
> + * enum vdec_vp9_slice_resolution_level
> + */
> +enum vdec_vp9_slice_resolution_level {
> +	VP9_RES_NONE,
> +	VP9_RES_FHD,
> +	VP9_RES_4K,
> +	VP9_RES_8K,
> +};
> +
> +/*
> + * struct vdec_vp9_slice_ref - picture's width & height should kept
> + *                             for later decoding as reference picture
> + */
> +struct vdec_vp9_slice_ref {
> +	unsigned int width;
> +	unsigned int height;
> +};
> +
> +/**
> + * struct vdec_vp9_slice_instance - represent one vp9 instance
> + * @ctx         : pointer to codec's context
> + * @vpu         : VPU instance
> + * @seq         : global picture sequence
> + * @level       : level of current resolution
> + * @width       : width of last picture
> + * @height      : height of last picture
> + * @frame_type  : frame_type of last picture
> + * @irq         : irq to Main CPU or MicroP
> + * @show_frame  : show_frame of last picture
> + * @dpb         : picture information (width/height) for reference
> + * @mv          : mv working buffer
> + * @seg         : segmentation working buffer
> + * @tile        : tile buffer
> + * @prob        : prob table buffer, used to set/update prob table
> + * @counts      : counts table buffer, used to update prob table
> + * @frame_ctx   : 4 frame context according to VP9 Spec
> + * @dirty       : state of each frame context
> + * @init_vsi    : vsi used for initialized VP9 instance
> + * @vsi         : vsi used for decoding/flush ...
> + * @core_vsi    : vsi used for Core stage
> + */
> +struct vdec_vp9_slice_instance {
> +	struct mtk_vcodec_ctx *ctx;
> +	struct vdec_vpu_inst vpu;
> +
> +	int seq;
> +
> +	enum vdec_vp9_slice_resolution_level level;
> +
> +	/* for resolution change and get_pic_info */
> +	unsigned int width;
> +	unsigned int height;
> +
> +	/* for last_frame_type */
> +	unsigned int frame_type;
> +	unsigned int irq;
> +
> +	unsigned int show_frame;
> +
> +	/* maintain vp9 reference frame state */
> +	struct vdec_vp9_slice_ref dpb[VB2_MAX_FRAME];
> +
> +	/*
> +	 * normal working buffers
> +	 * mv[0]/seg[0]/tile/prob/counts is used for LAT
> +	 * mv[1]/seg[1] is used for CORE
> +	 */
> +	struct mtk_vcodec_mem mv[2];
> +	struct mtk_vcodec_mem seg[2];
> +	struct mtk_vcodec_mem tile;
> +	struct mtk_vcodec_mem prob;
> +	struct mtk_vcodec_mem counts;
> +
> +	/* 4 prob tables */
> +	struct vdec_vp9_slice_frame_ctx frame_ctx[4];
> +	unsigned char dirty[4];
> +
> +	/* MicroP vsi */
> +	union {
> +		struct vdec_vp9_slice_init_vsi *init_vsi;
> +		struct vdec_vp9_slice_vsi *vsi;
> +	};
> +	struct vdec_vp9_slice_vsi *core_vsi;
> +};
> +
> +/*
> + * (2, (0, (1, 3)))
> + * max level = 2
> + */
> +static const signed char vdec_vp9_slice_inter_mode_tree[6] = {
> +	-2, 2, 0, 4, -1, -3
> +};
> +
> +/* max level = 6 */
> +static const signed char vdec_vp9_slice_intra_mode_tree[18] = {
> +	0, 2, -9, 4, -1, 6, 8, 12, -2, 10, -4, -5, -3, 14, -8, 16, -6, -7
> +};
> +
> +/* max level = 2 */
> +static const signed char vdec_vp9_slice_partition_tree[6] = {
> +	0, 2, -1, 4, -2, -3
> +};
> +
> +/* max level = 1 */
> +static const signed char vdec_vp9_slice_switchable_interp_tree[4] = {
> +	0, 2, -1, -2
> +};
> +
> +/* max level = 2 */
> +static const signed char vdec_vp9_slice_mv_joint_tree[6] = {
> +	0, 2, -1, 4, -2, -3
> +};
> +
> +/* max level = 6 */
> +static const signed char vdec_vp9_slice_mv_class_tree[20] = {
> +	0, 2, -1, 4, 6, 8, -2, -3, 10, 12,
> +	-4, -5, -6, 14, 16, 18, -7, -8, -9, -10
> +};
> +
> +/* max level = 0 */
> +static const signed char vdec_vp9_slice_mv_class0_tree[2] = {
> +	0, -1
> +};
> +
> +/* max level = 2 */
> +static const signed char vdec_vp9_slice_mv_fp_tree[6] = {
> +	0, 2, -1, 4, -2, -3
> +};
> +
> +/*
> + * all VP9 instances could share this default frame context.
> + */
> +static struct vdec_vp9_slice_frame_ctx *vdec_vp9_slice_default_frame_ctx;
> +static DEFINE_MUTEX(vdec_vp9_slice_frame_ctx_lock);
> +
> +static int vdec_vp9_slice_core_decode(struct vdec_lat_buf *lat_buf);
> +
> +static int vdec_vp9_slice_init_default_frame_ctx(struct vdec_vp9_slice_instance *instance)
> +{
> +	struct vdec_vp9_slice_frame_ctx *remote_frame_ctx;
> +	struct vdec_vp9_slice_frame_ctx *frame_ctx;
> +	struct mtk_vcodec_ctx *ctx;
> +	struct vdec_vp9_slice_init_vsi *vsi;
> +	int ret = 0;
> +
> +	ctx = instance->ctx;
> +	vsi = instance->vpu.vsi;
> +	if (!ctx || !vsi)
> +		return -EINVAL;
> +
> +	remote_frame_ctx = mtk_vcodec_fw_map_dm_addr(ctx->dev->fw_handler,
> +						     (u32)vsi->default_frame_ctx);
> +	if (!remote_frame_ctx) {
> +		mtk_vcodec_err(instance, "failed to map default frame ctx\n");
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&vdec_vp9_slice_frame_ctx_lock);
> +	if (vdec_vp9_slice_default_frame_ctx)
> +		goto out;
> +
> +	frame_ctx = kmalloc(sizeof(*frame_ctx), GFP_KERNEL);
> +	if (!frame_ctx) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	memcpy_fromio(frame_ctx, remote_frame_ctx, sizeof(*frame_ctx));
> +	vdec_vp9_slice_default_frame_ctx = frame_ctx;
> +
> +out:
> +	mutex_unlock(&vdec_vp9_slice_frame_ctx_lock);
> +
> +	return ret;
> +}
> +
> +static int vdec_vp9_slice_alloc_working_buffer(struct vdec_vp9_slice_instance *instance,
> +					       struct vdec_vp9_slice_vsi *vsi)
> +{
> +	struct mtk_vcodec_ctx *ctx = instance->ctx;
> +	enum vdec_vp9_slice_resolution_level level;
> +	/* super blocks */
> +	unsigned int max_sb_w;
> +	unsigned int max_sb_h;
> +	unsigned int max_w;
> +	unsigned int max_h;
> +	unsigned int w;
> +	unsigned int h;
> +	size_t size;
> +	int ret;
> +	int i;
> +
> +	w = vsi->frame.uh.frame_width;
> +	h = vsi->frame.uh.frame_height;
> +
> +	if (w > VCODEC_DEC_4K_CODED_WIDTH ||
> +	    h > VCODEC_DEC_4K_CODED_HEIGHT) {
> +		/* 8K? */
> +		return -EINVAL;
> +	} else if (w > MTK_VDEC_MAX_W || h > MTK_VDEC_MAX_H) {
> +		/* 4K */
> +		level = VP9_RES_4K;
> +		max_w = VCODEC_DEC_4K_CODED_WIDTH;
> +		max_h = VCODEC_DEC_4K_CODED_HEIGHT;
> +	} else {
> +		/* FHD */
> +		level = VP9_RES_FHD;
> +		max_w = MTK_VDEC_MAX_W;
> +		max_h = MTK_VDEC_MAX_H;
> +	}
> +
> +	if (level == instance->level)
> +		return 0;
> +
> +	mtk_vcodec_debug(instance, "resolution level changed, from %u to %u, %ux%u",
> +			 instance->level, level, w, h);
> +
> +	max_sb_w = DIV_ROUND_UP(max_w, 64);
> +	max_sb_h = DIV_ROUND_UP(max_h, 64);
> +	ret = -ENOMEM;
> +
> +	/*
> +	 * Lat-flush must wait core idle, otherwise core will
> +	 * use released buffers
> +	 */
> +
> +	size = (max_sb_w * max_sb_h + 2) * 576;
> +	for (i = 0; i < 2; i++) {
> +		if (instance->mv[i].va)
> +			mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> +		instance->mv[i].size = size;
> +		if (mtk_vcodec_mem_alloc(ctx, &instance->mv[i]))
> +			goto err;
> +	}
> +
> +	size = (max_sb_w * max_sb_h * 32) + 256;
> +	for (i = 0; i < 2; i++) {
> +		if (instance->seg[i].va)
> +			mtk_vcodec_mem_free(ctx, &instance->seg[i]);
> +		instance->seg[i].size = size;
> +		if (mtk_vcodec_mem_alloc(ctx, &instance->seg[i]))
> +			goto err;
> +	}
> +
> +	if (!instance->tile.va) {
> +		instance->tile.size = VP9_TILE_BUF_SIZE;
> +		if (mtk_vcodec_mem_alloc(ctx, &instance->tile))
> +			goto err;
> +	}
> +
> +	if (!instance->prob.va) {
> +		instance->prob.size = VP9_PROB_BUF_SIZE;
> +		if (mtk_vcodec_mem_alloc(ctx, &instance->prob))
> +			goto err;
> +	}
> +
> +	if (!instance->counts.va) {
> +		instance->counts.size = VP9_COUNTS_BUF_SIZE;
> +		if (mtk_vcodec_mem_alloc(ctx, &instance->counts))
> +			goto err;
> +	}
> +
> +	instance->level = level;
> +	return 0;
> +
> +err:
> +	instance->level = VP9_RES_NONE;
> +	return ret;
> +}
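
To make the sizing rule above easier to check, here is a small userspace sketch of the level selection and the two size formulas. The limit values (2048x1088 for FHD, 4096x2304 for 4K) are my assumption about what MTK_VDEC_MAX_W/H and VCODEC_DEC_4K_CODED_WIDTH/HEIGHT expand to, and every demo_* name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limits; the driver takes the real values from its own headers. */
#define DEMO_FHD_W 2048u
#define DEMO_FHD_H 1088u
#define DEMO_4K_W  4096u
#define DEMO_4K_H  2304u

enum demo_level { DEMO_RES_NONE, DEMO_RES_FHD, DEMO_RES_4K };

struct demo_sizes {
	enum demo_level level;
	size_t mv;	/* size of each instance->mv[i] buffer */
	size_t seg;	/* size of each instance->seg[i] buffer */
};

/* Returns 0 and fills *out, or -1 when the frame exceeds the 4K limits. */
static int demo_working_sizes(unsigned int w, unsigned int h,
			      struct demo_sizes *out)
{
	unsigned int max_w, max_h;
	size_t sbs;

	assert(out);
	if (w > DEMO_4K_W || h > DEMO_4K_H)
		return -1;

	if (w > DEMO_FHD_W || h > DEMO_FHD_H) {
		out->level = DEMO_RES_4K;
		max_w = DEMO_4K_W;
		max_h = DEMO_4K_H;
	} else {
		out->level = DEMO_RES_FHD;
		max_w = DEMO_FHD_W;
		max_h = DEMO_FHD_H;
	}

	/* 64x64 superblocks covering the largest frame of this level */
	sbs = (size_t)((max_w + 63) / 64) * ((max_h + 63) / 64);
	out->mv = (sbs + 2) * 576;
	out->seg = sbs * 32 + 256;
	return 0;
}
```

Under those assumed limits a 1920x1080 stream lands in the FHD level with 32 * 17 superblocks, and a 3840x2160 stream in the 4K level with 64 * 36. The buffers are sized for the level maximum, not for the actual frame, which is why a resolution change inside the same level returns early without reallocating.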
> +
> +static void vdec_vp9_slice_free_working_buffer(struct vdec_vp9_slice_instance *instance)
> +{
> +	struct mtk_vcodec_ctx *ctx = instance->ctx;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(instance->mv); i++) {
> +		if (instance->mv[i].va)
> +			mtk_vcodec_mem_free(ctx, &instance->mv[i]);
> +	}
> +	for (i = 0; i < ARRAY_SIZE(instance->seg); i++) {
> +		if (instance->seg[i].va)
> +			mtk_vcodec_mem_free(ctx, &instance->seg[i]);
> +	}
> +	if (instance->tile.va)
> +		mtk_vcodec_mem_free(ctx, &instance->tile);
> +	if (instance->prob.va)
> +		mtk_vcodec_mem_free(ctx, &instance->prob);
> +	if (instance->counts.va)
> +		mtk_vcodec_mem_free(ctx, &instance->counts);
> +
> +	instance->level = VP9_RES_NONE;
> +}
> +
> +static void vdec_vp9_slice_vsi_from_remote(struct vdec_vp9_slice_vsi *vsi,
> +					   struct vdec_vp9_slice_vsi *remote_vsi,
> +					   int skip)
> +{
> +	struct vdec_vp9_slice_frame *rf;
> +	struct vdec_vp9_slice_frame *f;
> +
> +	/*
> +	 * compressed header
> +	 * dequant
> +	 * buffer position
> +	 * decode state
> +	 */
> +	if (!skip) {
> +		rf = &remote_vsi->frame;
> +		f = &vsi->frame;
> +		memcpy_fromio(&f->ch, &rf->ch, sizeof(f->ch));
> +		memcpy_fromio(&f->uh.dequant, &rf->uh.dequant, sizeof(f->uh.dequant));
> +		memcpy_fromio(&vsi->trans, &remote_vsi->trans, sizeof(vsi->trans));
> +	}
> +
> +	memcpy_fromio(&vsi->state, &remote_vsi->state, sizeof(vsi->state));
> +}
> +
> +static void vdec_vp9_slice_vsi_to_remote(struct vdec_vp9_slice_vsi *vsi,
> +					 struct vdec_vp9_slice_vsi *remote_vsi)
> +{
> +	memcpy_toio(remote_vsi, vsi, sizeof(*vsi));
> +}
> +
> +static int vdec_vp9_slice_tile_offset(int idx, int mi_num, int tile_log2)
> +{
> +	int sbs = (mi_num + 7) >> 3;
> +	int offset = ((idx * sbs) >> tile_log2) << 3;
> +
> +	return offset < mi_num ? offset : mi_num;
> +}
> +
> +static int vdec_vp9_slice_setup_lat_from_src_buf(struct vdec_vp9_slice_instance *instance,
> +						 struct vdec_lat_buf *lat_buf)
> +{
> +	struct vb2_v4l2_buffer *src;
> +	struct vb2_v4l2_buffer *dst;
> +
> +	src = v4l2_m2m_next_src_buf(instance->ctx->m2m_ctx);
> +	if (!src)
> +		return -EINVAL;
> +
> +	dst = &lat_buf->ts_info;
> +	v4l2_m2m_buf_copy_metadata(src, dst, true);
> +	return 0;
> +}
> +
> +static void vdec_vp9_slice_setup_hdr(struct vdec_vp9_slice_instance *instance,
> +				     struct vdec_vp9_slice_uncompressed_header *uh,
> +				     struct v4l2_ctrl_vp9_frame *hdr)
> +{
> +	int i;
> +
> +	uh->profile = hdr->profile;
> +	uh->last_frame_type = instance->frame_type;
> +	uh->frame_type = !HDR_FLAG(KEY_FRAME);
> +	uh->last_show_frame = instance->show_frame;
> +	uh->show_frame = HDR_FLAG(SHOW_FRAME);
> +	uh->error_resilient_mode = HDR_FLAG(ERROR_RESILIENT);
> +	uh->bit_depth = hdr->bit_depth;
> +	uh->last_frame_width = instance->width;
> +	uh->last_frame_height = instance->height;
> +	uh->frame_width = hdr->frame_width_minus_1 + 1;
> +	uh->frame_height = hdr->frame_height_minus_1 + 1;
> +	uh->intra_only = HDR_FLAG(INTRA_ONLY);
> +	/* map v4l2 enum to values defined in VP9 spec for firmware */
> +	switch (hdr->reset_frame_context) {
> +	case V4L2_VP9_RESET_FRAME_CTX_NONE:
> +		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_NONE0;
> +		break;
> +	case V4L2_VP9_RESET_FRAME_CTX_SPEC:
> +		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_SPEC;
> +		break;
> +	case V4L2_VP9_RESET_FRAME_CTX_ALL:
> +		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_ALL;
> +		break;
> +	default:
> +		uh->reset_frame_context = VP9_RESET_FRAME_CONTEXT_NONE0;
> +		break;
> +	}
> +	/*
> +	 * ref_frame_sign_bias specifies the intended direction
> +	 * of the motion vector in time for each reference frame.
> +	 * - INTRA_FRAME = 0,
> +	 * - LAST_FRAME = 1,
> +	 * - GOLDEN_FRAME = 2,
> +	 * - ALTREF_FRAME = 3,
> +	 * ref_frame_sign_bias[INTRA_FRAME] is always 0
> +	 * and VDA only passes the other 3 directions
> +	 */
> +	uh->ref_frame_sign_bias[0] = 0;
> +	for (i = 0; i < 3; i++)
> +		uh->ref_frame_sign_bias[i + 1] =
> +			!!(hdr->ref_frame_sign_bias & (1 << i));
> +	uh->allow_high_precision_mv = HDR_FLAG(ALLOW_HIGH_PREC_MV);
> +	uh->interpolation_filter = hdr->interpolation_filter;
> +	uh->refresh_frame_context = HDR_FLAG(REFRESH_FRAME_CTX);
> +	uh->frame_parallel_decoding_mode = HDR_FLAG(PARALLEL_DEC_MODE);
> +	uh->frame_context_idx = hdr->frame_context_idx;
> +
> +	/* tile info */
> +	uh->tile_cols_log2 = hdr->tile_cols_log2;
> +	uh->tile_rows_log2 = hdr->tile_rows_log2;
> +
> +	uh->uncompressed_header_size = hdr->uncompressed_header_size;
> +	uh->header_size_in_bytes = hdr->compressed_header_size;
> +}
> +
> +static void vdec_vp9_slice_setup_frame_ctx(struct vdec_vp9_slice_instance *instance,
> +					   struct vdec_vp9_slice_uncompressed_header *uh,
> +					   struct v4l2_ctrl_vp9_frame *hdr)
> +{
> +	int error_resilient_mode;
> +	int reset_frame_context;
> +	int key_frame;
> +	int intra_only;
> +	int i;
> +
> +	key_frame = HDR_FLAG(KEY_FRAME);
> +	intra_only = HDR_FLAG(INTRA_ONLY);
> +	error_resilient_mode = HDR_FLAG(ERROR_RESILIENT);
> +	reset_frame_context = uh->reset_frame_context;
> +
> +	/*
> +	 * according to "6.2 Uncompressed header syntax" in
> +	 * "VP9 Bitstream & Decoding Process Specification",
> +	 * reset @frame_context_idx when (FrameIsIntra || error_resilient_mode)
> +	 */
> +	if (key_frame || intra_only || error_resilient_mode) {
> +		/*
> +		 * @reset_frame_context specifies
> +		 * whether the frame context should be
> +		 * reset to default values:
> +		 * 0 or 1 means do not reset any frame context
> +		 * 2 resets just the context specified in the frame header
> +		 * 3 resets all contexts
> +		 */
> +		if (key_frame || error_resilient_mode ||
> +		    reset_frame_context == 3) {
> +			/* use default table */
> +			for (i = 0; i < 4; i++)
> +				instance->dirty[i] = 0;
> +		} else if (reset_frame_context == 2) {
> +			instance->dirty[uh->frame_context_idx] = 0;
> +		}
> +		uh->frame_context_idx = 0;
> +	}
> +}
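
A compact way to read the reset rules above, written as a standalone sketch. The struct and function names are hypothetical; the branch structure mirrors the quoted function, with dirty[i] == false meaning "use the default table for context i".

```c
#include <assert.h>
#include <stdbool.h>

struct demo_ctx_state {
	bool dirty[4];	/* true: a saved frame context exists for this index */
};

/*
 * Applies the reset rules for one frame and returns the frame_context_idx
 * the decoder should use afterwards.
 */
static unsigned int demo_apply_ctx_reset(struct demo_ctx_state *s,
					 bool key_frame, bool intra_only,
					 bool error_resilient,
					 unsigned int reset_frame_context,
					 unsigned int idx)
{
	int i;

	assert(s && idx < 4);

	/* plain inter frames keep their contexts and their index */
	if (!(key_frame || intra_only || error_resilient))
		return idx;

	if (key_frame || error_resilient || reset_frame_context == 3) {
		for (i = 0; i < 4; i++)
			s->dirty[i] = false;
	} else if (reset_frame_context == 2) {
		s->dirty[idx] = false;
	}

	/* intra or error-resilient frames always continue with context 0 */
	return 0;
}
```

Note that an intra-only frame with reset_frame_context 0 or 1 clears nothing but still forces the index to 0.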
> +
> +static void vdec_vp9_slice_setup_loop_filter(struct vdec_vp9_slice_uncompressed_header *uh,
> +					     struct v4l2_vp9_loop_filter *lf)
> +{
> +	int i;
> +
> +	uh->loop_filter_level = lf->level;
> +	uh->loop_filter_sharpness = lf->sharpness;
> +	uh->loop_filter_delta_enabled = LF_FLAG(DELTA_ENABLED);
> +	for (i = 0; i < 4; i++)
> +		uh->loop_filter_ref_deltas[i] = lf->ref_deltas[i];
> +	for (i = 0; i < 2; i++)
> +		uh->loop_filter_mode_deltas[i] = lf->mode_deltas[i];
> +}
> +
> +static void vdec_vp9_slice_setup_quantization(struct vdec_vp9_slice_uncompressed_header *uh,
> +					      struct v4l2_vp9_quantization *quant)
> +{
> +	uh->base_q_idx = quant->base_q_idx;
> +	uh->delta_q_y_dc = quant->delta_q_y_dc;
> +	uh->delta_q_uv_dc = quant->delta_q_uv_dc;
> +	uh->delta_q_uv_ac = quant->delta_q_uv_ac;
> +}
> +
> +static void vdec_vp9_slice_setup_segmentation(struct vdec_vp9_slice_uncompressed_header *uh,
> +					      struct v4l2_vp9_segmentation *seg)
> +{
> +	int i;
> +	int j;
> +
> +	uh->segmentation_enabled = SEG_FLAG(ENABLED);
> +	uh->segmentation_update_map = SEG_FLAG(UPDATE_MAP);
> +	for (i = 0; i < 7; i++)
> +		uh->segmentation_tree_probs[i] = seg->tree_probs[i];
> +	uh->segmentation_temporal_udpate = SEG_FLAG(TEMPORAL_UPDATE);
> +	for (i = 0; i < 3; i++)
> +		uh->segmentation_pred_prob[i] = seg->pred_probs[i];
> +	uh->segmentation_update_data = SEG_FLAG(UPDATE_DATA);
> +	uh->segmentation_abs_or_delta_update = SEG_FLAG(ABS_OR_DELTA_UPDATE);
> +	for (i = 0; i < 8; i++) {
> +		uh->feature_enabled[i] = seg->feature_enabled[i];
> +		for (j = 0; j < 4; j++)
> +			uh->feature_value[i][j] = seg->feature_data[i][j];
> +	}
> +}
> +
> +static int vdec_vp9_slice_setup_tile(struct vdec_vp9_slice_vsi *vsi,
> +				     struct v4l2_ctrl_vp9_frame *hdr)
> +{
> +	unsigned int rows_log2;
> +	unsigned int cols_log2;
> +	unsigned int rows;
> +	unsigned int cols;
> +	unsigned int mi_rows;
> +	unsigned int mi_cols;
> +	struct vdec_vp9_slice_tiles *tiles;
> +	int offset;
> +	int start;
> +	int end;
> +	int i;
> +
> +	rows_log2 = hdr->tile_rows_log2;
> +	cols_log2 = hdr->tile_cols_log2;
> +	rows = 1 << rows_log2;
> +	cols = 1 << cols_log2;
> +	tiles = &vsi->frame.tiles;
> +	tiles->actual_rows = 0;
> +
> +	if (rows > 4 || cols > 64)
> +		return -EINVAL;
> +
> +	/* setup mi rows/cols information */
> +	mi_rows = (hdr->frame_height_minus_1 + 1 + 7) >> 3;
> +	mi_cols = (hdr->frame_width_minus_1 + 1 + 7) >> 3;
> +
> +	for (i = 0; i < rows; i++) {
> +		start = vdec_vp9_slice_tile_offset(i, mi_rows, rows_log2);
> +		end = vdec_vp9_slice_tile_offset(i + 1, mi_rows, rows_log2);
> +		offset = end - start;
> +		tiles->mi_rows[i] = (offset + 7) >> 3;
> +		if (tiles->mi_rows[i])
> +			tiles->actual_rows++;
> +	}
> +
> +	for (i = 0; i < cols; i++) {
> +		start = vdec_vp9_slice_tile_offset(i, mi_cols, cols_log2);
> +		end = vdec_vp9_slice_tile_offset(i + 1, mi_cols, cols_log2);
> +		offset = end - start;
> +		tiles->mi_cols[i] = (offset + 7) >> 3;
> +	}
> +
> +	return 0;
> +}
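
The tile split is easier to follow with numbers. Below is a userspace sketch that reuses the offset helper from the patch and computes the per-tile value the same way the two loops above do. As far as I can tell the stored value is a count of 64-pixel superblocks even though the fields are called mi_rows/mi_cols. The demo_* names are mine.

```c
#include <assert.h>

/* Same arithmetic as vdec_vp9_slice_tile_offset() in the patch. */
static int demo_tile_offset(int idx, int mi_num, int tile_log2)
{
	int sbs = (mi_num + 7) >> 3;
	int offset = ((idx * sbs) >> tile_log2) << 3;

	return offset < mi_num ? offset : mi_num;
}

/*
 * Fills sb_count[0 .. (1 << tile_log2) - 1] for one frame dimension given
 * in pixels and returns how many tiles are non-empty.
 */
static int demo_tile_sb_counts(unsigned int frame_dim, int tile_log2,
			       unsigned int *sb_count)
{
	int mi_num = (int)((frame_dim + 7) >> 3);
	int nonempty = 0;
	int i;

	assert(sb_count && tile_log2 >= 0 && tile_log2 <= 6);

	for (i = 0; i < (1 << tile_log2); i++) {
		int span = demo_tile_offset(i + 1, mi_num, tile_log2) -
			   demo_tile_offset(i, mi_num, tile_log2);

		sb_count[i] = (unsigned int)(span + 7) >> 3;
		if (sb_count[i])
			nonempty++;
	}

	return nonempty;
}
```

For a 1920-pixel dimension split four ways this gives 7, 8, 7 and 8 superblocks. For a 64-pixel dimension split four ways only the last tile is non-empty, which is the case actual_rows counts.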
> +
> +static void vdec_vp9_slice_setup_state(struct vdec_vp9_slice_vsi *vsi)
> +{
> +	memset(&vsi->state, 0, sizeof(vsi->state));
> +}
> +
> +static void vdec_vp9_slice_setup_ref_idx(struct vdec_vp9_slice_pfc *pfc,
> +					 struct v4l2_ctrl_vp9_frame *hdr)
> +{
> +	pfc->ref_idx[0] = hdr->last_frame_ts;
> +	pfc->ref_idx[1] = hdr->golden_frame_ts;
> +	pfc->ref_idx[2] = hdr->alt_frame_ts;
> +}
> +
> +static int vdec_vp9_slice_setup_pfc(struct vdec_vp9_slice_instance *instance,
> +				    struct vdec_vp9_slice_pfc *pfc)
> +{
> +	struct v4l2_ctrl_vp9_frame *hdr;
> +	struct vdec_vp9_slice_uncompressed_header *uh;
> +	struct v4l2_ctrl *hdr_ctrl;
> +	struct vdec_vp9_slice_vsi *vsi;
> +	int ret;
> +
> +	/* frame header */
> +	hdr_ctrl = v4l2_ctrl_find(&instance->ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_FRAME);
> +	if (!hdr_ctrl || !hdr_ctrl->p_cur.p)
> +		return -EINVAL;
> +
> +	hdr = hdr_ctrl->p_cur.p;
> +	vsi = &pfc->vsi;
> +	uh = &vsi->frame.uh;
> +
> +	/* setup vsi information */
> +	vdec_vp9_slice_setup_hdr(instance, uh, hdr);
> +	vdec_vp9_slice_setup_frame_ctx(instance, uh, hdr);
> +	vdec_vp9_slice_setup_loop_filter(uh, &hdr->lf);
> +	vdec_vp9_slice_setup_quantization(uh, &hdr->quant);
> +	vdec_vp9_slice_setup_segmentation(uh, &hdr->seg);
> +	ret = vdec_vp9_slice_setup_tile(vsi, hdr);
> +	if (ret)
> +		return ret;
> +	vdec_vp9_slice_setup_state(vsi);
> +
> +	/* core stage needs buffer index to get ref y/c ... */
> +	vdec_vp9_slice_setup_ref_idx(pfc, hdr);
> +
> +	pfc->seq = instance->seq;
> +	instance->seq++;
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_setup_lat_buffer(struct vdec_vp9_slice_instance *instance,
> +					   struct vdec_vp9_slice_vsi *vsi,
> +					   struct mtk_vcodec_mem *bs,
> +					   struct vdec_lat_buf *lat_buf)
> +{
> +	int i;
> +
> +	vsi->bs.buf.dma_addr = bs->dma_addr;
> +	vsi->bs.buf.size = bs->size;
> +	vsi->bs.frame.dma_addr = bs->dma_addr;
> +	vsi->bs.frame.size = bs->size;
> +
> +	for (i = 0; i < 2; i++) {
> +		vsi->mv[i].dma_addr = instance->mv[i].dma_addr;
> +		vsi->mv[i].size = instance->mv[i].size;
> +	}
> +	for (i = 0; i < 2; i++) {
> +		vsi->seg[i].dma_addr = instance->seg[i].dma_addr;
> +		vsi->seg[i].size = instance->seg[i].size;
> +	}
> +	vsi->tile.dma_addr = instance->tile.dma_addr;
> +	vsi->tile.size = instance->tile.size;
> +	vsi->prob.dma_addr = instance->prob.dma_addr;
> +	vsi->prob.size = instance->prob.size;
> +	vsi->counts.dma_addr = instance->counts.dma_addr;
> +	vsi->counts.size = instance->counts.size;
> +
> +	vsi->ube.dma_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
> +	vsi->ube.size = lat_buf->ctx->msg_queue.wdma_addr.size;
> +	vsi->trans.dma_addr = lat_buf->ctx->msg_queue.wdma_wptr_addr;
> +	/* used to store trans end */
> +	vsi->trans.dma_addr_end = lat_buf->ctx->msg_queue.wdma_rptr_addr;
> +	vsi->err_map.dma_addr = lat_buf->wdma_err_addr.dma_addr;
> +	vsi->err_map.size = lat_buf->wdma_err_addr.size;
> +
> +	vsi->row_info.buf = 0;
> +	vsi->row_info.size = 0;
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_setup_prob_buffer(struct vdec_vp9_slice_instance *instance,
> +					    struct vdec_vp9_slice_vsi *vsi)
> +{
> +	struct vdec_vp9_slice_frame_ctx *frame_ctx;
> +	struct vdec_vp9_slice_uncompressed_header *uh;
> +
> +	uh = &vsi->frame.uh;
> +
> +	mtk_vcodec_debug(instance, "ctx dirty %u idx %d\n",
> +			 instance->dirty[uh->frame_context_idx],
> +			 uh->frame_context_idx);
> +
> +	if (instance->dirty[uh->frame_context_idx])
> +		frame_ctx = &instance->frame_ctx[uh->frame_context_idx];
> +	else
> +		frame_ctx = vdec_vp9_slice_default_frame_ctx;
> +	memcpy(instance->prob.va, frame_ctx, sizeof(*frame_ctx));
> +
> +	return 0;
> +}
> +
> +static void vdec_vp9_slice_setup_seg_buffer(struct vdec_vp9_slice_instance *instance,
> +					    struct vdec_vp9_slice_vsi *vsi,
> +					    struct mtk_vcodec_mem *buf)
> +{
> +	struct vdec_vp9_slice_uncompressed_header *uh;
> +
> +	/* reset segment buffer */
> +	uh = &vsi->frame.uh;
> +	if (uh->frame_type == 0 ||
> +	    uh->intra_only ||
> +	    uh->error_resilient_mode ||
> +	    uh->frame_width != instance->width ||
> +	    uh->frame_height != instance->height) {
> +		mtk_vcodec_debug(instance, "reset seg\n");
> +		memset(buf->va, 0, buf->size);
> +	}
> +}
> +
> +/*
> + * parse tiles according to `6.4 Decode tiles syntax`
> + * in "vp9-bitstream-specification"
> + *
> + * A frame contains an uncompressed header, a compressed header and several
> + * tiles. This function parses each tile's position and size and stores them
> + * in the tile buffer for decoding.
> + */
> +static int vdec_vp9_slice_setup_tile_buffer(struct vdec_vp9_slice_instance *instance,
> +					    struct vdec_vp9_slice_vsi *vsi,
> +					    struct mtk_vcodec_mem *bs)
> +{
> +	struct vdec_vp9_slice_uncompressed_header *uh;
> +	unsigned int rows_log2;
> +	unsigned int cols_log2;
> +	unsigned int rows;
> +	unsigned int cols;
> +	unsigned int mi_row;
> +	unsigned int mi_col;
> +	unsigned int offset;
> +	unsigned int pa;
> +	unsigned int size;
> +	struct vdec_vp9_slice_tiles *tiles;
> +	unsigned char *pos;
> +	unsigned char *end;
> +	unsigned char *va;
> +	unsigned int *tb;
> +	int i;
> +	int j;
> +
> +	uh = &vsi->frame.uh;
> +	rows_log2 = uh->tile_rows_log2;
> +	cols_log2 = uh->tile_cols_log2;
> +	rows = 1 << rows_log2;
> +	cols = 1 << cols_log2;
> +
> +	if (rows > 4 || cols > 64) {
> +		mtk_vcodec_err(instance, "tile_rows %u tile_cols %u\n",
> +			       rows, cols);
> +		return -EINVAL;
> +	}
> +
> +	offset = uh->uncompressed_header_size +
> +		uh->header_size_in_bytes;
> +	if (bs->size <= offset) {
> +		mtk_vcodec_err(instance, "bs size %zu tile offset %u\n",
> +			       bs->size, offset);
> +		return -EINVAL;
> +	}
> +
> +	tiles = &vsi->frame.tiles;
> +	/* setup tile buffer */
> +
> +	va = (unsigned char *)bs->va;
> +	pos = va + offset;
> +	end = va + bs->size;
> +	/* truncated */
> +	pa = (unsigned int)bs->dma_addr + offset;
> +	tb = instance->tile.va;
> +	for (i = 0; i < rows; i++) {
> +		for (j = 0; j < cols; j++) {
> +			if (i == rows - 1 &&
> +			    j == cols - 1) {
> +				size = (unsigned int)(end - pos);
> +			} else {
> +				if (end - pos < 4)
> +					return -EINVAL;
> +
> +				size = ((unsigned int)pos[0] << 24) | (pos[1] << 16) |
> +					(pos[2] << 8) | pos[3];
> +				pos += 4;
> +				pa += 4;
> +				offset += 4;
> +				if (end - pos < size)
> +					return -EINVAL;
> +			}
> +			tiles->size[i][j] = size;
> +			if (tiles->mi_rows[i]) {
> +				*tb++ = (size << 3) + ((offset << 3) & 0x7f);
> +				*tb++ = pa & ~0xf;
> +				*tb++ = (pa << 3) & 0x7f;
> +				mi_row = (tiles->mi_rows[i] - 1) & 0x1ff;
> +				mi_col = (tiles->mi_cols[j] - 1) & 0x3f;
> +				*tb++ = (mi_row << 6) + mi_col;
> +			}
> +			pos += size;
> +			pa += size;
> +			offset += size;
> +		}
> +	}
> +
> +	return 0;
> +}
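
For reference, the size walk in the loop above boils down to the following. This is only a sketch of the parsing part (it does not build the hardware tile table), with hypothetical names and all tiles flattened into one index. Every tile except the last is preceded by a 4-byte big-endian size, and the last tile takes whatever remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walks n_tiles tiles stored back to back in buf[0 .. len).
 * Returns 0 and fills sizes[], or -1 if the data is truncated.
 */
static int demo_parse_tile_sizes(const uint8_t *buf, size_t len,
				 unsigned int n_tiles, uint32_t *sizes)
{
	const uint8_t *pos = buf;
	const uint8_t *end = buf + len;
	unsigned int i;

	assert(buf && sizes && n_tiles > 0);

	for (i = 0; i < n_tiles; i++) {
		uint32_t size;

		if (i == n_tiles - 1) {
			/* last tile has no size field */
			size = (uint32_t)(end - pos);
		} else {
			if ((size_t)(end - pos) < 4)
				return -1;
			size = ((uint32_t)pos[0] << 24) |
			       ((uint32_t)pos[1] << 16) |
			       ((uint32_t)pos[2] << 8) | pos[3];
			pos += 4;
			if ((size_t)(end - pos) < size)
				return -1;
		}
		sizes[i] = size;
		pos += size;
	}

	return 0;
}
```

The casts to uint32_t before shifting keep the arithmetic unsigned when the top byte is 0x80 or larger.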
> +
> +static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
> +				    struct mtk_vcodec_mem *bs,
> +				    struct vdec_lat_buf *lat_buf,
> +				    struct vdec_vp9_slice_pfc *pfc)
> +{
> +	struct vdec_vp9_slice_vsi *vsi = &pfc->vsi;
> +	int ret;
> +
> +	ret = vdec_vp9_slice_setup_lat_from_src_buf(instance, lat_buf);
> +	if (ret)
> +		goto err;
> +
> +	ret = vdec_vp9_slice_setup_pfc(instance, pfc);
> +	if (ret)
> +		goto err;
> +
> +	ret = vdec_vp9_slice_alloc_working_buffer(instance, vsi);
> +	if (ret)
> +		goto err;
> +
> +	ret = vdec_vp9_slice_setup_lat_buffer(instance, vsi, bs, lat_buf);
> +	if (ret)
> +		goto err;
> +
> +	vdec_vp9_slice_setup_seg_buffer(instance, vsi, &instance->seg[0]);
> +
> +	/* setup prob/tile buffers for LAT */
> +
> +	ret = vdec_vp9_slice_setup_prob_buffer(instance, vsi);
> +	if (ret)
> +		goto err;
> +
> +	ret = vdec_vp9_slice_setup_tile_buffer(instance, vsi, bs);
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	return ret;
> +}
> +
> +/* implement merge prob process defined in 8.4.1 */
> +static unsigned char vdec_vp9_slice_merge_prob(unsigned char pre, unsigned int ct0,
> +					       unsigned int ct1, unsigned int cs,
> +					       unsigned int uf)
> +{
> +	unsigned int den;
> +	unsigned int prob;
> +	unsigned int count;
> +	unsigned int factor;
> +
> +	/*
> +	 * The variable den representing the total times
> +	 * this boolean has been decoded is set equal to ct0 + ct1.
> +	 */
> +	den = ct0 + ct1;
> +	if (!den)
> +		return pre;  /* => count = 0 => factor = 0 */
> +	/*
> +	 * The variable prob estimating the probability that
> +	 * the boolean is decoded as a 0 is set equal to
> +	 * (den == 0) ? 128 : Clip3(1, 255, (ct0 * 256 + (den >> 1)) / den).
> +	 */
> +	prob = ((ct0 << 8) + (den >> 1)) / den;
> +	prob = prob < 1 ? 1 : (prob > 255 ? 255 : prob);
> +	/* The variable count is set equal to Min(ct0 + ct1, countSat) */
> +	count = den < cs ? den : cs;
> +	/*
> +	 * The variable factor is set equal to
> +	 * maxUpdateFactor * count / countSat.
> +	 */
> +	factor = uf * count / cs;
> +	/*
> +	 * The return variable outProb is set equal to
> +	 * Round2(preProb * (256 - factor) + prob * factor, 8).
> +	 */
> +	return pre + (((prob - pre) * factor + 128) >> 8);
> +}
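
The return statement above uses a delta form rather than the Round2() expression quoted in the comment, and it relies on unsigned wrap-around when prob is smaller than pre. I believe the two agree once the result is truncated to unsigned char. The sketch below, with names of my own, puts a literal transcription of the quoted formula next to the delta form so they can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Literal form of the formula quoted in the comments of the patch. */
static uint8_t demo_merge_prob_spec(uint8_t pre, unsigned int ct0,
				    unsigned int ct1, unsigned int cs,
				    unsigned int uf)
{
	unsigned int den = ct0 + ct1;
	unsigned int prob, count, factor;

	assert(cs > 0 && uf <= 256);

	prob = den ? ((ct0 << 8) + (den >> 1)) / den : 128;
	prob = prob < 1 ? 1 : (prob > 255 ? 255 : prob);
	count = den < cs ? den : cs;
	factor = uf * count / cs;

	/* Round2(preProb * (256 - factor) + prob * factor, 8) */
	return (uint8_t)((pre * (256 - factor) + prob * factor + 128) >> 8);
}

/* Delta form with the same arithmetic as the driver function. */
static uint8_t demo_merge_prob_delta(uint8_t pre, unsigned int ct0,
				     unsigned int ct1, unsigned int cs,
				     unsigned int uf)
{
	unsigned int den = ct0 + ct1;
	unsigned int prob, count, factor;

	assert(cs > 0 && uf <= 256);

	if (!den)
		return pre;
	prob = ((ct0 << 8) + (den >> 1)) / den;
	prob = prob < 1 ? 1 : (prob > 255 ? 255 : prob);
	count = den < cs ? den : cs;
	factor = uf * count / cs;

	/* (prob - pre) may wrap; only the low 8 bits of the sum survive */
	return (uint8_t)(pre + (((prob - pre) * factor + 128) >> 8));
}
```

As a worked case, pre = 200 with ct0 = 0 and ct1 = 20 (cs = 20, uf = 128) clips prob to 1, gives factor 128, and both forms return 101.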
> +
> +static inline unsigned char vdec_vp9_slice_adapt_prob(unsigned char pre, unsigned int ct0,
> +						      unsigned int ct1)
> +{
> +	return vdec_vp9_slice_merge_prob(pre, ct0, ct1, 20, 128);
> +}
> +
> +/* implement merge probs process defined in 8.4.2 */
> +static unsigned int vdec_vp9_slice_merge_probs(const signed char *tree, int location,
> +					       unsigned char *pre_probs, unsigned int *counts,
> +					       unsigned char *probs, unsigned int cs,
> +					       unsigned int uf)
> +{
> +	int left = tree[location];
> +	int right = tree[location + 1];
> +	unsigned int left_count;
> +	unsigned int right_count;
> +
> +	if (left <= 0)
> +		left_count = counts[-left];
> +	else
> +		left_count = vdec_vp9_slice_merge_probs(tree, left, pre_probs, counts,
> +							probs, cs, uf);
> +
> +	if (right <= 0)
> +		right_count = counts[-right];
> +	else
> +		right_count = vdec_vp9_slice_merge_probs(tree, right, pre_probs, counts,
> +							 probs, cs, uf);
> +
> +	/* merge left and right */
> +	probs[location >> 1] =
> +		vdec_vp9_slice_merge_prob(pre_probs[location >> 1],
> +					  left_count, right_count, cs, uf);
> +	return left_count + right_count;
> +}
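
To illustrate the recursion above: entries <= 0 in the tree are leaves whose negated value indexes counts[], and positive entries are the location of the next node pair. The sketch below runs the same walk over a made-up four-symbol tree. The tree contents and all demo_* names are assumptions for illustration and are not the driver's tables.

```c
#include <assert.h>
#include <stdint.h>

/* Single-probability merge, written in the Round2() form. */
static uint8_t demo_merge_prob(uint8_t pre, unsigned int ct0,
			       unsigned int ct1, unsigned int cs,
			       unsigned int uf)
{
	unsigned int den = ct0 + ct1;
	unsigned int prob, count, factor;

	assert(cs > 0 && uf <= 256);

	prob = den ? ((ct0 << 8) + (den >> 1)) / den : 128;
	prob = prob < 1 ? 1 : (prob > 255 ? 255 : prob);
	count = den < cs ? den : cs;
	factor = uf * count / cs;
	return (uint8_t)((pre * (256 - factor) + prob * factor + 128) >> 8);
}

/* Hypothetical tree: node 0 -> {leaf 2, node 2}, node 2 -> {leaf 0, node 4},
 * node 4 -> {leaf 1, leaf 3}.
 */
static const int8_t demo_tree[6] = { -2, 2, 0, 4, -1, -3 };

/* Returns the total count below 'location' and updates probs[] on the way. */
static unsigned int demo_merge_probs(const int8_t *tree, int location,
				     const uint8_t *pre_probs,
				     const unsigned int *counts,
				     uint8_t *probs, unsigned int cs,
				     unsigned int uf)
{
	int left = tree[location];
	int right = tree[location + 1];
	unsigned int lc, rc;

	lc = left <= 0 ? counts[-left] :
		demo_merge_probs(tree, left, pre_probs, counts, probs, cs, uf);
	rc = right <= 0 ? counts[-right] :
		demo_merge_probs(tree, right, pre_probs, counts, probs, cs, uf);

	/* one probability per node pair, hence location >> 1 */
	probs[location >> 1] = demo_merge_prob(pre_probs[location >> 1],
					       lc, rc, cs, uf);
	return lc + rc;
}
```

With counts {5, 1, 2, 3} the root sees 2 on the left and 5 + 1 + 3 on the right, and the call returns 11.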
> +
> +static inline void vdec_vp9_slice_adapt_probs(const signed char *tree,
> +					      unsigned char *pre_probs,
> +					      unsigned int *counts,
> +					      unsigned char *probs)
> +{
> +	vdec_vp9_slice_merge_probs(tree, 0, pre_probs, counts, probs, 20, 128);
> +}
> +
> +/* 8.4 Probability adaptation process */
> +static void vdec_vp9_slice_adapt_table(struct vdec_vp9_slice_vsi *vsi,
> +				       struct vdec_vp9_slice_frame_ctx *ctx,
> +				       struct vdec_vp9_slice_frame_ctx *pre_ctx,
> +				       struct vdec_vp9_slice_frame_counts *counts)
> +{
> +	unsigned char *pp;
> +	unsigned char *p;
> +	unsigned int *c;
> +	unsigned int *e;
> +	unsigned int uf;
> +	int t, i, j, k, l;
> +
> +	uf = 128;
> +	if (!vsi->frame.uh.frame_type || vsi->frame.uh.intra_only ||
> +	    vsi->frame.uh.last_frame_type)
> +		uf = 112;
> +
> +	p = (unsigned char *)&ctx->coef_probs;
> +	pp = (unsigned char *)&pre_ctx->coef_probs;
> +	c = (unsigned int *)&counts->coef_probs;
> +	e = (unsigned int *)&counts->eob_branch;
> +
> +	/* 8.4.3 Coefficient probability adaptation process */
> +	for (t = 0; t < 16; t++) {
> +		for (k = 0; k < 6; k++) {
> +			for (l = 0; l < (k == 0 ? 3 : 6); l++) {
> +				p[0] = vdec_vp9_slice_merge_prob(pp[0], c[3],
> +								 e[0] - c[3], 24, uf);
> +				p[1] = vdec_vp9_slice_merge_prob(pp[1], c[0],
> +								 c[1] + c[2], 24, uf);
> +				p[2] = vdec_vp9_slice_merge_prob(pp[2], c[1],
> +								 c[2], 24, uf);
> +				p += 3;
> +				pp += 3;
> +				c += 4;
> +				e++;
> +			}
> +			if (k == 0) {
> +				/* 3 * 3 unused values and 2 bytes padding */
> +				p += 11;
> +				pp += 11;
> +				e++;
> +			} else {
> +				/* 2 extra bytes of padding give 4-byte alignment (3 * 6 + 2) */
> +				p += 2;
> +				pp += 2;
> +				/* 5 * 6 = 30, plus 2 extra ints */
> +				if (k == 5)
> +					e += 2;
> +			}
> +		}
> +	}
> +
> +	if (!vsi->frame.uh.frame_type || vsi->frame.uh.intra_only)
> +		return;
> +
> +	/* 8.4.4 Non coefficient probability adaptation process */
> +
> +	for (i = 0; i < 4; i++) {
> +		ctx->intra_inter_prob[i] =
> +			vdec_vp9_slice_adapt_prob(pre_ctx->intra_inter_prob[i],
> +						  counts->intra_inter[i][0],
> +						  counts->intra_inter[i][1]);
> +	}
> +
> +	for (i = 0; i < 5; i++) {
> +		ctx->comp_inter_prob[i] =
> +			vdec_vp9_slice_adapt_prob(pre_ctx->comp_inter_prob[i],
> +						  counts->comp_inter[i][0],
> +						  counts->comp_inter[i][1]);
> +	}
> +
> +	for (i = 0; i < 5; i++) {
> +		ctx->comp_ref_prob[i] =
> +			vdec_vp9_slice_adapt_prob(pre_ctx->comp_ref_prob[i],
> +						  counts->comp_ref[i][0],
> +						  counts->comp_ref[i][1]);
> +	}
> +
> +	for (i = 0; i < 5; i++) {
> +		for (j = 0; j < 2; j++) {
> +			ctx->single_ref_prob[i][j] =
> +				vdec_vp9_slice_adapt_prob(pre_ctx->single_ref_prob[i][j],
> +							  counts->single_ref[i][j][0],
> +							  counts->single_ref[i][j][1]);
> +		}
> +	}
> +
> +	for (i = 0; i < 7; i++) {
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_inter_mode_tree,
> +					   &pre_ctx->inter_mode_probs[i][0],
> +					   &counts->inter_mode[i][0],
> +					   &ctx->inter_mode_probs[i][0]);
> +	}
> +
> +	for (i = 0; i < 4; i++) {
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_intra_mode_tree,
> +					   &pre_ctx->y_mode_prob[i][0],
> +					   &counts->y_mode[i][0],
> +					   &ctx->y_mode_prob[i][0]);
> +	}
> +
> +	for (i = 0; i < 10; i++) {
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_intra_mode_tree,
> +					   &pre_ctx->uv_mode_prob[i][0],
> +					   &counts->uv_mode[i][0],
> +					   &ctx->uv_mode_prob[i][0]);
> +	}
> +
> +	for (i = 0; i < 16; i++) {
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_partition_tree,
> +					   &pre_ctx->partition_prob[i][0],
> +					   &counts->partition[i][0],
> +					   &ctx->partition_prob[i][0]);
> +	}
> +
> +	if (vsi->frame.uh.interpolation_filter == 4) {
> +		for (i = 0; i < 4; i++) {
> +			vdec_vp9_slice_adapt_probs(vdec_vp9_slice_switchable_interp_tree,
> +						   &pre_ctx->switch_interp_prob[i][0],
> +						   &counts->switchable_interp[i][0],
> +						   &ctx->switch_interp_prob[i][0]);
> +		}
> +	}
> +
> +	if (vsi->frame.ch.tx_mode == 4) {
> +		for (i = 0; i < 2; i++) {
> +			ctx->tx_p8x8[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p8x8[i][0],
> +								       counts->tx_p8x8[i][0],
> +								       counts->tx_p8x8[i][1]);
> +			ctx->tx_p16x16[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p16x16[i][0],
> +									 counts->tx_p16x16[i][0],
> +									 counts->tx_p16x16[i][1] +
> +									 counts->tx_p16x16[i][2]);
> +			ctx->tx_p16x16[i][1] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p16x16[i][1],
> +									 counts->tx_p16x16[i][1],
> +									 counts->tx_p16x16[i][2]);
> +			ctx->tx_p32x32[i][0] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][0],
> +									 counts->tx_p32x32[i][0],
> +									 counts->tx_p32x32[i][1] +
> +									 counts->tx_p32x32[i][2] +
> +									 counts->tx_p32x32[i][3]);
> +			ctx->tx_p32x32[i][1] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][1],
> +									 counts->tx_p32x32[i][1],
> +									 counts->tx_p32x32[i][2] +
> +									 counts->tx_p32x32[i][3]);
> +			ctx->tx_p32x32[i][2] = vdec_vp9_slice_adapt_prob(pre_ctx->tx_p32x32[i][2],
> +									 counts->tx_p32x32[i][2],
> +									 counts->tx_p32x32[i][3]);
> +		}
> +	}
> +
> +	for (i = 0; i < 3; i++) {
> +		ctx->skip_probs[i] = vdec_vp9_slice_adapt_prob(pre_ctx->skip_probs[i],
> +							       counts->skip[i][0],
> +							       counts->skip[i][1]);
> +	}
> +
> +	vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_joint_tree,
> +				   &pre_ctx->joint[0],
> +				   &counts->joint[0],
> +				   &ctx->joint[0]);
> +
> +	for (i = 0; i < 2; i++) {
> +		ctx->sign_classes[i].sign = vdec_vp9_slice_adapt_prob(pre_ctx->sign_classes[i].sign,
> +								      counts->mvcomp[i].sign[0],
> +								      counts->mvcomp[i].sign[1]);
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_class_tree,
> +					   &pre_ctx->sign_classes[i].classes[0],
> +					   &counts->mvcomp[i].classes[0],
> +					   &ctx->sign_classes[i].classes[0]);
> +
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_class0_tree,
> +					   pre_ctx->class0_bits[i].class0,
> +					   counts->mvcomp[i].class0,
> +					   ctx->class0_bits[i].class0);
> +		for (j = 0; j < 10; j++) {
> +			ctx->class0_bits[i].bits[j] =
> +				vdec_vp9_slice_adapt_prob(pre_ctx->class0_bits[i].bits[j],
> +							  counts->mvcomp[i].bits[j][0],
> +							  counts->mvcomp[i].bits[j][1]);
> +		}
> +
> +		for (j = 0; j < 2; ++j) {
> +			vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_fp_tree,
> +						   pre_ctx->class0_fp_hp[i].class0_fp[j],
> +						   counts->mvcomp[i].class0_fp[j],
> +						   ctx->class0_fp_hp[i].class0_fp[j]);
> +		}
> +		vdec_vp9_slice_adapt_probs(vdec_vp9_slice_mv_fp_tree,
> +					   pre_ctx->class0_fp_hp[i].fp,
> +					   counts->mvcomp[i].fp,
> +					   ctx->class0_fp_hp[i].fp);
> +		if (vsi->frame.uh.allow_high_precision_mv) {
> +			ctx->class0_fp_hp[i].class0_hp =
> +				vdec_vp9_slice_adapt_prob(pre_ctx->class0_fp_hp[i].class0_hp,
> +							  counts->mvcomp[i].class0_hp[0],
> +							  counts->mvcomp[i].class0_hp[1]);
> +			ctx->class0_fp_hp[i].hp =
> +				vdec_vp9_slice_adapt_prob(pre_ctx->class0_fp_hp[i].hp,
> +							  counts->mvcomp[i].hp[0],
> +							  counts->mvcomp[i].hp[1]);
> +		}
> +	}
> +}
> +
> +static int vdec_vp9_slice_update_prob(struct vdec_vp9_slice_instance *instance,
> +				      struct vdec_vp9_slice_vsi *vsi)
> +{
> +	struct vdec_vp9_slice_frame_ctx *pre_frame_ctx;
> +	struct vdec_vp9_slice_frame_ctx *frame_ctx;
> +	struct vdec_vp9_slice_frame_counts *counts;
> +	struct vdec_vp9_slice_uncompressed_header *uh;
> +
> +	uh = &vsi->frame.uh;
> +	pre_frame_ctx = &instance->frame_ctx[uh->frame_context_idx];
> +	frame_ctx = (struct vdec_vp9_slice_frame_ctx *)instance->prob.va;
> +	counts = (struct vdec_vp9_slice_frame_counts *)instance->counts.va;
> +
> +	if (!uh->refresh_frame_context)
> +		return 0;
> +
> +	if (!uh->frame_parallel_decoding_mode) {
> +		/* uh->error_resilient_mode must be 0 */
> +		vdec_vp9_slice_adapt_table(vsi, frame_ctx,
> +					   /* use default frame ctx? */
> +					   instance->dirty[uh->frame_context_idx] ?
> +					   pre_frame_ctx :
> +					   vdec_vp9_slice_default_frame_ctx,
> +					   counts);
> +	}
> +
> +	memcpy(pre_frame_ctx, frame_ctx, sizeof(*frame_ctx));
> +	instance->dirty[uh->frame_context_idx] = 1;
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_update_lat(struct vdec_vp9_slice_instance *instance,
> +				     struct vdec_lat_buf *lat_buf,
> +				     struct vdec_vp9_slice_pfc *pfc)
> +{
> +	struct vdec_vp9_slice_vsi *vsi;
> +
> +	vsi = &pfc->vsi;
> +	memcpy(&pfc->state[0], &vsi->state, sizeof(vsi->state));
> +
> +	mtk_vcodec_debug(instance, "Frame %u LAT CRC 0x%08x\n",
> +			 pfc->seq, vsi->state.crc[0]);
> +
> +	/* buffer full, need to re-decode */
> +	if (vsi->state.full) {
> +		/* buffer not enough */
> +		if (vsi->trans.dma_addr_end - vsi->trans.dma_addr ==
> +			vsi->ube.size)
> +			return -ENOMEM;
> +		return -EAGAIN;
> +	}
> +
> +	vdec_vp9_slice_update_prob(instance, vsi);
> +
> +	instance->width = vsi->frame.uh.frame_width;
> +	instance->height = vsi->frame.uh.frame_height;
> +	instance->frame_type = vsi->frame.uh.frame_type;
> +	instance->show_frame = vsi->frame.uh.show_frame;
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_setup_core_to_dst_buf(struct vdec_vp9_slice_instance *instance,
> +						struct vdec_lat_buf *lat_buf)
> +{
> +	struct vb2_v4l2_buffer *src;
> +	struct vb2_v4l2_buffer *dst;
> +
> +	dst = v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx);
> +	if (!dst)
> +		return -EINVAL;
> +
> +	src = &lat_buf->ts_info;
> +	dst->vb2_buf.timestamp = src->vb2_buf.timestamp;
> +	dst->timecode = src->timecode;
> +	dst->field = src->field;
> +	dst->flags = src->flags;
> +	dst->vb2_buf.copied_timestamp = src->vb2_buf.copied_timestamp;
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_setup_core_buffer(struct vdec_vp9_slice_instance *instance,
> +					    struct vdec_vp9_slice_pfc *pfc,
> +					    struct vdec_vp9_slice_vsi *vsi,
> +					    struct vdec_fb *fb,
> +					    struct vdec_lat_buf *lat_buf)
> +{
> +	struct vb2_buffer *vb;
> +	struct vb2_queue *vq;
> +	struct vdec_vp9_slice_reference *ref;
> +	int plane;
> +	int size;
> +	int idx;
> +	int w;
> +	int h;
> +	int i;
> +
> +	plane = instance->ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes;
> +	w = vsi->frame.uh.frame_width;
> +	h = vsi->frame.uh.frame_height;
> +	size = ALIGN(w, 64) * ALIGN(h, 64);
> +
> +	/* frame buffer */
> +	vsi->fb.y.dma_addr = fb->base_y.dma_addr;
> +	if (plane == 1)
> +		vsi->fb.c.dma_addr = fb->base_y.dma_addr + size;
> +	else
> +		vsi->fb.c.dma_addr = fb->base_c.dma_addr;
> +
> +	/* reference buffers */
> +	vq = v4l2_m2m_get_vq(instance->ctx->m2m_ctx,
> +			     V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
> +	if (!vq)
> +		return -EINVAL;
> +
> +	/* get current output buffer */
> +	vb = &v4l2_m2m_next_dst_buf(instance->ctx->m2m_ctx)->vb2_buf;
> +	if (!vb)
> +		return -EINVAL;
> +
> +	/* update internal buffer's width/height */
> +	for (i = 0; i < vq->num_buffers; i++) {
> +		if (vb == vq->bufs[i]) {
> +			instance->dpb[i].width = w;
> +			instance->dpb[i].height = h;
> +			break;
> +		}
> +	}
> +
> +	/*
> +	 * get buffer's width/height from instance
> +	 * get buffer address from vb2buf
> +	 */
> +	for (i = 0; i < 3; i++) {
> +		ref = &vsi->frame.ref[i];
> +		idx = vb2_find_timestamp(vq, pfc->ref_idx[i], 0);
> +		if (idx < 0) {
> +			ref->frame_width = w;
> +			ref->frame_height = h;
> +			memset(&vsi->ref[i], 0, sizeof(vsi->ref[i]));
> +		} else {
> +			ref->frame_width = instance->dpb[idx].width;
> +			ref->frame_height = instance->dpb[idx].height;
> +			vb = vq->bufs[idx];
> +			vsi->ref[i].y.dma_addr =
> +				vb2_dma_contig_plane_dma_addr(vb, 0);
> +			if (plane == 1)
> +				vsi->ref[i].c.dma_addr =
> +					vsi->ref[i].y.dma_addr + size;
> +			else
> +				vsi->ref[i].c.dma_addr =
> +					vb2_dma_contig_plane_dma_addr(vb, 1);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_setup_core(struct vdec_vp9_slice_instance *instance,
> +				     struct vdec_fb *fb,
> +				     struct vdec_lat_buf *lat_buf,
> +				     struct vdec_vp9_slice_pfc *pfc)
> +{
> +	struct vdec_vp9_slice_vsi *vsi = &pfc->vsi;
> +	int ret;
> +
> +	vdec_vp9_slice_setup_state(vsi);
> +
> +	ret = vdec_vp9_slice_setup_core_to_dst_buf(instance, lat_buf);
> +	if (ret)
> +		goto err;
> +
> +	ret = vdec_vp9_slice_setup_core_buffer(instance, pfc, vsi, fb, lat_buf);
> +	if (ret)
> +		goto err;
> +
> +	vdec_vp9_slice_setup_seg_buffer(instance, vsi, &instance->seg[1]);
> +
> +	return 0;
> +
> +err:
> +	return ret;
> +}
> +
> +static int vdec_vp9_slice_update_core(struct vdec_vp9_slice_instance *instance,
> +				      struct vdec_lat_buf *lat_buf,
> +				      struct vdec_vp9_slice_pfc *pfc)
> +{
> +	struct vdec_vp9_slice_vsi *vsi;
> +
> +	vsi = &pfc->vsi;
> +	memcpy(&pfc->state[1], &vsi->state, sizeof(vsi->state));
> +
> +	mtk_vcodec_debug(instance, "Frame %u Y_CRC %08x %08x %08x %08x\n",
> +			 pfc->seq,
> +			 vsi->state.crc[0], vsi->state.crc[1],
> +			 vsi->state.crc[2], vsi->state.crc[3]);
> +	mtk_vcodec_debug(instance, "Frame %u C_CRC %08x %08x %08x %08x\n",
> +			 pfc->seq,
> +			 vsi->state.crc[4], vsi->state.crc[5],
> +			 vsi->state.crc[6], vsi->state.crc[7]);
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_init(struct mtk_vcodec_ctx *ctx)
> +{
> +	struct vdec_vp9_slice_instance *instance;
> +	struct vdec_vp9_slice_init_vsi *vsi;
> +	int ret;
> +
> +	instance = kzalloc(sizeof(*instance), GFP_KERNEL);
> +	if (!instance)
> +		return -ENOMEM;
> +
> +	instance->ctx = ctx;
> +	instance->vpu.id = SCP_IPI_VDEC_LAT;
> +	instance->vpu.core_id = SCP_IPI_VDEC_CORE;
> +	instance->vpu.ctx = ctx;
> +	instance->vpu.codec_type = ctx->current_codec;
> +
> +	ret = vpu_dec_init(&instance->vpu);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "failed to init vpu dec, ret %d\n", ret);
> +		goto error_vpu_init;
> +	}
> +
> +	/* init vsi and global flags */
> +
> +	vsi = instance->vpu.vsi;
> +	if (!vsi) {
> +		mtk_vcodec_err(instance, "failed to get VP9 vsi\n");
> +		ret = -EINVAL;
> +		goto error_vsi;
> +	}
> +	instance->init_vsi = vsi;
> +	instance->core_vsi = mtk_vcodec_fw_map_dm_addr(ctx->dev->fw_handler,
> +						       (u32)vsi->core_vsi);
> +	if (!instance->core_vsi) {
> +		mtk_vcodec_err(instance, "failed to get VP9 core vsi\n");
> +		ret = -EINVAL;
> +		goto error_vsi;
> +	}
> +
> +	instance->irq = 1;
> +
> +	ret = vdec_vp9_slice_init_default_frame_ctx(instance);
> +	if (ret)
> +		goto error_default_frame_ctx;
> +
> +	ctx->drv_handle = instance;
> +
> +	return 0;
> +
> +error_default_frame_ctx:
> +error_vsi:
> +	vpu_dec_deinit(&instance->vpu);
> +error_vpu_init:
> +	kfree(instance);
> +	return ret;
> +}
> +
> +static void vdec_vp9_slice_deinit(void *h_vdec)
> +{
> +	struct vdec_vp9_slice_instance *instance = h_vdec;
> +
> +	if (!instance)
> +		return;
> +
> +	vpu_dec_deinit(&instance->vpu);
> +	vdec_vp9_slice_free_working_buffer(instance);
> +	vdec_msg_queue_deinit(&instance->ctx->msg_queue, instance->ctx);
> +	kfree(instance);
> +}
> +
> +static int vdec_vp9_slice_flush(void *h_vdec, struct mtk_vcodec_mem *bs,
> +				struct vdec_fb *fb, bool *res_chg)
> +{
> +	struct vdec_vp9_slice_instance *instance = h_vdec;
> +
> +	mtk_vcodec_debug(instance, "flush ...\n");
> +
> +	vdec_msg_queue_wait_lat_buf_full(&instance->ctx->msg_queue);
> +	return vpu_dec_reset(&instance->vpu);
> +}
> +
> +static void vdec_vp9_slice_get_pic_info(struct vdec_vp9_slice_instance *instance)
> +{
> +	struct mtk_vcodec_ctx *ctx = instance->ctx;
> +	unsigned int data[3];
> +
> +	mtk_vcodec_debug(instance, "w %u h %u\n",
> +			 ctx->picinfo.pic_w, ctx->picinfo.pic_h);
> +
> +	data[0] = ctx->picinfo.pic_w;
> +	data[1] = ctx->picinfo.pic_h;
> +	data[2] = ctx->capture_fourcc;
> +	vpu_dec_get_param(&instance->vpu, data, 3, GET_PARAM_PIC_INFO);
> +
> +	ctx->picinfo.buf_w = ALIGN(ctx->picinfo.pic_w, 64);
> +	ctx->picinfo.buf_h = ALIGN(ctx->picinfo.pic_h, 64);
> +	ctx->picinfo.fb_sz[0] = instance->vpu.fb_sz[0];
> +	ctx->picinfo.fb_sz[1] = instance->vpu.fb_sz[1];
> +}
> +
> +static void vdec_vp9_slice_get_dpb_size(struct vdec_vp9_slice_instance *instance,
> +					unsigned int *dpb_sz)
> +{
> +	/* refer VP9 specification */
> +	*dpb_sz = 9;
> +}
> +
> +static void vdec_vp9_slice_get_crop_info(struct vdec_vp9_slice_instance *instance,
> +					 struct v4l2_rect *cr)
> +{
> +	struct mtk_vcodec_ctx *ctx = instance->ctx;
> +
> +	cr->left = 0;
> +	cr->top = 0;
> +	cr->width = ctx->picinfo.pic_w;
> +	cr->height = ctx->picinfo.pic_h;
> +
> +	mtk_vcodec_debug(instance, "l=%d, t=%d, w=%d, h=%d\n",
> +			 cr->left, cr->top, cr->width, cr->height);
> +}
> +
> +static int vdec_vp9_slice_get_param(void *h_vdec, enum vdec_get_param_type type, void *out)
> +{
> +	struct vdec_vp9_slice_instance *instance = h_vdec;
> +
> +	switch (type) {
> +	case GET_PARAM_PIC_INFO:
> +		vdec_vp9_slice_get_pic_info(instance);
> +		break;
> +	case GET_PARAM_DPB_SIZE:
> +		vdec_vp9_slice_get_dpb_size(instance, out);
> +		break;
> +	case GET_PARAM_CROP_INFO:
> +		vdec_vp9_slice_get_crop_info(instance, out);
> +		break;
> +	default:
> +		mtk_vcodec_err(instance, "invalid get parameter type=%d\n",
> +			       type);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_lat_decode(void *h_vdec, struct mtk_vcodec_mem *bs,
> +				     struct vdec_fb *fb, bool *res_chg)
> +{
> +	struct vdec_vp9_slice_instance *instance = h_vdec;
> +	struct vdec_lat_buf *lat_buf;
> +	struct vdec_vp9_slice_pfc *pfc;
> +	struct vdec_vp9_slice_vsi *vsi;
> +	struct mtk_vcodec_ctx *ctx;
> +	int ret;
> +
> +	if (!instance || !instance->ctx)
> +		return -EINVAL;
> +	ctx = instance->ctx;
> +
> +	/* init msgQ for the first time */
> +	if (vdec_msg_queue_init(&ctx->msg_queue, ctx,
> +				vdec_vp9_slice_core_decode,
> +				sizeof(*pfc)))
> +		return -ENOMEM;
> +
> +	/* bs NULL means flush decoder */
> +	if (!bs)
> +		return vdec_vp9_slice_flush(h_vdec, bs, fb, res_chg);
> +
> +	lat_buf = vdec_msg_queue_dqbuf(&instance->ctx->msg_queue.lat_ctx);
> +	if (!lat_buf) {
> +		mtk_vcodec_err(instance, "Failed to get VP9 lat buf\n");
> +		return -EBUSY;
> +	}
> +	pfc = (struct vdec_vp9_slice_pfc *)lat_buf->private_data;
> +	if (!pfc)
> +		return -EINVAL;
> +	vsi = &pfc->vsi;
> +
> +	ret = vdec_vp9_slice_setup_lat(instance, bs, lat_buf, pfc);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "Failed to setup VP9 lat ret %d\n", ret);
> +		return ret;
> +	}
> +	vdec_vp9_slice_vsi_to_remote(vsi, instance->vsi);
> +
> +	ret = vpu_dec_start(&instance->vpu, 0, 0);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "Failed to dec VP9 ret %d\n", ret);
> +		return ret;
> +	}
> +
> +	if (instance->irq) {
> +		ret = mtk_vcodec_wait_for_done_ctx(ctx,	MTK_INST_IRQ_RECEIVED,
> +						   WAIT_INTR_TIMEOUT_MS, MTK_VDEC_LAT0);
> +		/* update remote vsi if decode timeout */
> +		if (ret) {
> +			mtk_vcodec_err(instance, "VP9 decode timeout %d\n", ret);
> +			writel(1, &instance->vsi->state.timeout);
> +		}
> +		vpu_dec_end(&instance->vpu);
> +	}
> +
> +	vdec_vp9_slice_vsi_from_remote(vsi, instance->vsi, 0);
> +	ret = vdec_vp9_slice_update_lat(instance, lat_buf, pfc);
> +
> +	/* LAT trans full, no more UBE or decode timeout */
> +	if (ret) {
> +		mtk_vcodec_err(instance, "VP9 decode error: %d\n", ret);
> +		return ret;
> +	}
> +
> +	mtk_vcodec_debug(instance, "lat dma 1 0x%llx 0x%llx\n",
> +			 pfc->vsi.trans.dma_addr, pfc->vsi.trans.dma_addr_end);
> +
> +	vdec_msg_queue_update_ube_wptr(&ctx->msg_queue,
> +				       vsi->trans.dma_addr_end +
> +				       ctx->msg_queue.wdma_addr.dma_addr);
> +	vdec_msg_queue_qbuf(&ctx->dev->msg_queue_core_ctx, lat_buf);
> +
> +	return 0;
> +}
> +
> +static int vdec_vp9_slice_core_decode(struct vdec_lat_buf *lat_buf)
> +{
> +	struct vdec_vp9_slice_instance *instance;
> +	struct vdec_vp9_slice_pfc *pfc;
> +	struct mtk_vcodec_ctx *ctx = NULL;
> +	struct vdec_fb *fb = NULL;
> +	int ret = -EINVAL;
> +
> +	if (!lat_buf)
> +		goto err;
> +
> +	pfc = lat_buf->private_data;
> +	ctx = lat_buf->ctx;
> +	if (!pfc || !ctx)
> +		goto err;
> +
> +	instance = ctx->drv_handle;
> +	if (!instance)
> +		goto err;
> +
> +	fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
> +	if (!fb) {
> +		ret = -EBUSY;
> +		goto err;
> +	}
> +
> +	ret = vdec_vp9_slice_setup_core(instance, fb, lat_buf, pfc);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "vdec_vp9_slice_setup_core\n");
> +		goto err;
> +	}
> +	vdec_vp9_slice_vsi_to_remote(&pfc->vsi, instance->core_vsi);
> +
> +	ret = vpu_dec_core(&instance->vpu);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "vpu_dec_core\n");
> +		goto err;
> +	}
> +
> +	if (instance->irq) {
> +		ret = mtk_vcodec_wait_for_done_ctx(ctx, MTK_INST_IRQ_RECEIVED,
> +						   WAIT_INTR_TIMEOUT_MS, MTK_VDEC_CORE);
> +		/* update remote vsi if decode timeout */
> +		if (ret) {
> +			mtk_vcodec_err(instance, "VP9 core timeout\n");
> +			writel(1, &instance->core_vsi->state.timeout);
> +		}
> +		vpu_dec_core_end(&instance->vpu);
> +	}
> +
> +	vdec_vp9_slice_vsi_from_remote(&pfc->vsi, instance->core_vsi, 1);
> +	ret = vdec_vp9_slice_update_core(instance, lat_buf, pfc);
> +	if (ret) {
> +		mtk_vcodec_err(instance, "vdec_vp9_slice_update_core\n");
> +		goto err;
> +	}
> +
> +	pfc->vsi.trans.dma_addr_end += ctx->msg_queue.wdma_addr.dma_addr;
> +	mtk_vcodec_debug(instance, "core dma_addr_end 0x%llx\n", pfc->vsi.trans.dma_addr_end);
> +	vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, pfc->vsi.trans.dma_addr_end);
> +	ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, 0);
> +
> +	return 0;
> +
> +err:
> +	if (ctx) {
> +		/* always update read pointer */
> +		vdec_msg_queue_update_ube_rptr(&ctx->msg_queue, pfc->vsi.trans.dma_addr_end);
> +
> +		if (fb)
> +			ctx->dev->vdec_pdata->cap_to_disp(ctx, fb, 1);
> +	}
> +	return ret;
> +}
> +
> +const struct vdec_common_if vdec_vp9_slice_lat_if = {
> +	.init		= vdec_vp9_slice_init,
> +	.decode		= vdec_vp9_slice_lat_decode,
> +	.get_param	= vdec_vp9_slice_get_param,
> +	.deinit		= vdec_vp9_slice_deinit,
> +};
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> index 9db9a57da2c1..2d3a45781359 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.c
> @@ -44,6 +44,10 @@ int vdec_if_init(struct mtk_vcodec_ctx *ctx, unsigned int fourcc)
>  		ctx->dec_if = &vdec_vp9_if;
>  		ctx->hw_id = MTK_VDEC_CORE;
>  		break;
> +	case V4L2_PIX_FMT_VP9_FRAME:
> +		ctx->dec_if = &vdec_vp9_slice_lat_if;
> +		ctx->hw_id = MTK_VDEC_LAT0;
> +		break;
>  	default:
>  		return -EINVAL;
>  	}
> diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> index e3adf8f36342..e383a04db7b8 100644
> --- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> +++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
> @@ -60,6 +60,7 @@ extern const struct vdec_common_if vdec_h264_slice_lat_if;
>  extern const struct vdec_common_if vdec_vp8_if;
>  extern const struct vdec_common_if vdec_vp8_slice_if;
>  extern const struct vdec_common_if vdec_vp9_if;
> +extern const struct vdec_common_if vdec_vp9_slice_lat_if;
>  
>  /**
>   * vdec_if_init() - initialize decode driver
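
The single-plane case in vdec_vp9_slice_setup_core_buffer() quoted above
places the chroma data directly after a luma area of
ALIGN(w, 64) * ALIGN(h, 64) bytes. A minimal standalone sketch of that
address arithmetic follows. The sketch_* names are hypothetical and not part
of the driver, and plain integers stand in for dma_addr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static uint64_t sketch_align(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Chroma address for a single-plane capture buffer: the C data starts
 * right after a luma area of ALIGN(w, 64) * ALIGN(h, 64) bytes.
 */
static uint64_t sketch_chroma_addr(uint64_t y_addr, uint32_t w, uint32_t h)
{
	return y_addr + sketch_align(w, 64) * sketch_align(h, 64);
}
```

For a 1920x1080 frame the height is padded to 1088 rows, so the chroma
offset is 1920 * 1088 bytes from the luma base.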



* Re: [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-02-28 21:29   ` Nicolas Dufresne
@ 2022-03-02  1:47     ` yunfei.dong
  2022-06-17  6:46     ` Chen-Yu Tsai
  1 sibling, 0 replies; 36+ messages in thread
From: yunfei.dong @ 2022-03-02  1:47 UTC (permalink / raw)
  To: Nicolas Dufresne, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Hi Nicolas,

Thanks for your comments. I will fix this patch according to your
suggestion.

On Mon, 2022-02-28 at 16:29 -0500, Nicolas Dufresne wrote:
> Hi Yunfei,
> 
> this patch does not work unless userland calls enum_framesizes, which is
> completely optional. See comment and suggestion below.
> 
> Le mercredi 23 février 2022 à 11:39 +0800, Yunfei Dong a écrit :
> > Supported max resolution for different platforms are not the same: 2K
> > or 4K, getting it according to dec_capability.
> > 
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > Reviewed-by: Tzung-Bi Shih<tzungbi@google.com>
> > ---
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 29 +++++++++++--
> > ------
> >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 +++
> >  2 files changed, 21 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > index 130ecef2e766..304f5afbd419 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > @@ -152,13 +152,15 @@ void mtk_vcodec_dec_set_default_params(struct
> > mtk_vcodec_ctx *ctx)
> >  	q_data->coded_height = DFT_CFG_HEIGHT;
> >  	q_data->fmt = ctx->dev->vdec_pdata->default_cap_fmt;
> >  	q_data->field = V4L2_FIELD_NONE;
> > +	ctx->max_width = MTK_VDEC_MAX_W;
> > +	ctx->max_height = MTK_VDEC_MAX_H;
> >  
> >  	v4l_bound_align_image(&q_data->coded_width,
> >  				MTK_VDEC_MIN_W,
> > -				MTK_VDEC_MAX_W, 4,
> > +				ctx->max_width, 4,
> >  				&q_data->coded_height,
> >  				MTK_VDEC_MIN_H,
> > -				MTK_VDEC_MAX_H, 5, 6);
> > +				ctx->max_height, 5, 6);
> >  
> >  	q_data->sizeimage[0] = q_data->coded_width * q_data-
> > >coded_height;
> >  	q_data->bytesperline[0] = q_data->coded_width;
> > @@ -217,7 +219,7 @@ static int vidioc_vdec_subscribe_evt(struct
> > v4l2_fh *fh,
> >  	}
> >  }
> >  
> > -static int vidioc_try_fmt(struct v4l2_format *f,
> > +static int vidioc_try_fmt(struct mtk_vcodec_ctx *ctx, struct
> > v4l2_format *f,
> >  			  const struct mtk_video_fmt *fmt)
> >  {
> >  	struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
> > @@ -225,9 +227,9 @@ static int vidioc_try_fmt(struct v4l2_format
> > *f,
> >  	pix_fmt_mp->field = V4L2_FIELD_NONE;
> >  
> >  	pix_fmt_mp->width =
> > -		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W,
> > MTK_VDEC_MAX_W);
> > +		clamp(pix_fmt_mp->width, MTK_VDEC_MIN_W, ctx-
> > >max_width);
> >  	pix_fmt_mp->height =
> > -		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H,
> > MTK_VDEC_MAX_H);
> > +		clamp(pix_fmt_mp->height, MTK_VDEC_MIN_H, ctx-
> > >max_height);
> >  
> >  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> >  		pix_fmt_mp->num_planes = 1;
> > @@ -245,16 +247,16 @@ static int vidioc_try_fmt(struct v4l2_format
> > *f,
> >  		tmp_h = pix_fmt_mp->height;
> >  		v4l_bound_align_image(&pix_fmt_mp->width,
> >  					MTK_VDEC_MIN_W,
> > -					MTK_VDEC_MAX_W, 6,
> > +					ctx->max_width, 6,
> >  					&pix_fmt_mp->height,
> >  					MTK_VDEC_MIN_H,
> > -					MTK_VDEC_MAX_H, 6, 9);
> > +					ctx->max_height, 6, 9);
> >  
> >  		if (pix_fmt_mp->width < tmp_w &&
> > -			(pix_fmt_mp->width + 64) <= MTK_VDEC_MAX_W)
> > +			(pix_fmt_mp->width + 64) <= ctx->max_width)
> >  			pix_fmt_mp->width += 64;
> >  		if (pix_fmt_mp->height < tmp_h &&
> > -			(pix_fmt_mp->height + 64) <= MTK_VDEC_MAX_H)
> > +			(pix_fmt_mp->height + 64) <= ctx->max_height)
> >  			pix_fmt_mp->height += 64;
> >  
> >  		mtk_v4l2_debug(0,
> > @@ -294,7 +296,7 @@ static int vidioc_try_fmt_vid_cap_mplane(struct
> > file *file, void *priv,
> >  		fmt = mtk_vdec_find_format(f, dec_pdata);
> >  	}
> >  
> > -	return vidioc_try_fmt(f, fmt);
> > +	return vidioc_try_fmt(ctx, f, fmt);
> >  }
> >  
> >  static int vidioc_try_fmt_vid_out_mplane(struct file *file, void
> > *priv,
> > @@ -317,7 +319,7 @@ static int vidioc_try_fmt_vid_out_mplane(struct
> > file *file, void *priv,
> >  		return -EINVAL;
> >  	}
> >  
> > -	return vidioc_try_fmt(f, fmt);
> > +	return vidioc_try_fmt(ctx, f, fmt);
> >  }
> >  
> >  static int vidioc_vdec_g_selection(struct file *file, void *priv,
> > @@ -445,7 +447,7 @@ static int vidioc_vdec_s_fmt(struct file *file,
> > void *priv,
> >  		return -EINVAL;
> >  
> >  	q_data->fmt = fmt;
> > -	vidioc_try_fmt(f, q_data->fmt);
> > +	vidioc_try_fmt(ctx, f, q_data->fmt);
> >  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> >  		q_data->sizeimage[0] = pix_mp->plane_fmt[0].sizeimage;
> >  		q_data->coded_width = pix_mp->width;
> > @@ -545,6 +547,9 @@ static int vidioc_enum_framesizes(struct file
> > *file, void *priv,
> >  				fsize->stepwise.min_height,
> >  				fsize->stepwise.max_height,
> >  				fsize->stepwise.step_height);
> > +
> > +		ctx->max_width = fsize->stepwise.max_width;
> > +		ctx->max_height = fsize->stepwise.max_height;
> 
> The spec does not require calling enum_framesizes, so changing the
> maximum here is incorrect (and fails with GStreamer). If userland never
> enumerates the framesizes, the resolution gets limited to 1080p.
> 
> As this only depends on the OUTPUT format and the device being open()
> (the condition being dec_capability being set and the OUTPUT format
> being known / not VP8), you could initialize the ctx max inside
> s_fmt(OUTPUT) instead, which is a mandatory call. I have tested this
> change to verify this:
> 
I will fix it as you suggested, thanks.

Best Regards,
Yunfei Dong
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> index 044e3dfbdd8c..3e7c571526a4 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> @@ -484,6 +484,14 @@ static int vidioc_vdec_s_fmt(struct file *file,
> void *priv,
>  	if (fmt == NULL)
>  		return -EINVAL;
>  
> +	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE &&
> +	    !(ctx->dev->dec_capability & VCODEC_CAPABILITY_4K_DISABLED)
> &&
> +	    fmt->fourcc != V4L2_PIX_FMT_VP8_FRAME) {
> +		mtk_v4l2_debug(3, "4K is enabled");
> +		ctx->max_width = VCODEC_DEC_4K_CODED_WIDTH;
> +		ctx->max_height = VCODEC_DEC_4K_CODED_HEIGHT;
> +	}
> +
>  	q_data->fmt = fmt;
>  	vidioc_try_fmt(ctx, f, q_data->fmt);
>  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> @@ -574,15 +582,9 @@ static int vidioc_enum_framesizes(struct file
> *file, void *priv,
>  
>  		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
>  		fsize->stepwise = dec_pdata-
> >vdec_framesizes[i].stepwise;
> -		if (!(ctx->dev->dec_capability &
> -				VCODEC_CAPABILITY_4K_DISABLED) &&
> -				fsize->pixel_format !=
> V4L2_PIX_FMT_VP8_FRAME) {
> -			mtk_v4l2_debug(3, "4K is enabled");
> -			fsize->stepwise.max_width =
> -					VCODEC_DEC_4K_CODED_WIDTH;
> -			fsize->stepwise.max_height =
> -					VCODEC_DEC_4K_CODED_HEIGHT;
> -		}
> +		fsize->stepwise.max_width = ctx->max_width;
> +		fsize->stepwise.max_height = ctx->max_height;
> +
>  		mtk_v4l2_debug(1, "%x, %d %d %d %d %d %d",
>  				ctx->dev->dec_capability,
>  				fsize->stepwise.min_width,
> @@ -592,8 +594,6 @@ static int vidioc_enum_framesizes(struct file
> *file, void *priv,
>  				fsize->stepwise.max_height,
>  				fsize->stepwise.step_height);
>  
> -		ctx->max_width = fsize->stepwise.max_width;
> -		ctx->max_height = fsize->stepwise.max_height;
>  		return 0;
>  	}
>  
> 
> 
> >  		return 0;
> >  	}
> >  
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > index bb7b8e914d24..6d27e4d41ede 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > @@ -284,6 +284,8 @@ struct vdec_pic_info {
> >   *	  mtk_video_dec_buf.
> >   * @hw_id: hardware index used to identify different hardware.
> >   *
> > + * @max_width: hardware supported max width
> > + * @max_height: hardware supported max height
> >   * @msg_queue: msg queue used to store lat buffer information.
> >   */
> >  struct mtk_vcodec_ctx {
> > @@ -329,6 +331,8 @@ struct mtk_vcodec_ctx {
> >  	struct mutex lock;
> >  	int hw_id;
> >  
> > +	unsigned int max_width;
> > +	unsigned int max_height;
> >  	struct vdec_msg_queue msg_queue;
> >  };
> >  
> 
> 
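
The point of the review above is that the per-context limits have to be
raised on a path userland always takes (S_FMT on the OUTPUT queue), not
inside the optional VIDIOC_ENUM_FRAMESIZES handler. A minimal standalone
sketch of that ordering follows. The sketch_* names, the capability bit and
the numeric limits are illustrative assumptions, not the driver's
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits only; the real values live in the driver headers. */
#define SKETCH_DEFAULT_MAX_W 2048u
#define SKETCH_DEFAULT_MAX_H 1088u
#define SKETCH_4K_MAX_W      4096u
#define SKETCH_4K_MAX_H      2304u

/* Hypothetical bit standing in for VCODEC_CAPABILITY_4K_DISABLED. */
#define SKETCH_CAP_4K_DISABLED (1u << 4)

struct sketch_ctx {
	uint32_t dec_capability;
	unsigned int max_width;
	unsigned int max_height;
};

/* Mirrors the defaults assigned in mtk_vcodec_dec_set_default_params(). */
static void sketch_set_default_params(struct sketch_ctx *ctx)
{
	ctx->max_width = SKETCH_DEFAULT_MAX_W;
	ctx->max_height = SKETCH_DEFAULT_MAX_H;
}

/*
 * Runs on the mandatory S_FMT(OUTPUT) path: raise the limits when 4K is
 * not disabled by the firmware and the coded format is not VP8.
 */
static void sketch_s_fmt_output(struct sketch_ctx *ctx, bool is_vp8)
{
	if (!(ctx->dec_capability & SKETCH_CAP_4K_DISABLED) && !is_vp8) {
		ctx->max_width = SKETCH_4K_MAX_W;
		ctx->max_height = SKETCH_4K_MAX_H;
	}
}

/* Same clamping step as vidioc_try_fmt(), reduced to one dimension. */
static unsigned int sketch_clamp(unsigned int v, unsigned int lo,
				 unsigned int hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	return v > hi ? hi : v;
}
```

With only the defaults applied, a requested width of 3840 is clamped to the
default maximum. Once sketch_s_fmt_output() has run for a non-VP8 format on
4K-capable hardware, the same request passes through unchanged, whether or
not frame sizes were ever enumerated.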



* Re: [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp
  2022-03-01 14:44   ` Nicolas Dufresne
@ 2022-03-02  2:26     ` yunfei.dong
  0 siblings, 0 replies; 36+ messages in thread
From: yunfei.dong @ 2022-03-02  2:26 UTC (permalink / raw)
  To: Nicolas Dufresne, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Hi Nicolas,

Thanks for your suggestion.
On Tue, 2022-03-01 at 09:44 -0500, Nicolas Dufresne wrote:
> Thanks for your patch, though perhaps it could be improved, see
> comment below.
> 
> Le mercredi 23 février 2022 à 11:39 +0800, Yunfei Dong a écrit :
> > Different capture buffer format has different buffer size, need to get
> > real buffer size according to buffer type from scp.
> > 
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > ---
> >  .../media/platform/mtk-vcodec/vdec_ipi_msg.h  | 36 ++++++++++++++
> >  .../media/platform/mtk-vcodec/vdec_vpu_if.c   | 49
> > +++++++++++++++++++
> >  .../media/platform/mtk-vcodec/vdec_vpu_if.h   | 15 ++++++
> >  3 files changed, 100 insertions(+)
> > 
> > diff --git a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> > b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> > index bf54d6d9a857..47070be2a991 100644
> > --- a/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> > +++ b/drivers/media/platform/mtk-vcodec/vdec_ipi_msg.h
> > @@ -20,6 +20,7 @@ enum vdec_ipi_msgid {
> >  	AP_IPIMSG_DEC_RESET = 0xA004,
> >  	AP_IPIMSG_DEC_CORE = 0xA005,
> >  	AP_IPIMSG_DEC_CORE_END = 0xA006,
> > +	AP_IPIMSG_DEC_GET_PARAM = 0xA007,
> >  
> >  	VPU_IPIMSG_DEC_INIT_ACK = 0xB000,
> >  	VPU_IPIMSG_DEC_START_ACK = 0xB001,
> > @@ -28,6 +29,7 @@ enum vdec_ipi_msgid {
> >  	VPU_IPIMSG_DEC_RESET_ACK = 0xB004,
> >  	VPU_IPIMSG_DEC_CORE_ACK = 0xB005,
> >  	VPU_IPIMSG_DEC_CORE_END_ACK = 0xB006,
> > +	VPU_IPIMSG_DEC_GET_PARAM_ACK = 0xB007,
> >  };
> >  
> >  /**
> > @@ -114,4 +116,38 @@ struct vdec_vpu_ipi_init_ack {
> >  	uint32_t inst_id;
> >  };
> >  
> > +/**
> > + * struct vdec_ap_ipi_get_param - for AP_IPIMSG_DEC_GET_PARAM
> > + * @msg_id	: AP_IPIMSG_DEC_GET_PARAM
> > + * @inst_id     : instance ID. Used if the ABI version >= 2.
> > + * @data	: picture information
> > + * @param_type	: get param type
> > + * @codec_type	: Codec fourcc
> > + */
> > +struct vdec_ap_ipi_get_param {
> > +	u32 msg_id;
> > +	u32 inst_id;
> > +	u32 data[4];
> > +	u32 param_type;
> > +	u32 codec_type;
> > +};
> > +
> > +/**
> > + * struct vdec_vpu_ipi_get_param_ack - for
> > VPU_IPIMSG_DEC_GET_PARAM_ACK
> > + * @msg_id	: VPU_IPIMSG_DEC_GET_PARAM_ACK
> > + * @status	: VPU execution result
> > + * @ap_inst_addr	: AP vcodec_vpu_inst instance address
> > + * @data     : picture information from SCP.
> > + * @param_type	: get param type
> > + * @reserved : reserved param
> > + */
> > +struct vdec_vpu_ipi_get_param_ack {
> > +	u32 msg_id;
> > +	s32 status;
> > +	u64 ap_inst_addr;
> > +	u32 data[4];
> > +	u32 param_type;
> > +	u32 reserved;
> > +};
> > +
> >  #endif
> > diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> > b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> > index 7210061c772f..35f4d5583084 100644
> > --- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> > +++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
> > @@ -6,6 +6,7 @@
> >  
> >  #include "mtk_vcodec_drv.h"
> >  #include "mtk_vcodec_util.h"
> > +#include "vdec_drv_if.h"
> >  #include "vdec_ipi_msg.h"
> >  #include "vdec_vpu_if.h"
> >  #include "mtk_vcodec_fw.h"
> > @@ -54,6 +55,26 @@ static void handle_init_ack_msg(const struct
> > vdec_vpu_ipi_init_ack *msg)
> >  	}
> >  }
> >  
> > +static void handle_get_param_msg_ack(const struct
> > vdec_vpu_ipi_get_param_ack *msg)
> > +{
> > +	struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
> > +					(unsigned long)msg-
> > >ap_inst_addr;
> > +
> > +	mtk_vcodec_debug(vpu, "+ ap_inst_addr = 0x%llx", msg-
> > >ap_inst_addr);
> > +
> > +	/* param_type is enum vdec_get_param_type */
> > +	switch (msg->param_type) {
> > +	case GET_PARAM_PIC_INFO:
> > +		vpu->fb_sz[0] = msg->data[0];
> > +		vpu->fb_sz[1] = msg->data[1];
> > +		break;
> > +	default:
> > +		mtk_vcodec_err(vpu, "invalid get param type=%d", msg-
> > >param_type);
> > +		vpu->failure = 1;
> > +		break;
> > +	}
> > +}
> > +
> >  /*
> >   * vpu_dec_ipi_handler - Handler for VPU ipi message.
> >   *
> > @@ -89,6 +110,9 @@ static void vpu_dec_ipi_handler(void *data,
> > unsigned int len, void *priv)
> >  		case VPU_IPIMSG_DEC_CORE_END_ACK:
> >  			break;
> >  
> > +		case VPU_IPIMSG_DEC_GET_PARAM_ACK:
> > +			handle_get_param_msg_ack(data);
> > +			break;
> >  		default:
> >  			mtk_vcodec_err(vpu, "invalid msg=%X", msg-
> > >msg_id);
> >  			break;
> > @@ -217,6 +241,31 @@ int vpu_dec_start(struct vdec_vpu_inst *vpu,
> > uint32_t *data, unsigned int len)
> >  	return err;
> >  }
> >  
> > +int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
> > +		      unsigned int len, unsigned int param_type)
> > +{
> > +	struct vdec_ap_ipi_get_param msg;
> > +	int err;
> > +
> > +	mtk_vcodec_debug_enter(vpu);
> > +
> > +	if (len > ARRAY_SIZE(msg.data)) {
> > +		mtk_vcodec_err(vpu, "invalid len = %d\n", len);
> > +		return -EINVAL;
> > +	}
> > +
> > +	memset(&msg, 0, sizeof(msg));
> > +	msg.msg_id = AP_IPIMSG_DEC_GET_PARAM;
> > +	msg.inst_id = vpu->inst_id;
> > +	memcpy(msg.data, data, sizeof(unsigned int) * len);
> > +	msg.param_type = param_type;
> > +	msg.codec_type = vpu->codec_type;
> > +
> > +	err = vcodec_vpu_send_msg(vpu, (void *)&msg, sizeof(msg));
> > +	mtk_vcodec_debug(vpu, "- ret=%d", err);
> > +	return err;
> > +}
> > +
> >  int vpu_dec_core(struct vdec_vpu_inst *vpu)
> >  {
> >  	return vcodec_send_ap_ipi(vpu, AP_IPIMSG_DEC_CORE);
> > diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> > b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> > index 4cb3c7f5a3ad..d1feba41dd39 100644
> > --- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> > +++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.h
> > @@ -28,6 +28,8 @@ struct mtk_vcodec_ctx;
> >   * @wq          : wait queue to wait VPU message ack
> >   * @handler     : ipi handler for each decoder
> >   * @codec_type     : use codec type to separate different codecs
> > + * @capture_type    : used capture type to separate different
> > capture format
> > + * @fb_sz  : frame buffer size of each plane
> >   */
> >  struct vdec_vpu_inst {
> >  	int id;
> > @@ -42,6 +44,8 @@ struct vdec_vpu_inst {
> >  	wait_queue_head_t wq;
> >  	mtk_vcodec_ipi_handler handler;
> >  	unsigned int codec_type;
> > +	unsigned int capture_type;
> 
> This structure member is added in this patch, but never set or used.
> 
This member will be used in patches 13/14/15 to record the capture type.
I will move it to patch 13, where it is first used.

Best Regards,
Yunfei Dong
> > +	unsigned int fb_sz[2];
> >  };
> >  
> >  /**
> > @@ -104,4 +108,15 @@ int vpu_dec_core(struct vdec_vpu_inst *vpu);
> >   */
> >  int vpu_dec_core_end(struct vdec_vpu_inst *vpu);
> >  
> > +/**
> > + * vpu_dec_get_param - get param from scp
> > + *
> > + * @vpu : instance for vdec_vpu_inst
> > + * @data: meta data to pass bitstream info to VPU decoder
> > + * @len : meta data length
> > + * @param_type : get param type
> > + */
> > +int vpu_dec_get_param(struct vdec_vpu_inst *vpu, uint32_t *data,
> > +		      unsigned int len, unsigned int param_type);
> > +
> >  #endif
> 
> 
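
vpu_dec_get_param() in the patch above copies a caller-supplied array into
the fixed data[4] payload of the IPI request, and rejects lengths that would
overflow it. A minimal standalone sketch of that bounded copy follows. The
struct layout is taken from struct vdec_ap_ipi_get_param in the patch; the
sketch_* names are hypothetical, and the inst_id and codec_type fields are
left zeroed here for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Layout copied from struct vdec_ap_ipi_get_param in the quoted patch. */
struct sketch_get_param_msg {
	uint32_t msg_id;
	uint32_t inst_id;
	uint32_t data[4];
	uint32_t param_type;
	uint32_t codec_type;
};

/*
 * Fill a get-param request. Returns 0 on success, or -EINVAL when the
 * caller passes more words than the fixed payload can hold.
 */
static int sketch_fill_get_param(struct sketch_get_param_msg *msg,
				 const uint32_t *data, unsigned int len,
				 uint32_t param_type)
{
	assert(msg && data);

	if (len > SKETCH_ARRAY_SIZE(msg->data))
		return -EINVAL;

	memset(msg, 0, sizeof(*msg));
	msg->msg_id = 0xA007; /* AP_IPIMSG_DEC_GET_PARAM in the patch */
	memcpy(msg->data, data, sizeof(msg->data[0]) * len);
	msg->param_type = param_type;
	return 0;
}
```

Zeroing the message before the copy means unused payload words reach the
firmware as 0 when fewer than four values are passed.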



* Re: [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes
  2022-03-01 14:34   ` Nicolas Dufresne
@ 2022-03-04  7:27     ` yunfei.dong
  0 siblings, 0 replies; 36+ messages in thread
From: yunfei.dong @ 2022-03-04  7:27 UTC (permalink / raw)
  To: Nicolas Dufresne, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa
  Cc: George Sun, Xiaoyong Lu, Hsin-Yi Wang, Fritz Koenig,
	Dafna Hirschfeld, Daniel Vetter, dri-devel, Irui Wang, Steve Cho,
	linux-media, devicetree, linux-kernel, linux-arm-kernel,
	srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Hi Nicolas,

Thanks for your suggestion.

On Tue, 2022-03-01 at 09:34 -0500, Nicolas Dufresne wrote:
> Le mercredi 23 février 2022 à 11:40 +0800, Yunfei Dong a écrit :
> > The supported output and capture format types for mt8192 differ from
> > those of mt8183, so the format types need to be obtained according to
> > the decoder capability.
> 
> This patch is both refactoring and changing the behaviour. Can you
> please split the non-functional changes from the functional ones? This
> ensures we can proceed with a good review of the functional changes.
> 
I will split this patch. Thanks.

> regards,
> Nicolas
> 
Best Regards,
Yunfei Dong
> > 
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > ---
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      |   8 +-
> >  .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |  13 +-
> >  .../mtk-vcodec/mtk_vcodec_dec_stateless.c     | 117 +++++++++++++-
> > ----
> >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  13 +-
> >  4 files changed, 107 insertions(+), 44 deletions(-)
> > 
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > index 304f5afbd419..bae43938ee37 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > @@ -26,7 +26,7 @@ mtk_vdec_find_format(struct v4l2_format *f,
> >  	const struct mtk_video_fmt *fmt;
> >  	unsigned int k;
> >  
> > -	for (k = 0; k < dec_pdata->num_formats; k++) {
> > +	for (k = 0; k < *dec_pdata->num_formats; k++) {
> >  		fmt = &dec_pdata->vdec_formats[k];
> >  		if (fmt->fourcc == f->fmt.pix_mp.pixelformat)
> >  			return fmt;
> > @@ -525,7 +525,7 @@ static int vidioc_enum_framesizes(struct file
> > *file, void *priv,
> >  	if (fsize->index != 0)
> >  		return -EINVAL;
> >  
> > -	for (i = 0; i < dec_pdata->num_framesizes; ++i) {
> > +	for (i = 0; i < *dec_pdata->num_framesizes; ++i) {
> >  		if (fsize->pixel_format != dec_pdata-
> > >vdec_framesizes[i].fourcc)
> >  			continue;
> >  
> > @@ -564,7 +564,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc
> > *f, void *priv,
> >  	const struct mtk_video_fmt *fmt;
> >  	int i, j = 0;
> >  
> > -	for (i = 0; i < dec_pdata->num_formats; i++) {
> > +	for (i = 0; i < *dec_pdata->num_formats; i++) {
> >  		if (output_queue &&
> >  		    dec_pdata->vdec_formats[i].type != MTK_FMT_DEC)
> >  			continue;
> > @@ -577,7 +577,7 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc
> > *f, void *priv,
> >  		++j;
> >  	}
> >  
> > -	if (i == dec_pdata->num_formats)
> > +	if (i == *dec_pdata->num_formats)
> >  		return -EINVAL;
> >  
> >  	fmt = &dec_pdata->vdec_formats[i];
> > diff --git a/drivers/media/platform/mtk-
> > vcodec/mtk_vcodec_dec_stateful.c b/drivers/media/platform/mtk-
> > vcodec/mtk_vcodec_dec_stateful.c
> > index 7966c132be8f..3f33beb9c551 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateful.c
> > @@ -37,7 +37,9 @@ static const struct mtk_video_fmt
> > mtk_video_formats[] = {
> >  	},
> >  };
> >  
> > -#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > +static const unsigned int num_supported_formats =
> > +	ARRAY_SIZE(mtk_video_formats);
> > +
> >  #define DEFAULT_OUT_FMT_IDX 0
> >  #define DEFAULT_CAP_FMT_IDX 3
> >  
> > @@ -59,7 +61,8 @@ static const struct mtk_codec_framesizes
> > mtk_vdec_framesizes[] = {
> >  	},
> >  };
> >  
> > -#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > +static const unsigned int num_supported_framesize =
> > +	ARRAY_SIZE(mtk_vdec_framesizes);
> >  
> >  /*
> >   * This function tries to clean all display buffers, the buffers
> > will return
> > @@ -235,7 +238,7 @@ static void mtk_vdec_update_fmt(struct
> > mtk_vcodec_ctx *ctx,
> >  	unsigned int k;
> >  
> >  	dst_q_data = &ctx->q_data[MTK_Q_DATA_DST];
> > -	for (k = 0; k < NUM_FORMATS; k++) {
> > +	for (k = 0; k < num_supported_formats; k++) {
> >  		fmt = &mtk_video_formats[k];
> >  		if (fmt->fourcc == pixelformat) {
> >  			mtk_v4l2_debug(1, "Update cap fourcc(%d ->
> > %d)",
> > @@ -617,11 +620,11 @@ const struct mtk_vcodec_dec_pdata
> > mtk_vdec_8173_pdata = {
> >  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
> >  	.vdec_vb2_ops = &mtk_vdec_frame_vb2_ops,
> >  	.vdec_formats = mtk_video_formats,
> > -	.num_formats = NUM_FORMATS,
> > +	.num_formats = &num_supported_formats,
> >  	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
> >  	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
> >  	.vdec_framesizes = mtk_vdec_framesizes,
> > -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> > +	.num_framesizes = &num_supported_framesize,
> >  	.worker = mtk_vdec_worker,
> >  	.flush_decoder = mtk_vdec_flush_decoder,
> >  	.is_subdev_supported = false,
> > diff --git a/drivers/media/platform/mtk-
> > vcodec/mtk_vcodec_dec_stateless.c b/drivers/media/platform/mtk-
> > vcodec/mtk_vcodec_dec_stateless.c
> > index 6d481410bf89..e51d935bd21d 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_stateless.c
> > @@ -81,33 +81,23 @@ static const struct mtk_stateless_control
> > mtk_stateless_controls[] = {
> >  
> >  #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
> >  
> > -static const struct mtk_video_fmt mtk_video_formats[] = {
> > -	{
> > -		.fourcc = V4L2_PIX_FMT_H264_SLICE,
> > -		.type = MTK_FMT_DEC,
> > -		.num_planes = 1,
> > -	},
> > -	{
> > -		.fourcc = V4L2_PIX_FMT_MM21,
> > -		.type = MTK_FMT_FRAME,
> > -		.num_planes = 2,
> > -	},
> > +static struct mtk_video_fmt mtk_video_formats[2];
> > +static struct mtk_codec_framesizes mtk_vdec_framesizes[1];
> > +
> > +static struct mtk_video_fmt default_out_format;
> > +static struct mtk_video_fmt default_cap_format;
> > +static unsigned int num_formats;
> > +static unsigned int num_framesizes;
> > +
> > +static struct v4l2_frmsize_stepwise stepwise_fhd = {
> > +	.min_width = MTK_VDEC_MIN_W,
> > +	.max_width = MTK_VDEC_MAX_W,
> > +	.step_width = 16,
> > +	.min_height = MTK_VDEC_MIN_H,
> > +	.max_height = MTK_VDEC_MAX_H,
> > +	.step_height = 16
> >  };
> >  
> > -#define NUM_FORMATS ARRAY_SIZE(mtk_video_formats)
> > -#define DEFAULT_OUT_FMT_IDX    0
> > -#define DEFAULT_CAP_FMT_IDX    1
> > -
> > -static const struct mtk_codec_framesizes mtk_vdec_framesizes[] = {
> > -	{
> > -		.fourcc	= V4L2_PIX_FMT_H264_SLICE,
> > -		.stepwise = {  MTK_VDEC_MIN_W, MTK_VDEC_MAX_W, 16,
> > -				MTK_VDEC_MIN_H, MTK_VDEC_MAX_H, 16 },
> > -	},
> > -};
> > -
> > -#define NUM_SUPPORTED_FRAMESIZE ARRAY_SIZE(mtk_vdec_framesizes)
> > -
> >  static void mtk_vdec_stateless_out_to_done(struct mtk_vcodec_ctx
> > *ctx,
> >  					   struct mtk_vcodec_mem *bs,
> > int error)
> >  {
> > @@ -350,6 +340,62 @@ const struct media_device_ops
> > mtk_vcodec_media_ops = {
> >  	.req_queue	= v4l2_m2m_request_queue,
> >  };
> >  
> > +static void mtk_vcodec_add_formats(unsigned int fourcc,
> > +				   struct mtk_vcodec_ctx *ctx)
> > +{
> > +	struct mtk_vcodec_dev *dev = ctx->dev;
> > +	const struct mtk_vcodec_dec_pdata *pdata = dev->vdec_pdata;
> > +	int count_formats = *pdata->num_formats;
> > +	int count_framesizes = *pdata->num_framesizes;
> > +
> > +	switch (fourcc) {
> > +	case V4L2_PIX_FMT_H264_SLICE:
> > +		mtk_video_formats[count_formats].fourcc = fourcc;
> > +		mtk_video_formats[count_formats].type = MTK_FMT_DEC;
> > +		mtk_video_formats[count_formats].num_planes = 1;
> > +
> > +		mtk_vdec_framesizes[count_framesizes].fourcc = fourcc;
> > +		mtk_vdec_framesizes[count_framesizes].stepwise =
> > stepwise_fhd;
> > +		num_framesizes++;
> > +		break;
> > +	case V4L2_PIX_FMT_MM21:
> > +		mtk_video_formats[count_formats].fourcc = fourcc;
> > +		mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
> > +		mtk_video_formats[count_formats].num_planes = 2;
> > +		break;
> > +	default:
> > +		mtk_v4l2_err("Can not add unsupported format type");
> > +		return;
> > +	}
> > +
> > +	num_formats++;
> > +	mtk_v4l2_debug(3, "num_formats: %d num_frames:%d
> > dec_capability: 0x%x",
> > +		       count_formats, count_framesizes, ctx->dev-
> > >dec_capability);
> > +}
> > +
> > +static void mtk_vcodec_get_supported_formats(struct mtk_vcodec_ctx
> > *ctx)
> > +{
> > +	int cap_format_count = 0, out_format_count = 0;
> > +
> > +	if (num_formats && num_framesizes)
> > +		return;
> > +
> > +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_MM21) {
> > +		mtk_vcodec_add_formats(V4L2_PIX_FMT_MM21, ctx);
> > +		cap_format_count++;
> > +	}
> > +	if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_H264_SLICE) {
> > +		mtk_vcodec_add_formats(V4L2_PIX_FMT_H264_SLICE, ctx);
> > +		out_format_count++;
> > +	}
> > +
> > +	if (cap_format_count)
> > +		default_cap_format = mtk_video_formats[cap_format_count
> > - 1];
> > +	if (out_format_count)
> > +		default_out_format =
> > +			mtk_video_formats[cap_format_count +
> > out_format_count - 1];
> > +}
> > +
> >  static void mtk_init_vdec_params(struct mtk_vcodec_ctx *ctx)
> >  {
> >  	struct vb2_queue *src_vq;
> > @@ -360,6 +406,11 @@ static void mtk_init_vdec_params(struct
> > mtk_vcodec_ctx *ctx)
> >  	if (ctx->dev->vdec_pdata->hw_arch != MTK_VDEC_PURE_SINGLE_CORE)
> >  		v4l2_m2m_set_dst_buffered(ctx->m2m_ctx, 1);
> >  
> > +	if (!ctx->dev->vdec_pdata->is_subdev_supported)
> > +		ctx->dev->dec_capability |=
> > +			MTK_VDEC_FORMAT_H264_SLICE |
> > MTK_VDEC_FORMAT_MM21;
> > +	mtk_vcodec_get_supported_formats(ctx);
> > +
> >  	/* Support request api for output plane */
> >  	src_vq->supports_requests = true;
> >  	src_vq->requires_requests = true;
> > @@ -393,11 +444,11 @@ const struct mtk_vcodec_dec_pdata
> > mtk_vdec_8183_pdata = {
> >  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
> >  	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
> >  	.vdec_formats = mtk_video_formats,
> > -	.num_formats = NUM_FORMATS,
> > -	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
> > -	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
> > +	.num_formats = &num_formats,
> > +	.default_out_fmt = &default_out_format,
> > +	.default_cap_fmt = &default_cap_format,
> >  	.vdec_framesizes = mtk_vdec_framesizes,
> > -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> > +	.num_framesizes = &num_framesizes,
> >  	.uses_stateless_api = true,
> >  	.worker = mtk_vdec_worker,
> >  	.flush_decoder = mtk_vdec_flush_decoder,
> > @@ -413,11 +464,11 @@ const struct mtk_vcodec_dec_pdata
> > mtk_lat_sig_core_pdata = {
> >  	.ctrls_setup = mtk_vcodec_dec_ctrls_setup,
> >  	.vdec_vb2_ops = &mtk_vdec_request_vb2_ops,
> >  	.vdec_formats = mtk_video_formats,
> > -	.num_formats = NUM_FORMATS,
> > -	.default_out_fmt = &mtk_video_formats[DEFAULT_OUT_FMT_IDX],
> > -	.default_cap_fmt = &mtk_video_formats[DEFAULT_CAP_FMT_IDX],
> > +	.num_formats = &num_formats,
> > +	.default_out_fmt = &default_out_format,
> > +	.default_cap_fmt = &default_cap_format,
> >  	.vdec_framesizes = mtk_vdec_framesizes,
> > -	.num_framesizes = NUM_SUPPORTED_FRAMESIZE,
> > +	.num_framesizes = &num_framesizes,
> >  	.uses_stateless_api = true,
> >  	.worker = mtk_vdec_worker,
> >  	.flush_decoder = mtk_vdec_flush_decoder,
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > index 9fcaf69549dd..270c73c05285 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
> > @@ -344,6 +344,15 @@ enum mtk_vdec_hw_arch {
> >  	MTK_VDEC_LAT_SINGLE_CORE,
> >  };
> >  
> > +/*
> > + * enum mtk_vdec_format_types - Enumeration used to get supported
> > + *		  format types according to decoder capability
> > + */
> > +enum mtk_vdec_format_types {
> > +	MTK_VDEC_FORMAT_MM21 = 0x20,
> > +	MTK_VDEC_FORMAT_H264_SLICE = 0x100,
> > +};
> > +
> >  /**
> >   * struct mtk_vcodec_dec_pdata - compatible data for each IC
> >   * @init_vdec_params: init vdec params
> > @@ -379,12 +388,12 @@ struct mtk_vcodec_dec_pdata {
> >  	struct vb2_ops *vdec_vb2_ops;
> >  
> >  	const struct mtk_video_fmt *vdec_formats;
> > -	const int num_formats;
> > +	const int *num_formats;
> >  	const struct mtk_video_fmt *default_out_fmt;
> >  	const struct mtk_video_fmt *default_cap_fmt;
> >  
> >  	const struct mtk_codec_framesizes *vdec_framesizes;
> > -	const int num_framesizes;
> > +	const int *num_framesizes;
> >  
> >  	enum mtk_vdec_hw_arch hw_arch;
> >  
> 
> 



* Re: [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-02-28 21:29   ` Nicolas Dufresne
  2022-03-02  1:47     ` yunfei.dong
@ 2022-06-17  6:46     ` Chen-Yu Tsai
  2022-06-21 15:33       ` Nicolas Dufresne
  1 sibling, 1 reply; 36+ messages in thread
From: Chen-Yu Tsai @ 2022-06-17  6:46 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa, George Sun, Xiaoyong Lu,
	Hsin-Yi Wang, Fritz Koenig, Dafna Hirschfeld, Daniel Vetter,
	dri-devel, Irui Wang, Steve Cho, linux-media, devicetree,
	linux-kernel, linux-arm-kernel, srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Hi,

On Mon, Feb 28, 2022 at 04:29:15PM -0500, Nicolas Dufresne wrote:
> Hi Yunfei,
> 
> this patch does not work unless userland calls enum_framesizes, which is
> completely optional. See comment and suggestion below.
> 
> Le mercredi 23 février 2022 à 11:39 +0800, Yunfei Dong a écrit :
> > Supported max resolution for different platforms are not the same: 2K
> > or 4K, getting it according to dec_capability.
> > 
> > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > Reviewed-by: Tzung-Bi Shih<tzungbi@google.com>
> > ---
> >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 29 +++++++++++--------
> >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 +++
> >  2 files changed, 21 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > index 130ecef2e766..304f5afbd419 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > @@ -445,7 +447,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
> >  		return -EINVAL;
> >  
> >  	q_data->fmt = fmt;
> > -	vidioc_try_fmt(f, q_data->fmt);
> > +	vidioc_try_fmt(ctx, f, q_data->fmt);
> >  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> >  		q_data->sizeimage[0] = pix_mp->plane_fmt[0].sizeimage;
> >  		q_data->coded_width = pix_mp->width;
> > @@ -545,6 +547,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
> >  				fsize->stepwise.min_height,
> >  				fsize->stepwise.max_height,
> >  				fsize->stepwise.step_height);
> > +
> > +		ctx->max_width = fsize->stepwise.max_width;
> > +		ctx->max_height = fsize->stepwise.max_height;
> 
> The spec does not require calling enum_framesizes, so updating the maximum here
> is incorrect (and fails with GStreamer). If userland never enumerates the
> framesizes, the resolution gets limited to 1080p.
> 
> As this only depends on the OUTPUT format and the device being open()
> (the conditions being that dec_capability is set and the OUTPUT format is known
> and not VP8), you could initialize the ctx max inside s_fmt(OUTPUT) instead,
> which is a mandatory call. I have tested this change to verify this:
> 
> 
> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> index 044e3dfbdd8c..3e7c571526a4 100644
> --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> @@ -484,6 +484,14 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
>  	if (fmt == NULL)
>  		return -EINVAL;
>  
> +	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE &&
> +	    !(ctx->dev->dec_capability & VCODEC_CAPABILITY_4K_DISABLED) &&
> +	    fmt->fourcc != V4L2_PIX_FMT_VP8_FRAME) {
> +		mtk_v4l2_debug(3, "4K is enabled");
> +		ctx->max_width = VCODEC_DEC_4K_CODED_WIDTH;
> +		ctx->max_height = VCODEC_DEC_4K_CODED_HEIGHT;
> +	}
> +
>  	q_data->fmt = fmt;
>  	vidioc_try_fmt(ctx, f, q_data->fmt);
>  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> @@ -574,15 +582,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
>  
>  		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
>  		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
> -		if (!(ctx->dev->dec_capability &
> -				VCODEC_CAPABILITY_4K_DISABLED) &&
> -				fsize->pixel_format != V4L2_PIX_FMT_VP8_FRAME) {
> -			mtk_v4l2_debug(3, "4K is enabled");
> -			fsize->stepwise.max_width =
> -					VCODEC_DEC_4K_CODED_WIDTH;
> -			fsize->stepwise.max_height =
> -					VCODEC_DEC_4K_CODED_HEIGHT;
> -		}
> +		fsize->stepwise.max_width = ctx->max_width;
> +		fsize->stepwise.max_height = ctx->max_height;
> +

Recent testing on ChromeOS suggests this doesn't work. The spec implies
that querying capabilities could happen before the output format is set.
Also, supported frame sizes are enumerated per format, and the queried
format may not be the one currently set.

So the if block above has to be reintroduced in some form. I'll take a
look at this.
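
A minimal sketch of what such a reintroduced check could look like, written as
a standalone helper rather than driver code. The names (limits_for_format,
struct frame_limits, the CAP_/FOURCC_ macros) and all numeric values are
hypothetical stand-ins, not the driver's definitions. The point it illustrates
is that the limits are derived from the capability and the queried format only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's constants; all values are illustrative. */
#define CAP_4K_DISABLED   0x10u
#define FOURCC_H264_SLICE 0x1u
#define FOURCC_VP8_FRAME  0x2u

struct frame_limits {
	uint32_t max_width;
	uint32_t max_height;
};

/*
 * Compute the limits from the device capability and the *queried* format only.
 * Nothing here reads or writes per-context state, so the answer is the same
 * whether or not S_FMT has been called, and for formats other than the
 * currently set one.
 */
static struct frame_limits limits_for_format(uint32_t dec_capability,
					     uint32_t fourcc)
{
	struct frame_limits fhd = { 1920, 1088 };
	struct frame_limits uhd = { 4096, 2304 };
	bool allow_4k = !(dec_capability & CAP_4K_DISABLED) &&
			fourcc != FOURCC_VP8_FRAME;

	return allow_4k ? uhd : fhd;
}
```

An enum_framesizes handler could call such a helper with fsize->pixel_format,
so the reported maximum would not depend on the order of the ioctls.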


Regards
ChenYu

>  		mtk_v4l2_debug(1, "%x, %d %d %d %d %d %d",
>  				ctx->dev->dec_capability,
>  				fsize->stepwise.min_width,
> @@ -592,8 +594,6 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
>  				fsize->stepwise.max_height,
>  				fsize->stepwise.step_height);
>  
> -		ctx->max_width = fsize->stepwise.max_width;
> -		ctx->max_height = fsize->stepwise.max_height;
>  		return 0;
>  	}
>  
> 
> 
> >  		return 0;
> >  	}
> >  

[...]



* Re: [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability
  2022-06-17  6:46     ` Chen-Yu Tsai
@ 2022-06-21 15:33       ` Nicolas Dufresne
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Dufresne @ 2022-06-21 15:33 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Yunfei Dong, Alexandre Courbot, Hans Verkuil, Tzung-Bi Shih,
	AngeloGioacchino Del Regno, Benjamin Gaignard, Tiffany Lin,
	Andrew-CT Chen, Mauro Carvalho Chehab, Rob Herring,
	Matthias Brugger, Tomasz Figa, George Sun, Xiaoyong Lu,
	Hsin-Yi Wang, Fritz Koenig, Dafna Hirschfeld, Daniel Vetter,
	dri-devel, Irui Wang, Steve Cho, linux-media, devicetree,
	linux-kernel, linux-arm-kernel, srv_heupstream, linux-mediatek,
	Project_Global_Chrome_Upstream_Group

Le vendredi 17 juin 2022 à 14:46 +0800, Chen-Yu Tsai a écrit :
> Hi,
> 
> On Mon, Feb 28, 2022 at 04:29:15PM -0500, Nicolas Dufresne wrote:
> > Hi Yunfei,
> > 
> > this patch does not work unless userland calls enum_framesizes, which is
> > completely optional. See comment and suggestion below.
> > 
> > Le mercredi 23 février 2022 à 11:39 +0800, Yunfei Dong a écrit :
> > > Supported max resolution for different platforms are not the same: 2K
> > > or 4K, getting it according to dec_capability.
> > > 
> > > Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
> > > Reviewed-by: Tzung-Bi Shih<tzungbi@google.com>
> > > ---
> > >  .../platform/mtk-vcodec/mtk_vcodec_dec.c      | 29 +++++++++++--------
> > >  .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  4 +++
> > >  2 files changed, 21 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > > index 130ecef2e766..304f5afbd419 100644
> > > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > > @@ -445,7 +447,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
> > >  		return -EINVAL;
> > >  
> > >  	q_data->fmt = fmt;
> > > -	vidioc_try_fmt(f, q_data->fmt);
> > > +	vidioc_try_fmt(ctx, f, q_data->fmt);
> > >  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> > >  		q_data->sizeimage[0] = pix_mp->plane_fmt[0].sizeimage;
> > >  		q_data->coded_width = pix_mp->width;
> > > @@ -545,6 +547,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
> > >  				fsize->stepwise.min_height,
> > >  				fsize->stepwise.max_height,
> > >  				fsize->stepwise.step_height);
> > > +
> > > +		ctx->max_width = fsize->stepwise.max_width;
> > > +		ctx->max_height = fsize->stepwise.max_height;
> > 
> > The spec does not require calling enum_framesizes, so updating the maximum here
> > is incorrect (and fails with GStreamer). If userland never enumerates the
> > framesizes, the resolution gets limited to 1080p.
> > 
> > As this only depends on the OUTPUT format and the device being open()
> > (the conditions being that dec_capability is set and the OUTPUT format is known
> > and not VP8), you could initialize the ctx max inside s_fmt(OUTPUT) instead,
> > which is a mandatory call. I have tested this change to verify this:
> > 
> > 
> > diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > index 044e3dfbdd8c..3e7c571526a4 100644
> > --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
> > @@ -484,6 +484,14 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
> >  	if (fmt == NULL)
> >  		return -EINVAL;
> >  
> > +	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE &&
> > +	    !(ctx->dev->dec_capability & VCODEC_CAPABILITY_4K_DISABLED) &&
> > +	    fmt->fourcc != V4L2_PIX_FMT_VP8_FRAME) {
> > +		mtk_v4l2_debug(3, "4K is enabled");
> > +		ctx->max_width = VCODEC_DEC_4K_CODED_WIDTH;
> > +		ctx->max_height = VCODEC_DEC_4K_CODED_HEIGHT;
> > +	}
> > +
> >  	q_data->fmt = fmt;
> >  	vidioc_try_fmt(ctx, f, q_data->fmt);
> >  	if (f->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
> > @@ -574,15 +582,9 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
> >  
> >  		fsize->type = V4L2_FRMSIZE_TYPE_STEPWISE;
> >  		fsize->stepwise = dec_pdata->vdec_framesizes[i].stepwise;
> > -		if (!(ctx->dev->dec_capability &
> > -				VCODEC_CAPABILITY_4K_DISABLED) &&
> > -				fsize->pixel_format != V4L2_PIX_FMT_VP8_FRAME) {
> > -			mtk_v4l2_debug(3, "4K is enabled");
> > -			fsize->stepwise.max_width =
> > -					VCODEC_DEC_4K_CODED_WIDTH;
> > -			fsize->stepwise.max_height =
> > -					VCODEC_DEC_4K_CODED_HEIGHT;
> > -		}
> > +		fsize->stepwise.max_width = ctx->max_width;
> > +		fsize->stepwise.max_height = ctx->max_height;
> > +
> 
> Recent testing on ChromeOS suggests this doesn't work. The spec implies
> that querying capabilities could happen before the output format is set.
> Also, supported frame sizes are enumerated per format, and the queried
> format may not be the one currently set.

In V4L2, a format is always set. Perhaps the problem is that we don't
automatically set ctx->max_width/height for the default format when the firmware
is up. I noticed recently that Chromium always does G_FMT before S_FMT, so perhaps
it skips S_FMT if the default format is appropriate, and that ends up bypassing
the code I suggested. At the time I wrote that, I only had GStreamer
available to test, and it always calls S_FMT, which is mandatory (see 4.5.3.2.
Initialization, step 1). But I cannot say userland would be wrong to skip it if
the format was "initially" correct.

If my understanding is not correct, then perhaps you should provide a tad more
details on how this failed for you, and we can then better judge an appropriate
fix.
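
A minimal sketch of the default-format idea, assuming hypothetical names
(struct dec_ctx, ctx_init_defaults, ctx_set_output_format) and illustrative
values rather than the driver's real ones. It seeds the cached limits as soon
as the capability is known, so a client that only does G_FMT still sees
limits matching the default format:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; names and values are illustrative, not the driver's. */
#define CAP_4K_DISABLED   0x10u
#define FOURCC_H264_SLICE 0x1u
#define FOURCC_VP8_FRAME  0x2u

struct dec_ctx {
	uint32_t dec_capability;
	uint32_t out_fourcc;
	uint32_t max_width;
	uint32_t max_height;
};

/* Recompute the cached limits from the capability and the current OUTPUT format. */
static void ctx_update_limits(struct dec_ctx *ctx)
{
	bool allow_4k = !(ctx->dec_capability & CAP_4K_DISABLED) &&
			ctx->out_fourcc != FOURCC_VP8_FRAME;

	ctx->max_width = allow_4k ? 4096 : 1920;
	ctx->max_height = allow_4k ? 2304 : 1088;
}

/* Called once the capability is known: seed the limits for the default format. */
static void ctx_init_defaults(struct dec_ctx *ctx, uint32_t dec_capability,
			      uint32_t default_fourcc)
{
	ctx->dec_capability = dec_capability;
	ctx->out_fourcc = default_fourcc;
	ctx_update_limits(ctx);
}

/* S_FMT(OUTPUT) path: a format change refreshes the limits via the same helper. */
static void ctx_set_output_format(struct dec_ctx *ctx, uint32_t fourcc)
{
	ctx->out_fourcc = fourcc;
	ctx_update_limits(ctx);
}
```

With this shape, skipping S_FMT leaves the limits consistent with the default
format, and calling it only refreshes them.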

regards,
Nicolas

> 
> So the if block above has to be reintroduced in some form. I'll take a
> look at this.
> 
> 
> Regards
> ChenYu
> 
> >  		mtk_v4l2_debug(1, "%x, %d %d %d %d %d %d",
> >  				ctx->dev->dec_capability,
> >  				fsize->stepwise.min_width,
> > @@ -592,8 +594,6 @@ static int vidioc_enum_framesizes(struct file *file, void *priv,
> >  				fsize->stepwise.max_height,
> >  				fsize->stepwise.step_height);
> >  
> > -		ctx->max_width = fsize->stepwise.max_width;
> > -		ctx->max_height = fsize->stepwise.max_height;
> >  		return 0;
> >  	}
> >  
> > 
> > 
> > >  		return 0;
> > >  	}
> > >  
> 
> [...]
> 



end of thread, other threads:[~2022-06-21 15:35 UTC | newest]

Thread overview: 36+ messages
2022-02-23  3:39 [PATCH v7, 00/15] media: mtk-vcodec: support for M8192 decoder Yunfei Dong
2022-02-23  3:39 ` [PATCH v7, 01/15] media: mtk-vcodec: Add vdec enable/disable hardware helpers Yunfei Dong
2022-02-25  9:23   ` AngeloGioacchino Del Regno
2022-02-23  3:39 ` [PATCH v7, 02/15] media: mtk-vcodec: Using firmware type to separate different firmware architecture Yunfei Dong
2022-02-23  3:39 ` [PATCH v7, 03/15] media: mtk-vcodec: get capture queue buffer size from scp Yunfei Dong
2022-03-01 14:44   ` Nicolas Dufresne
2022-03-02  2:26     ` yunfei.dong
2022-02-23  3:39 ` [PATCH v7, 04/15] media: mtk-vcodec: Read max resolution from dec_capability Yunfei Dong
2022-02-25  9:23   ` AngeloGioacchino Del Regno
2022-02-28 21:29   ` Nicolas Dufresne
2022-03-02  1:47     ` yunfei.dong
2022-06-17  6:46     ` Chen-Yu Tsai
2022-06-21 15:33       ` Nicolas Dufresne
2022-02-23  3:39 ` [PATCH v7, 05/15] media: mtk-vcodec: Call v4l2_m2m_set_dst_buffered() set capture buffer buffered Yunfei Dong
2022-03-01 18:50   ` Nicolas Dufresne
2022-02-23  3:39 ` [PATCH v7, 06/15] media: mtk-vcodec: Refactor get and put capture buffer flow Yunfei Dong
2022-03-01 19:00   ` Nicolas Dufresne
2022-02-23  3:40 ` [PATCH v7, 07/15] media: mtk-vcodec: Refactor supported vdec formats and framesizes Yunfei Dong
2022-02-25  9:24   ` AngeloGioacchino Del Regno
2022-03-01 14:34   ` Nicolas Dufresne
2022-03-04  7:27     ` yunfei.dong
2022-02-23  3:40 ` [PATCH v7, 08/15] media: mtk-vcodec: Add format to support MT21C Yunfei Dong
2022-02-25  9:24   ` AngeloGioacchino Del Regno
2022-02-23  3:40 ` [PATCH v7, 09/15] media: mtk-vcodec: disable vp8 4K capability Yunfei Dong
2022-03-01 19:02   ` Nicolas Dufresne
2022-02-23  3:40 ` [PATCH v7, 10/15] media: mtk-vcodec: Fix v4l2-compliance fail Yunfei Dong
2022-02-23  3:40 ` [PATCH v7, 11/15] media: mtk-vcodec: record capture queue format type Yunfei Dong
2022-02-25  9:24   ` AngeloGioacchino Del Regno
2022-02-23  3:40 ` [PATCH v7, 12/15] media: mtk-vcodec: Extract H264 common code Yunfei Dong
2022-03-01 21:30   ` Nicolas Dufresne
2022-02-23  3:40 ` [PATCH v7, 13/15] media: mtk-vcodec: support stateless H.264 decoding for mt8192 Yunfei Dong
2022-03-01 22:01   ` Nicolas Dufresne
2022-02-23  3:40 ` [PATCH v7, 14/15] media: mtk-vcodec: support stateless VP8 decoding Yunfei Dong
2022-03-01 22:15   ` Nicolas Dufresne
2022-02-23  3:40 ` [PATCH v7, 15/15] media: mtk-vcodec: support stateless VP9 decoding Yunfei Dong
2022-03-01 22:22   ` Nicolas Dufresne
