linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] media: hantro: Add support for H264 decoding
@ 2019-06-19 12:15 Boris Brezillon
  2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
                   ` (9 more replies)
  0 siblings, 10 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

Hello,

This patch series adds support H264 decoding support to the hantro
driver and doing some consilidation cleanup in the driver along the
way.

Some details about the patches forming this patchset:

* The first patch is adding support for the sort_r() variant and has
  been posted separately by Rasmus. I put it back there because Andrew
  told me to repost it with the patch series using this new variant.
  As mentioned in the patch itself, I'd like this patch to be merged
  as soon as possible to avoid the synchronisation burden that might
  appear if we decide to delay it.

* Patch 2 is needed to properly propagate the output buf timestamp to
  the capture buf one, which is required for intra-frame references.

* Patches 3 to 6 are consolidating the code by providing helpers that
  can be used by all hantro backend and simplifying the ctrl
  initialization logic. We also constify the controls array.

* Patches 7 to 8 are adding common H264 decoding bits and patch 9 is
  enabling H264 decoding on rk3288

Now, a few words about the dependencies. Unfortunately there are a lot,
and that'd be great to have some of them merged.

* This series is based on top of Ezequiel's VP8 work [1].
* It depends on [2] which defines/described the H264 decoding mode
  control.
* Depends on [3] since I'm using vb2_get_buffer() to retrieve a
  reference buffer
* The final dep is a fix I sent this morning allowing me to simplify the
  ctrl initialization logic

Regards,

Boris

Boris Brezillon (5):
  media: hantro: Move copy_metadata() before doing a decode operation
  media: hantro: Constify the control array
  media: hantro: Simplify the controls creation logic
  media: hantro: Add hantro_get_{src,dst}_buf() helpers
  media: hantro: Add helpers to prepare/finish a run

Hertz Wong (3):
  media: hantro: Add core bits to support H264 decoding
  media: hantro: Add support for H264 decoding on G1
  media: hantro: Enable H264 decoding on rk3288

Rasmus Villemoes (1):
  lib/sort.c: implement sort() variant taking context argument

 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  24 +-
 drivers/staging/media/hantro/hantro_drv.c     |  95 ++-
 .../staging/media/hantro/hantro_g1_h264_dec.c | 295 ++++++++
 .../media/hantro/hantro_g1_mpeg2_dec.c        |  14 +-
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |  17 +-
 .../staging/media/hantro/hantro_h1_jpeg_enc.c |  11 +-
 drivers/staging/media/hantro/hantro_h264.c    | 638 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  55 ++
 drivers/staging/media/hantro/hantro_v4l2.c    |  15 +-
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |  21 +-
 .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  12 +-
 .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  14 +-
 include/linux/sort.h                          |   5 +
 lib/sort.c                                    |  34 +-
 15 files changed, 1175 insertions(+), 77 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_g1_h264_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_h264.c

-- 
2.20.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-07-25 12:40   ` Boris Brezillon
  2019-07-26  0:05   ` Andrew Morton
  2019-06-19 12:15 ` [PATCH 2/9] media: hantro: Move copy_metadata() before doing a decode operation Boris Brezillon
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Our list_sort() utility has always supported a context argument that
is passed through to the comparison routine. Now there's a use case
for the similar thing for sort().

This implements sort_r by simply extending the existing sort function
in the obvious way. To avoid code duplication, we want to implement
sort() in terms of sort_r(). The naive way to do that is

static int cmp_wrapper(const void *a, const void *b, const void *ctx)
{
  int (*real_cmp)(const void*, const void*) = ctx;
  return real_cmp(a, b);
}

sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }

but this would do two indirect calls for each comparison. Instead, do
as is done for the default swap functions - that only adds a cost of a
single easily predicted branch to each comparison call.

Aside from introducing support for the context argument, this also
serves as preparation for patches that will eliminate the indirect
comparison calls in common cases.

Requested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
Hi all,

Andrew, you acked the first version of this patch, but Rasmus proposed
a better solution and posted a v2. Can you review/ack this version.

Hans, Mauro, Andrew suggested to have this patch applied along with
its first user (the H264 backend of the hantro codec), so here it is.
Note that, if possible, I'd like to have this patch queued for the next
release even if the H264 bits don't get accepted as is. The rationale
here being that Rasmus told me he was planning to further improve the
sort() logic after the next -rc1 is out, and I fear his changes will
conflict with this patch, which might involve some kind synchronisation
(a topic branch) between the media maintainers and Andrew.

Let me know how you want to proceed with that.

Regards,

Boris
---
 include/linux/sort.h |  5 +++++
 lib/sort.c           | 34 ++++++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/include/linux/sort.h b/include/linux/sort.h
index 2b99a5dd073d..61b96d0ebc44 100644
--- a/include/linux/sort.h
+++ b/include/linux/sort.h
@@ -4,6 +4,11 @@
 
 #include <linux/types.h>
 
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp)(const void *, const void *, const void *),
+	    void (*swap)(void *, void *, int),
+	    const void *priv);
+
 void sort(void *base, size_t num, size_t size,
 	  int (*cmp)(const void *, const void *),
 	  void (*swap)(void *, void *, int));
diff --git a/lib/sort.c b/lib/sort.c
index cf408aec3733..d54cf97e9548 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
 		swap_func(a, b, (int)size);
 }
 
+typedef int (*cmp_func_t)(const void *, const void *);
+typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
+#define _CMP_WRAPPER ((cmp_r_func_t)0L)
+
+static int do_cmp(const void *a, const void *b,
+		  cmp_r_func_t cmp, const void *priv)
+{
+	if (cmp == _CMP_WRAPPER)
+		return ((cmp_func_t)(priv))(a, b);
+	return cmp(a, b, priv);
+}
+
 /**
  * parent - given the offset of the child, find the offset of the parent.
  * @i: the offset of the heap element whose parent is sought.  Non-zero.
@@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
 }
 
 /**
- * sort - sort an array of elements
+ * sort_r - sort an array of elements
  * @base: pointer to data to sort
  * @num: number of elements
  * @size: size of each element
  * @cmp_func: pointer to comparison function
  * @swap_func: pointer to swap function or NULL
+ * @priv: third argument passed to comparison function
  *
  * This function does a heapsort on the given array.  You may provide
  * a swap_func function if you need to do something more than a memory
@@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
  * O(n*n) worst-case behavior and extra memory requirements that make
  * it less suitable for kernel use.
  */
-void sort(void *base, size_t num, size_t size,
-	  int (*cmp_func)(const void *, const void *),
-	  void (*swap_func)(void *, void *, int size))
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp_func)(const void *, const void *, const void *),
+	    void (*swap_func)(void *, void *, int size),
+	    const void *priv)
 {
 	/* pre-scale counters for performance */
 	size_t n = num * size, a = (num/2) * size;
@@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
 		 * average, 3/4 worst-case.)
 		 */
 		for (b = a; c = 2*b + size, (d = c + size) < n;)
-			b = cmp_func(base + c, base + d) >= 0 ? c : d;
+			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
 		if (d == n)	/* Special case last leaf with no sibling */
 			b = c;
 
 		/* Now backtrack from "b" to the correct location for "a" */
-		while (b != a && cmp_func(base + a, base + b) >= 0)
+		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
 			b = parent(b, lsbit, size);
 		c = b;			/* Where "a" belongs */
 		while (b != a) {	/* Shift it into place */
@@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
 		}
 	}
 }
+EXPORT_SYMBOL(sort_r);
+
+void sort(void *base, size_t num, size_t size,
+	  int (*cmp_func)(const void *, const void *),
+	  void (*swap_func)(void *, void *, int size))
+{
+	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
+}
 EXPORT_SYMBOL(sort);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/9] media: hantro: Move copy_metadata() before doing a decode operation
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
  2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:15 ` [PATCH 3/9] media: hantro: Constify the control array Boris Brezillon
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

Some decoders use intra slice/frame references. The capture buffer
pointed by these references might be new and thus have invalid
timestamp which prevents the decoder logic from retrieving the
vb2_buffer object based on the output buf timestamp.
Copy all metadata (including the timestamp) before starting the decode
operation.

Suggested-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/hantro_drv.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index ab0aa7408a7d..28b0fed89dcb 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -109,8 +109,6 @@ static void hantro_job_finish(struct hantro_dev *vpu,
 	src->sequence = ctx->sequence_out++;
 	dst->sequence = ctx->sequence_cap++;
 
-	v4l2_m2m_buf_copy_metadata(src, dst, true);
-
 	ret = ctx->buf_finish(ctx, &dst->vb2_buf, bytesused);
 	if (ret)
 		result = VB2_BUF_STATE_ERROR;
@@ -154,8 +152,12 @@ void hantro_watchdog(struct work_struct *work)
 static void device_run(void *priv)
 {
 	struct hantro_ctx *ctx = priv;
+	struct vb2_v4l2_buffer *src, *dst;
 	int ret;
 
+	src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+	dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+
 	ret = clk_bulk_enable(ctx->dev->variant->num_clocks, ctx->dev->clocks);
 	if (ret)
 		goto err_cancel_job;
@@ -163,6 +165,8 @@ static void device_run(void *priv)
 	if (ret < 0)
 		goto err_cancel_job;
 
+	v4l2_m2m_buf_copy_metadata(src, dst, true);
+
 	ctx->codec_ops->run(ctx);
 	return;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/9] media: hantro: Constify the control array
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
  2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
  2019-06-19 12:15 ` [PATCH 2/9] media: hantro: Move copy_metadata() before doing a decode operation Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-07-05 16:05   ` Ezequiel Garcia
  2019-06-19 12:15 ` [PATCH 4/9] media: hantro: Simplify the controls creation logic Boris Brezillon
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

controls[] is not supposed to be modified at runtime, let's make it
explicit by adding a const specifier.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/hantro_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 28b0fed89dcb..db49d643ddb7 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -264,7 +264,7 @@ static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
 	.s_ctrl = hantro_s_ctrl,
 };
 
-static struct hantro_ctrl controls[] = {
+static const struct hantro_ctrl controls[] = {
 	{
 		.id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
 		.codec = HANTRO_JPEG_ENCODER,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/9] media: hantro: Simplify the controls creation logic
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (2 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 3/9] media: hantro: Constify the control array Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:15 ` [PATCH 5/9] media: hantro: Add hantro_get_{src,dst}_buf() helpers Boris Brezillon
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

v4l2_ctrl_new_custom() should work for any kind of control, including
standard ones. With that change, we automatically get support for
menu controls.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/hantro.h     |  2 --
 drivers/staging/media/hantro/hantro_drv.c | 28 +++++++----------------
 2 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index f903e82c7760..cff65707285d 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -109,12 +109,10 @@ enum hantro_codec_mode {
 
 /*
  * struct hantro_ctrl - helper type to declare supported controls
- * @id:		V4L2 control ID (V4L2_CID_xxx)
  * @codec:	codec id this control belong to (HANTRO_JPEG_ENCODER, etc.)
  * @cfg:	control configuration
  */
 struct hantro_ctrl {
-	unsigned int id;
 	unsigned int codec;
 	struct v4l2_ctrl_config cfg;
 };
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index db49d643ddb7..44974aaf25ca 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -266,31 +266,29 @@ static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
 
 static const struct hantro_ctrl controls[] = {
 	{
-		.id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
 		.codec = HANTRO_JPEG_ENCODER,
 		.cfg = {
+			.id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
 			.min = 5,
 			.max = 100,
 			.step = 1,
 			.def = 50,
+			.ops = &hantro_ctrl_ops,
 		},
 	}, {
-		.id = V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
 		.codec = HANTRO_MPEG2_DECODER,
 		.cfg = {
-			.elem_size = sizeof(struct v4l2_ctrl_mpeg2_slice_params),
+			.id = V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
 		},
 	}, {
-		.id = V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
 		.codec = HANTRO_MPEG2_DECODER,
 		.cfg = {
-			.elem_size = sizeof(struct v4l2_ctrl_mpeg2_quantization),
+			.id = V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
 		},
 	}, {
-		.id = V4L2_CID_MPEG_VIDEO_VP8_FRAME_HDR,
 		.codec = HANTRO_VP8_DECODER,
 		.cfg = {
-			.elem_size = sizeof(struct v4l2_ctrl_vp8_frame_header),
+			.id = V4L2_CID_MPEG_VIDEO_VP8_FRAME_HDR,
 		},
 	},
 };
@@ -306,22 +304,12 @@ static int hantro_ctrls_setup(struct hantro_dev *vpu,
 	for (i = 0; i < num_ctrls; i++) {
 		if (!(allowed_codecs & controls[i].codec))
 			continue;
-		if (!controls[i].cfg.elem_size) {
-			v4l2_ctrl_new_std(&ctx->ctrl_handler,
-					  &hantro_ctrl_ops,
-					  controls[i].id, controls[i].cfg.min,
-					  controls[i].cfg.max,
-					  controls[i].cfg.step,
-					  controls[i].cfg.def);
-		} else {
-			controls[i].cfg.id = controls[i].id;
-			v4l2_ctrl_new_custom(&ctx->ctrl_handler,
-					     &controls[i].cfg, NULL);
-		}
 
+		v4l2_ctrl_new_custom(&ctx->ctrl_handler,
+				     &controls[i].cfg, NULL);
 		if (ctx->ctrl_handler.error) {
 			vpu_err("Adding control (%d) failed %d\n",
-				controls[i].id,
+				controls[i].cfg.id,
 				ctx->ctrl_handler.error);
 			v4l2_ctrl_handler_free(&ctx->ctrl_handler);
 			return ctx->ctrl_handler.error;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/9] media: hantro: Add hantro_get_{src,dst}_buf() helpers
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (3 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 4/9] media: hantro: Simplify the controls creation logic Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:15 ` [PATCH 6/9] media: hantro: Add helpers to prepare/finish a run Boris Brezillon
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

And replace all calls to v4l2_m2m_next_{src,dst}_buf() by
hantro_get_{src,dst}_buf() one.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/hantro.h               | 13 +++++++++++++
 drivers/staging/media/hantro/hantro_drv.c           |  4 ++--
 drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c  |  4 ++--
 drivers/staging/media/hantro/hantro_g1_vp8_dec.c    |  8 ++++----
 drivers/staging/media/hantro/hantro_h1_jpeg_enc.c   |  4 ++--
 .../staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c   |  4 ++--
 .../staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c  |  4 ++--
 7 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index cff65707285d..01b5c8a9aea6 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -20,6 +20,7 @@
 #include <media/v4l2-ctrls.h>
 #include <media/v4l2-device.h>
 #include <media/v4l2-ioctl.h>
+#include <media/v4l2-mem2mem.h>
 #include <media/videobuf2-core.h>
 #include <media/videobuf2-dma-contig.h>
 
@@ -351,4 +352,16 @@ bool hantro_is_encoder_ctx(const struct hantro_ctx *ctx);
 void *hantro_get_ctrl(struct hantro_ctx *ctx, u32 id);
 dma_addr_t hantro_get_ref(struct vb2_queue *q, u64 ts);
 
+static inline struct vb2_v4l2_buffer *
+hantro_get_src_buf(struct hantro_ctx *ctx)
+{
+	return v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+}
+
+static inline struct vb2_v4l2_buffer *
+hantro_get_dst_buf(struct hantro_ctx *ctx)
+{
+	return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+}
+
 #endif /* HANTRO_H_ */
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 44974aaf25ca..b95f67ee3dd0 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -155,8 +155,8 @@ static void device_run(void *priv)
 	struct vb2_v4l2_buffer *src, *dst;
 	int ret;
 
-	src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	src = hantro_get_src_buf(ctx);
+	dst = hantro_get_dst_buf(ctx);
 
 	ret = clk_bulk_enable(ctx->dev->variant->num_clocks, ctx->dev->clocks);
 	if (ret)
diff --git a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
index e592c1b66375..55f861e96108 100644
--- a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
@@ -167,8 +167,8 @@ void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 	const struct v4l2_mpeg2_picture *picture;
 	u32 reg;
 
-	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
 
 	/* Apply request controls if any */
 	v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req,
diff --git a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
index b6e1bcf4c071..f406caf589ee 100644
--- a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
@@ -278,7 +278,7 @@ static void cfg_parts(struct hantro_ctx *ctx,
 	u32 count = 0;
 	u32 i;
 
-	vb2_src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+	vb2_src = hantro_get_src_buf(ctx);
 	src_dma = vb2_dma_contig_plane_dma_addr(&vb2_src->vb2_buf, 0);
 
 	/*
@@ -410,7 +410,7 @@ static void cfg_ref(struct hantro_ctx *ctx,
 	struct vb2_v4l2_buffer *vb2_dst;
 	u32 reg;
 
-	vb2_dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	vb2_dst = hantro_get_dst_buf(ctx);
 
 	/* set last frame address */
 	reg = hantro_get_ref(cap_q, hdr->last_frame_ts);
@@ -444,7 +444,7 @@ static void cfg_buffers(struct hantro_ctx *ctx,
 	struct vb2_v4l2_buffer *vb2_dst;
 	u32 reg;
 
-	vb2_dst = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	vb2_dst = hantro_get_dst_buf(ctx);
 
 	/* set probability table buffer address */
 	vdpu_write_relaxed(vpu, ctx->vp8_dec.prob_tbl.dma,
@@ -475,7 +475,7 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 	u32 mb_width, mb_height;
 	u32 reg;
 
-	vb2_src = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+	vb2_src = hantro_get_src_buf(ctx);
 	v4l2_ctrl_request_setup(vb2_src->vb2_buf.req_obj.req,
 				&ctx->ctrl_handler);
 
diff --git a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
index 0c1e3043dc7e..f5adb5cbde50 100644
--- a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
+++ b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
@@ -84,8 +84,8 @@ void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 	struct hantro_jpeg_ctx jpeg_ctx;
 	u32 reg;
 
-	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
 
 	memset(&jpeg_ctx, 0, sizeof(jpeg_ctx));
 	jpeg_ctx.buffer = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
index ae66354d2d93..82c5af822766 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
@@ -116,8 +116,8 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 	struct media_request *src_req;
 	u32 reg;
 
-	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
 
 	src_req = src_buf->vb2_buf.req_obj.req;
 	v4l2_ctrl_request_setup(src_req, &ctx->ctrl_handler);
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
index 8685bddfbcab..451bfcceadba 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
@@ -169,8 +169,8 @@ void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 	const struct v4l2_mpeg2_picture *picture;
 	u32 reg;
 
-	src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
-	dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
 
 	/* Apply request controls if any */
 	v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/9] media: hantro: Add helpers to prepare/finish a run
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (4 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 5/9] media: hantro: Add hantro_get_{src,dst}_buf() helpers Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:15 ` [PATCH 7/9] media: hantro: Add core bits to support H264 decoding Boris Brezillon
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Boris Brezillon

And use them where appropriate.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
We might want to move those hantro_{prepare,finish}_run() calls to
device_run() and have a 2-step approach similar to cedrus (prepare +
trigger) at some point, but let's keep that for later.
---
 drivers/staging/media/hantro/hantro_drv.c     | 22 +++++++++++++++++++
 .../media/hantro/hantro_g1_mpeg2_dec.c        | 10 ++-------
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |  9 ++------
 .../staging/media/hantro/hantro_h1_jpeg_enc.c |  7 ++++--
 drivers/staging/media/hantro/hantro_hw.h      |  2 ++
 .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  8 ++-----
 .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    | 10 ++-------
 7 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index b95f67ee3dd0..86302fb356ae 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -149,6 +149,28 @@ void hantro_watchdog(struct work_struct *work)
 	}
 }
 
+void hantro_prepare_run(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *src_buf;
+
+	src_buf = hantro_get_src_buf(ctx);
+	v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req,
+				&ctx->ctrl_handler);
+}
+
+void hantro_finish_run(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *src_buf;
+
+	src_buf = hantro_get_src_buf(ctx);
+	v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req,
+				   &ctx->ctrl_handler);
+
+	/* Kick the watchdog. */
+	schedule_delayed_work(&ctx->dev->watchdog_work,
+			      msecs_to_jiffies(2000));
+}
+
 static void device_run(void *priv)
 {
 	struct hantro_ctx *ctx = priv;
diff --git a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
index 55f861e96108..80f0e94f8afa 100644
--- a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
@@ -171,8 +171,7 @@ void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 	dst_buf = hantro_get_dst_buf(ctx);
 
 	/* Apply request controls if any */
-	v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req,
-				&ctx->ctrl_handler);
+	hantro_prepare_run(ctx);
 
 	slice_params = hantro_get_ctrl(ctx,
 				       V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS);
@@ -248,12 +247,7 @@ void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 					&dst_buf->vb2_buf,
 					sequence, picture, slice_params);
 
-	/* Controls no longer in-use, we can complete them */
-	v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req,
-				   &ctx->ctrl_handler);
-
-	/* Kick the watchdog and start decoding */
-	schedule_delayed_work(&vpu->watchdog_work, msecs_to_jiffies(2000));
+	hantro_finish_run(ctx);
 
 	reg = G1_REG_DEC_E(1);
 	vdpu_write(vpu, reg, G1_SWREG(1));
diff --git a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
index f406caf589ee..4b1a89dd86a9 100644
--- a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
@@ -476,8 +476,7 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 	u32 reg;
 
 	vb2_src = hantro_get_src_buf(ctx);
-	v4l2_ctrl_request_setup(vb2_src->vb2_buf.req_obj.req,
-				&ctx->ctrl_handler);
+	hantro_prepare_run(ctx);
 
 	hdr = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_VP8_FRAME_HDR);
 	if (WARN_ON(!hdr))
@@ -538,11 +537,7 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 	cfg_ref(ctx, hdr);
 	cfg_buffers(ctx, hdr);
 
-	/* Controls no longer in-use, we can complete them */
-	v4l2_ctrl_request_complete(vb2_src->vb2_buf.req_obj.req,
-				   &ctx->ctrl_handler);
-
-	schedule_delayed_work(&vpu->watchdog_work, msecs_to_jiffies(2000));
+	hantro_finish_run(ctx);
 
 	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
 }
diff --git a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
index f5adb5cbde50..ecd34a7db190 100644
--- a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
+++ b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
@@ -87,6 +87,8 @@ void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 	src_buf = hantro_get_src_buf(ctx);
 	dst_buf = hantro_get_dst_buf(ctx);
 
+	hantro_prepare_run(ctx);
+
 	memset(&jpeg_ctx, 0, sizeof(jpeg_ctx));
 	jpeg_ctx.buffer = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
 	jpeg_ctx.width = ctx->dst_fmt.width;
@@ -119,7 +121,8 @@ void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 		| H1_REG_ENC_CTRL_ENC_MODE_JPEG
 		| H1_REG_ENC_PIC_INTRA
 		| H1_REG_ENC_CTRL_EN_BIT;
-	/* Kick the watchdog and start encoding */
-	schedule_delayed_work(&vpu->watchdog_work, msecs_to_jiffies(2000));
+
+	hantro_finish_run(ctx);
+
 	vepu_write(vpu, reg, H1_REG_ENC_CTRL);
 }
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 7849852affde..34ef24e3a9ef 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -97,6 +97,8 @@ void hantro_watchdog(struct work_struct *work);
 void hantro_run(struct hantro_ctx *ctx);
 void hantro_irq_done(struct hantro_dev *vpu, unsigned int bytesused,
 		     enum vb2_buffer_state result);
+void hantro_prepare_run(struct hantro_ctx *ctx);
+void hantro_finish_run(struct hantro_ctx *ctx);
 
 void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx);
 void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
index 82c5af822766..06162f569b5e 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
@@ -113,14 +113,12 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
 	struct hantro_jpeg_ctx jpeg_ctx;
-	struct media_request *src_req;
 	u32 reg;
 
 	src_buf = hantro_get_src_buf(ctx);
 	dst_buf = hantro_get_dst_buf(ctx);
 
-	src_req = src_buf->vb2_buf.req_obj.req;
-	v4l2_ctrl_request_setup(src_req, &ctx->ctrl_handler);
+	hantro_prepare_run(ctx);
 
 	memset(&jpeg_ctx, 0, sizeof(jpeg_ctx));
 	jpeg_ctx.buffer = vb2_plane_vaddr(&dst_buf->vb2_buf, 0);
@@ -157,9 +155,7 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 		| VEPU_REG_ENCODE_FORMAT_JPEG
 		| VEPU_REG_ENCODE_ENABLE;
 
-	v4l2_ctrl_request_complete(src_req, &ctx->ctrl_handler);
-
 	/* Kick the watchdog and start encoding */
-	schedule_delayed_work(&vpu->watchdog_work, msecs_to_jiffies(2000));
+	hantro_finish_run(ctx);
 	vepu_write(vpu, reg, VEPU_REG_ENCODE_START);
 }
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
index 451bfcceadba..e7ba5c0441cc 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
@@ -172,9 +172,7 @@ void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 	src_buf = hantro_get_src_buf(ctx);
 	dst_buf = hantro_get_dst_buf(ctx);
 
-	/* Apply request controls if any */
-	v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req,
-				&ctx->ctrl_handler);
+	hantro_prepare_run(ctx);
 
 	slice_params = hantro_get_ctrl(ctx,
 				       V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS);
@@ -254,12 +252,8 @@ void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 					 &dst_buf->vb2_buf,
 					 sequence, picture, slice_params);
 
-	/* Controls no longer in-use, we can complete them */
-	v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req,
-				   &ctx->ctrl_handler);
-
 	/* Kick the watchdog and start decoding */
-	schedule_delayed_work(&vpu->watchdog_work, msecs_to_jiffies(2000));
+	hantro_finish_run(ctx);
 
 	reg = vdpu_read(vpu, VDPU_SWREG(57)) | VDPU_REG_DEC_E(1);
 	vdpu_write(vpu, reg, VDPU_SWREG(57));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (5 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 6/9] media: hantro: Add helpers to prepare/finish a run Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-07-25 13:31   ` Rasmus Villemoes
  2019-08-01  4:06   ` Tomasz Figa
  2019-06-19 12:15 ` [PATCH 8/9] media: hantro: Add support for H264 decoding on G1 Boris Brezillon
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Hertz Wong, Boris Brezillon

From: Hertz Wong <hertz.wong@rock-chips.com>

Add helpers and patch hantro_{drv,v4l2}.c to prepare addition of H264
decoding support.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/Makefile      |   1 +
 drivers/staging/media/hantro/hantro.h      |   9 +-
 drivers/staging/media/hantro/hantro_drv.c  |  35 ++
 drivers/staging/media/hantro/hantro_h264.c | 638 +++++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h   |  52 ++
 drivers/staging/media/hantro/hantro_v4l2.c |  15 +-
 6 files changed, 748 insertions(+), 2 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_h264.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index a627aee77f75..d63e25682287 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -9,6 +9,7 @@ hantro-vpu-y += \
 		rk3399_vpu_hw_jpeg_enc.o \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		hantro_jpeg.o \
+		hantro_h264.o \
 		hantro_mpeg2.o \
 		hantro_vp8.o
 
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 01b5c8a9aea6..6ffbd9714d5c 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -26,6 +26,10 @@
 
 #include "hantro_hw.h"
 
+#define H264_MB_DIM			16
+#define H264_MB_WIDTH(w)		DIV_ROUND_UP(w, H264_MB_DIM)
+#define H264_MB_HEIGHT(h)		DIV_ROUND_UP(h, H264_MB_DIM)
+
 #define MPEG2_MB_DIM			16
 #define MPEG2_MB_WIDTH(w)		DIV_ROUND_UP(w, MPEG2_MB_DIM)
 #define MPEG2_MB_HEIGHT(h)		DIV_ROUND_UP(h, MPEG2_MB_DIM)
@@ -39,9 +43,9 @@ struct hantro_codec_ops;
 
 #define HANTRO_JPEG_ENCODER	BIT(0)
 #define HANTRO_ENCODERS		0x0000ffff
-
 #define HANTRO_MPEG2_DECODER	BIT(16)
 #define HANTRO_VP8_DECODER	BIT(17)
+#define HANTRO_H264_DECODER	BIT(18)
 #define HANTRO_DECODERS		0xffff0000
 
 /**
@@ -98,12 +102,14 @@ struct hantro_variant {
  * enum hantro_codec_mode - codec operating mode.
  * @HANTRO_MODE_NONE:  No operating mode. Used for RAW video formats.
  * @HANTRO_MODE_JPEG_ENC: JPEG encoder.
+ * @HANTRO_MODE_H264_DEC: H264 decoder.
  * @HANTRO_MODE_MPEG2_DEC: MPEG-2 decoder.
  * @HANTRO_MODE_VP8_DEC: VP8 decoder.
  */
 enum hantro_codec_mode {
 	HANTRO_MODE_NONE = -1,
 	HANTRO_MODE_JPEG_ENC,
+	HANTRO_MODE_H264_DEC,
 	HANTRO_MODE_MPEG2_DEC,
 	HANTRO_MODE_VP8_DEC,
 };
@@ -242,6 +248,7 @@ struct hantro_ctx {
 
 	/* Specific for particular codec modes. */
 	union {
+		struct hantro_h264_dec_hw_ctx h264_dec;
 		struct hantro_jpeg_enc_hw_ctx jpeg_enc;
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 86302fb356ae..0dec1f52320f 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -312,6 +312,41 @@ static const struct hantro_ctrl controls[] = {
 		.cfg = {
 			.id = V4L2_CID_MPEG_VIDEO_VP8_FRAME_HDR,
 		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS,
+			.dims[0] = V4L2_H264_MAX_SLICES_PER_FRAME,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SPS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_PPS,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX,
+		},
+	}, {
+		.codec = HANTRO_H264_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE,
+			.max = V4L2_MPEG_VIDEO_H264_DECODING_PER_FRAME,
+			.menu_skip_mask = BIT(V4L2_MPEG_VIDEO_H264_DECODING_PER_SLICE),
+			.def = V4L2_MPEG_VIDEO_H264_DECODING_PER_FRAME,
+		},
+	}, {
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
new file mode 100644
index 000000000000..9fec1a0fd807
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_h264.c
@@ -0,0 +1,638 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip RK3288 VPU codec driver
+ *
+ * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
+ *	Hertz Wong <hertz.wong@rock-chips.com>
+ *	Herman Chen <herman.chen@rock-chips.com>
+ *
+ * Copyright (C) 2014 Google, Inc.
+ *	Tomasz Figa <tfiga@chromium.org>
+ */
+
+#include <linux/types.h>
+#include <linux/sort.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+
+/* Size with u32 units. */
+#define CABAC_INIT_BUFFER_SIZE		(460 * 2)
+#define POC_BUFFER_SIZE			34
+#define SCALING_LIST_SIZE		(6 * 16 + 6 * 64)
+
+/* Data structure describing auxiliary buffer format. */
+struct hantro_h264_dec_priv_tbl {
+	u32 cabac_table[CABAC_INIT_BUFFER_SIZE];
+	u32 poc[POC_BUFFER_SIZE];
+	u8 scaling_list[SCALING_LIST_SIZE];
+};
+
+/* Constant CABAC table. */
+static const u32 h264_cabac_table[] = {
+	0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07330000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+	0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948, 0x0d29033e, 0x000b0137,
+	0x0045ef7f, 0xf3660052, 0xf94aeb6b, 0xe57fe17f, 0xe87fee5f, 0xe57feb72,
+	0xe27fef7b, 0xf473f07a, 0xf573f43f, 0xfe44f154, 0xf368fd46, 0xf85df65a,
+	0xe27fff4a, 0xfa61f95b, 0xec7ffc38, 0xfb52f94c, 0xea7df95d, 0xf557fd4d,
+	0xfb47fc3f, 0xfc44f454, 0xf93ef941, 0x083d0538, 0xfe420140, 0x003dfe4e,
+	0x01320734, 0x0a23002c, 0x0b26012d, 0x002e052c, 0x1f110133, 0x07321c13,
+	0x10210e3e, 0xf36cf164, 0xf365f35b, 0xf45ef658, 0xf054f656, 0xf953f357,
+	0xed5e0146, 0x0048fb4a, 0x123bf866, 0xf164005f, 0xfc4b0248, 0xf54bfd47,
+	0x0f2ef345, 0x003e0041, 0x1525f148, 0x09391036, 0x003e0c48, 0x18000f09,
+	0x08190d12, 0x0f090d13, 0x0a250c12, 0x061d1421, 0x0f1e042d, 0x013a003e,
+	0x073d0c26, 0x0b2d0f27, 0x0b2a0d2c, 0x102d0c29, 0x0a311e22, 0x122a0a37,
+	0x1133112e, 0x00591aed, 0x16ef1aef, 0x1ee71cec, 0x21e925e5, 0x21e928e4,
+	0x26ef21f5, 0x28f129fa, 0x26012911, 0x1efa1b03, 0x1a1625f0, 0x23fc26f8,
+	0x26fd2503, 0x26052a00, 0x23102716, 0x0e301b25, 0x153c0c44, 0x0261fd47,
+	0xfa2afb32, 0xfd36fe3e, 0x003a013f, 0xfe48ff4a, 0xf75bfb43, 0xfb1bfd27,
+	0xfe2c002e, 0xf040f844, 0xf64efa4d, 0xf656f45c, 0xf137f63c, 0xfa3efc41,
+	0xf449f84c, 0xf950f758, 0xef6ef561, 0xec54f54f, 0xfa49fc4a, 0xf356f360,
+	0xf561ed75, 0xf84efb21, 0xfc30fe35, 0xfd3ef347, 0xf64ff456, 0xf35af261,
+	0x0000fa5d, 0xfa54f84f, 0x0042ff47, 0x003efe3c, 0xfe3bfb4b, 0xfd3efc3a,
+	0xf742ff4f, 0x00470344, 0x0a2cf93e, 0x0f240e28, 0x101b0c1d, 0x012c1424,
+	0x1220052a, 0x01300a3e, 0x112e0940, 0xf468f561, 0xf060f958, 0xf855f955,
+	0xf755f358, 0x0442fd4d, 0xfd4cfa4c, 0x0a3aff4c, 0xff53f963, 0xf25f025f,
+	0x004cfb4a, 0x0046f54b, 0x01440041, 0xf249033e, 0x043eff44, 0xf34b0b37,
+	0x05400c46, 0x0f060613, 0x07100c0e, 0x120d0d0b, 0x0d0f0f10, 0x0c170d17,
+	0x0f140e1a, 0x0e2c1128, 0x112f1811, 0x15151916, 0x1f1b161d, 0x13230e32,
+	0x0a39073f, 0xfe4dfc52, 0xfd5e0945, 0xf46d24dd, 0x24de20e6, 0x25e22ce0,
+	0x22ee22f1, 0x28f121f9, 0x23fb2100, 0x2602210d, 0x17230d3a, 0x1dfd1a00,
+	0x161e1ff9, 0x23f122fd, 0x220324ff, 0x2205200b, 0x2305220c, 0x270b1e1d,
+	0x221a1d27, 0x13421f15, 0x1f1f1932, 0xef78ec70, 0xee72f555, 0xf15cf259,
+	0xe647f151, 0xf2500044, 0xf246e838, 0xe944e832, 0xf54a17f3, 0x1af328f1,
+	0x31f22c03, 0x2d062c22, 0x21361352, 0xfd4bff17, 0x0122012b, 0x0036fe37,
+	0x003d0140, 0x0044f75c, 0xf26af361, 0xf15af45a, 0xee58f649, 0xf74ff256,
+	0xf649f646, 0xf645fb42, 0xf740fb3a, 0x023b15f6, 0x18f51cf8, 0x1cff1d03,
+	0x1d092314, 0x1d240e43, 0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968,
+	0xfa35ff36, 0x07331721, 0x17021500, 0x01090031, 0xdb760539, 0xf34ef541,
+	0x013e0c31, 0xfc491132, 0x1240092b, 0x1d001a43, 0x105a0968, 0xd27fec68,
+	0x0143f34e, 0xf541013e, 0xfa56ef5f, 0xfa3d092d, 0xfd45fa51, 0xf5600637,
+	0x0743fb56, 0x0258003a, 0xfd4cf65e, 0x05360445, 0xfd510058, 0xf943fb4a,
+	0xfc4afb50, 0xf948013a, 0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948,
+	0x0d29033e, 0x002dfc4e, 0xfd60e57e, 0xe462e765, 0xe943e452, 0xec5ef053,
+	0xea6eeb5b, 0xee66f35d, 0xe37ff95c, 0xfb59f960, 0xf36cfd2e, 0xff41ff39,
+	0xf75dfd4a, 0xf75cf857, 0xe97e0536, 0x063c063b, 0x0645ff30, 0x0044fc45,
+	0xf858fe55, 0xfa4eff4b, 0xf94d0236, 0x0532fd44, 0x0132062a, 0xfc51013f,
+	0xfc460043, 0x0239fe4c, 0x0b230440, 0x013d0b23, 0x12190c18, 0x0d1d0d24,
+	0xf65df949, 0xfe490d2e, 0x0931f964, 0x09350235, 0x0535fe3d, 0x00380038,
+	0xf33ffb3c, 0xff3e0439, 0xfa450439, 0x0e270433, 0x0d440340, 0x013d093f,
+	0x07321027, 0x052c0434, 0x0b30fb3c, 0xff3b003b, 0x1621052c, 0x0e2bff4e,
+	0x003c0945, 0x0b1c0228, 0x032c0031, 0x002e022c, 0x0233002f, 0x0427023e,
+	0x062e0036, 0x0336023a, 0x043f0633, 0x06390735, 0x06340637, 0x0b2d0e24,
+	0x0835ff52, 0x0737fd4e, 0x0f2e161f, 0xff541907, 0x1ef91c03, 0x1c042000,
+	0x22ff1e06, 0x1e062009, 0x1f131a1b, 0x1a1e2514, 0x1c221146, 0x0143053b,
+	0x0943101e, 0x12201223, 0x161d181f, 0x1726122b, 0x14290b3f, 0x093b0940,
+	0xff5efe59, 0xf76cfa4c, 0xfe2c002d, 0x0034fd40, 0xfe3bfc46, 0xfc4bf852,
+	0xef66f74d, 0x0318002a, 0x00300037, 0xfa3bf947, 0xf453f557, 0xe277013a,
+	0xfd1dff24, 0x0126022b, 0xfa37003a, 0x0040fd4a, 0xf65a0046, 0xfc1d051f,
+	0x072a013b, 0xfe3afd48, 0xfd51f561, 0x003a0805, 0x0a0e0e12, 0x0d1b0228,
+	0x003afd46, 0xfa4ff855, 0x0000f36a, 0xf06af657, 0xeb72ee6e, 0xf262ea6e,
+	0xeb6aee67, 0xeb6be96c, 0xe670f660, 0xf45ffb5b, 0xf75dea5e, 0xfb560943,
+	0xfc50f655, 0xff46073c, 0x093a053d, 0x0c320f32, 0x12311136, 0x0a29072e,
+	0xff330731, 0x08340929, 0x062f0237, 0x0d290a2c, 0x06320535, 0x0d31043f,
+	0x0640fe45, 0xfe3b0646, 0x0a2c091f, 0x0c2b0335, 0x0e220a26, 0xfd340d28,
+	0x1120072c, 0x07260d32, 0x0a391a2b, 0x0e0b0b0e, 0x090b120b, 0x150917fe,
+	0x20f120f1, 0x22eb27e9, 0x2adf29e1, 0x2ee426f4, 0x151d2de8, 0x35d330e6,
+	0x41d52bed, 0x27f61e09, 0x121a141b, 0x0039f252, 0xfb4bed61, 0xdd7d1b00,
+	0x1c001ffc, 0x1b062208, 0x1e0a1816, 0x21131620, 0x1a1f1529, 0x1a2c172f,
+	0x10410e47, 0x083c063f, 0x11411518, 0x17141a17, 0x1b201c17, 0x1c181728,
+	0x18201c1d, 0x172a1339, 0x1635163d, 0x0b560c28, 0x0b330e3b, 0xfc4ff947,
+	0xfb45f746, 0xf842f644, 0xed49f445, 0xf046f143, 0xec3eed46, 0xf042ea41,
+	0xec3f09fe, 0x1af721f7, 0x27f929fe, 0x2d033109, 0x2d1b243b, 0xfa42f923,
+	0xf92af82d, 0xfb30f438, 0xfa3cfb3e, 0xf842f84c, 0xfb55fa51, 0xf64df951,
+	0xef50ee49, 0xfc4af653, 0xf747f743, 0xff3df842, 0xf242003b, 0x023b15f3,
+	0x21f227f9, 0x2efe3302, 0x3c063d11, 0x37222a3e, 0x14f10236, 0x034a14f1,
+	0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07331619, 0x22001000, 0xfe090429,
+	0xe3760241, 0xfa47f34f, 0x05340932, 0xfd460a36, 0x1a221316, 0x28003902,
+	0x29241a45, 0xd37ff165, 0xfc4cfa47, 0xf34f0534, 0x0645f35a, 0x0034082b,
+	0xfe45fb52, 0xf660023b, 0x024bfd57, 0xfd640138, 0xfd4afa55, 0x003bfd51,
+	0xf956fb5f, 0xff42ff4d, 0x0146fe56, 0xfb48003d, 0x0029003f, 0x003f003f,
+	0xf7530456, 0x0061f948, 0x0d29033e, 0x0d0f0733, 0x0250d97f, 0xee5bef60,
+	0xe651dd62, 0xe866e961, 0xe577e863, 0xeb6eee66, 0xdc7f0050, 0xfb59f95e,
+	0xfc5c0027, 0x0041f154, 0xdd7ffe49, 0xf468f75b, 0xe17f0337, 0x07380737,
+	0x083dfd35, 0x0044f94a, 0xf758f367, 0xf35bf759, 0xf25cf84c, 0xf457e96e,
+	0xe869f64e, 0xec70ef63, 0xb27fba7f, 0xce7fd27f, 0xfc42fb4e, 0xfc47f848,
+	0x023bff37, 0xf946fa4b, 0xf859de77, 0xfd4b2014, 0x1e16d47f, 0x0036fb3d,
+	0x003aff3c, 0xfd3df843, 0xe754f24a, 0xfb410534, 0x0239003d, 0xf745f546,
+	0x1237fc47, 0x003a073d, 0x09291219, 0x0920052b, 0x092f002c, 0x0033022e,
+	0x1326fc42, 0x0f260c2a, 0x09220059, 0x042d0a1c, 0x0a1f21f5, 0x34d5120f,
+	0x1c0023ea, 0x26e72200, 0x27ee20f4, 0x66a20000, 0x38f121fc, 0x1d0a25fb,
+	0x33e327f7, 0x34de45c6, 0x43c12cfb, 0x200737e3, 0x20010000, 0x1b2421e7,
+	0x22e224e4, 0x26e426e5, 0x22ee23f0, 0x22f220f8, 0x25fa2300, 0x1e0a1c12,
+	0x1a191d29, 0x004b0248, 0x084d0e23, 0x121f1123, 0x151e112d, 0x142a122d,
+	0x1b1a1036, 0x07421038, 0x0b490a43, 0xf674e970, 0xf147f93d, 0x0035fb42,
+	0xf54df750, 0xf754f657, 0xde7feb65, 0xfd27fb35, 0xf93df54b, 0xf14def5b,
+	0xe76be76f, 0xe47af54c, 0xf62cf634, 0xf639f73a, 0xf048f945, 0xfc45fb4a,
+	0xf7560242, 0xf7220120, 0x0b1f0534, 0xfe37fe43, 0x0049f859, 0x03340704,
+	0x0a081108, 0x10130325, 0xff3dfb49, 0xff46fc4e, 0x0000eb7e, 0xe97cec6e,
+	0xe67ee77c, 0xef69e579, 0xe575ef66, 0xe675e574, 0xdf7af65f, 0xf264f85f,
+	0xef6fe472, 0xfa59fe50, 0xfc52f755, 0xf851ff48, 0x05400143, 0x09380045,
+	0x01450745, 0xf945fa43, 0xf04dfe40, 0x023dfa43, 0xfd400239, 0xfd41fd42,
+	0x003e0933, 0xff42fe47, 0xfe4bff46, 0xf7480e3c, 0x1025002f, 0x12230b25,
+	0x0c290a29, 0x02300c29, 0x0d29003b, 0x03321328, 0x03421232, 0x13fa12fa,
+	0x0e001af4, 0x1ff021e7, 0x21ea25e4, 0x27e22ae2, 0x2fd62ddc, 0x31de29ef,
+	0x200945b9, 0x3fc142c0, 0x4db636d9, 0x34dd29f6, 0x240028ff, 0x1e0e1c1a,
+	0x17250c37, 0x0b4125df, 0x27dc28db, 0x26e22edf, 0x2ae228e8, 0x31e326f4,
+	0x28f626fd, 0x2efb1f14, 0x1d1e192c, 0x0c300b31, 0x1a2d1616, 0x17161b15,
+	0x21141a1c, 0x1e181b22, 0x122a1927, 0x12320c46, 0x15360e47, 0x0b531920,
+	0x15311536, 0xfb55fa51, 0xf64df951, 0xef50ee49, 0xfc4af653, 0xf747f743,
+	0xff3df842, 0xf242003b, 0x023b11f6, 0x20f32af7, 0x31fb3500, 0x4003440a,
+	0x421b2f39, 0xfb470018, 0xff24fe2a, 0xfe34f739, 0xfa3ffc41, 0xfc43f952,
+	0xfd51fd4c, 0xf948fa4e, 0xf448f244, 0xfd46fa4c, 0xfb42fb3e, 0x0039fc3d,
+	0xf73c0136, 0x023a11f6, 0x20f32af7, 0x31fb3500, 0x4003440a, 0x421b2f39,
+	0x14f10236, 0x034a14f1, 0x0236034a, 0xe47fe968, 0xfa35ff36, 0x07331d10,
+	0x19000e00, 0xf633fd3e, 0xe5631a10, 0xfc55e866, 0x05390639, 0xef490e39,
+	0x1428140a, 0x1d003600, 0x252a0c61, 0xe07fea75, 0xfe4afc55, 0xe8660539,
+	0xfa5df258, 0xfa2c0437, 0xf559f167, 0xeb741339, 0x143a0454, 0x0660013f,
+	0xfb55f36a, 0x053f064b, 0xfd5aff65, 0x0337fc4f, 0xfe4bf461, 0xf932013c,
+	0x0029003f, 0x003f003f, 0xf7530456, 0x0061f948, 0x0d29033e, 0x0722f758,
+	0xec7fdc7f, 0xef5bf25f, 0xe754e756, 0xf459ef5b, 0xe17ff24c, 0xee67f35a,
+	0xdb7f0b50, 0x054c0254, 0x054efa37, 0x043df253, 0xdb7ffb4f, 0xf568f55b,
+	0xe27f0041, 0xfe4f0048, 0xfc5cfa38, 0x0344f847, 0xf362fc56, 0xf458fb52,
+	0xfd48fc43, 0xf848f059, 0xf745ff3b, 0x05420439, 0xfc47fe47, 0x023aff4a,
+	0xfc2cff45, 0x003ef933, 0xfc2ffa2a, 0xfd29fa35, 0x084cf74e, 0xf5530934,
+	0x0043fb5a, 0x0143f148, 0xfb4bf850, 0xeb53eb40, 0xf31fe740, 0xe35e094b,
+	0x113ff84a, 0xfb23fe1b, 0x0d5b0341, 0xf945084d, 0xf642033e, 0xfd44ec51,
+	0x001e0107, 0xfd17eb4a, 0x1042e97c, 0x11252cee, 0x32deea7f, 0x0427002a,
+	0x07220b1d, 0x081f0625, 0x072a0328, 0x08210d2b, 0x0d24042f, 0x0337023a,
+	0x063c082c, 0x0b2c0e2a, 0x07300438, 0x04340d25, 0x0931133a, 0x0a300c2d,
+	0x00451421, 0x083f23ee, 0x21e71cfd, 0x180a1b00, 0x22f234d4, 0x27e81311,
+	0x1f19241d, 0x1821220f, 0x1e141649, 0x1422131f, 0x1b2c1310, 0x0f240f24,
+	0x151c1915, 0x1e141f0c, 0x1b10182a, 0x005d0e38, 0x0f391a26, 0xe87fe873,
+	0xea52f73e, 0x0035003b, 0xf255f359, 0xf35ef55c, 0xe37feb64, 0xf239f443,
+	0xf547f64d, 0xeb55f058, 0xe968f162, 0xdb7ff652, 0xf830f83d, 0xf842f946,
+	0xf24bf64f, 0xf753f45c, 0xee6cfc4f, 0xea45f04b, 0xfe3a013a, 0xf34ef753,
+	0xfc51f363, 0xf351fa26, 0xf33efa3a, 0xfe3bf049, 0xf64cf356, 0xf753f657,
+	0x0000ea7f, 0xe77fe778, 0xe57fed72, 0xe975e776, 0xe675e871, 0xe476e178,
+	0xdb7cf65e, 0xf166f663, 0xf36ace7f, 0xfb5c1139, 0xfb56f35e, 0xf45bfe4d,
+	0x0047ff49, 0x0440f951, 0x05400f39, 0x01430044, 0xf6430144, 0x004d0240,
+	0x0044fb4e, 0x0737053b, 0x02410e36, 0x0f2c053c, 0x0246fe4c, 0xee560c46,
+	0x0540f446, 0x0b370538, 0x00450241, 0xfa4a0536, 0x0736fa4c, 0xf552fe4d,
+	0xfe4d192a, 0x11f310f7, 0x11f41beb, 0x25e229d8, 0x2ad730d1, 0x27e02ed8,
+	0x34cd2ed7, 0x34d92bed, 0x200b3dc9, 0x38d23ece, 0x51bd2dec, 0x23fe1c0f,
+	0x22012701, 0x1e111426, 0x122d0f36, 0x004f24f0, 0x25f225ef, 0x2001220f,
+	0x1d0f1819, 0x22161f10, 0x23121f1c, 0x2129241c, 0x1b2f153e, 0x121f131a,
+	0x24181817, 0x1b10181e, 0x1f1d1629, 0x162a103c, 0x0f340e3c, 0x034ef07b,
+	0x15351638, 0x193d1521, 0x1332113d, 0xfd4ef84a, 0xf748f648, 0xee4bf447,
+	0xf53ffb46, 0xef4bf248, 0xf043f835, 0xf23bf734, 0xf54409fe, 0x1ef61ffc,
+	0x21ff2107, 0x1f0c2517, 0x1f261440, 0xf747f925, 0xf82cf531, 0xf638f43b,
+	0xf83ff743, 0xfa44f64f, 0xfd4ef84a, 0xf748f648, 0xee4bf447, 0xf53ffb46,
+	0xef4bf248, 0xf043f835, 0xf23bf734, 0xf54409fe, 0x1ef61ffc, 0x21ff2107,
+	0x1f0c2517, 0x1f261440
+};
+
+/*
+ * NOTE: The scaling lists are in zig-zag order, apply inverse scanning process
+ * to get the values in matrix order. In addition, the hardware requires bytes
+ * swapped within each subsequent 4 bytes. Both arrays below include both
+ * transformations.
+ */
+static const u32 zig_zag_4x4[] = {
+	3, 2, 7, 11, 6, 1, 0, 5, 10, 15, 14, 9, 4, 8, 13, 12
+};
+
+static const u32 zig_zag_8x8[] = {
+	3, 2, 11, 19, 10, 1, 0, 9, 18, 27, 35, 26, 17, 8, 7, 6,
+	15, 16, 25, 34, 43, 51, 42, 33, 24, 23, 14, 5, 4, 13, 22, 31,
+	32, 41, 50, 59, 58, 49, 40, 39, 30, 21, 12, 20, 29, 38, 47, 48,
+	57, 56, 55, 46, 37, 28, 36, 45, 54, 63, 62, 53, 44, 52, 61, 60
+};
+
+static void
+reorder_scaling_list(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling = ctrls->scaling;
+	const size_t num_list_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4);
+	const size_t list_len_4x4 = ARRAY_SIZE(scaling->scaling_list_4x4[0]);
+	const size_t num_list_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8);
+	const size_t list_len_8x8 = ARRAY_SIZE(scaling->scaling_list_8x8[0]);
+	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
+	u8 *dst = tbl->scaling_list;
+	const u8 *src;
+	int i, j;
+
+	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_4x4) != list_len_4x4);
+	BUILD_BUG_ON(ARRAY_SIZE(zig_zag_8x8) != list_len_8x8);
+	BUILD_BUG_ON(ARRAY_SIZE(tbl->scaling_list) !=
+		     num_list_4x4 * list_len_4x4 +
+		     num_list_8x8 * list_len_8x8);
+
+	src = &scaling->scaling_list_4x4[0][0];
+	for (i = 0; i < num_list_4x4; ++i) {
+		for (j = 0; j < list_len_4x4; ++j)
+			dst[zig_zag_4x4[j]] = src[j];
+		src += list_len_4x4;
+		dst += list_len_4x4;
+	}
+
+	src = &scaling->scaling_list_8x8[0][0];
+	for (i = 0; i < num_list_8x8; ++i) {
+		for (j = 0; j < list_len_8x8; ++j)
+			dst[zig_zag_8x8[j]] = src[j];
+		src += list_len_8x8;
+		dst += list_len_8x8;
+	}
+}
+
+static void prepare_table(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+	struct hantro_h264_dec_priv_tbl *tbl = ctx->h264_dec.priv.cpu;
+	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	int i;
+
+	/*
+	 * Prepare auxiliary buffer.
+	 *
+	 * TODO: The CABAC table never changes, but maybe it would be better
+	 * to have it as a control, which is set by userspace once?
+	 */
+	memcpy(tbl->cabac_table, h264_cabac_table, sizeof(tbl->cabac_table));
+
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
+		tbl->poc[i * 2] = dpb[i].top_field_order_cnt;
+		tbl->poc[i * 2 + 1] = dpb[i].bottom_field_order_cnt;
+	}
+
+	tbl->poc[32] = dec_param->top_field_order_cnt;
+	tbl->poc[33] = dec_param->bottom_field_order_cnt;
+
+	reorder_scaling_list(ctx);
+}
+
+struct hantro_h264_reflist_builder {
+	const struct v4l2_h264_dpb_entry *dpb;
+	s32 pocs[HANTRO_H264_DPB_SIZE];
+	u8 unordered_reflist[HANTRO_H264_DPB_SIZE];
+	s32 curpoc;
+	u8 num_valid;
+};
+
+static s32 get_poc(enum v4l2_field field, s32 top_field_order_cnt,
+		   s32 bottom_field_order_cnt)
+{
+	switch (field) {
+	case V4L2_FIELD_TOP:
+		return top_field_order_cnt;
+
+	case V4L2_FIELD_BOTTOM:
+		return bottom_field_order_cnt;
+
+	default:
+		break;
+	}
+
+	return min(top_field_order_cnt, bottom_field_order_cnt);
+}
+
+static void
+init_reflist_builder(struct hantro_ctx *ctx,
+		     struct hantro_h264_reflist_builder *b)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	struct vb2_v4l2_buffer *buf = hantro_get_dst_buf(ctx);
+	const struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
+	unsigned int i;
+
+	dec_param = ctx->h264_dec.ctrls.decode;
+
+	memset(b, 0, sizeof(*b));
+	b->dpb = dpb;
+	b->curpoc = get_poc(buf->field, dec_param->top_field_order_cnt,
+			    dec_param->bottom_field_order_cnt);
+
+	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
+		int buf_idx;
+
+		if (!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		buf_idx = vb2_find_timestamp(cap_q, dpb[i].reference_ts, 0);
+		if (buf_idx < 0)
+			continue;
+
+		buf = to_vb2_v4l2_buffer(vb2_get_buffer(cap_q, buf_idx));
+		b->pocs[i] = get_poc(buf->field, dpb[i].top_field_order_cnt,
+				     dpb[i].bottom_field_order_cnt);
+		b->unordered_reflist[b->num_valid] = i;
+		b->num_valid++;
+	}
+
+	for (i = b->num_valid; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
+		b->unordered_reflist[i] = i;
+}
+
+static int p_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/*
+	 * Short term pics in descending pic num order, long term ones in
+	 * ascending order.
+	 */
+	if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+		return b->frame_num - a->frame_num;
+
+	return a->pic_num - b->pic_num;
+}
+
+static int b0_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	s32 poca, pocb;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/* Long term pics in ascending pic num order. */
+	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+		return a->pic_num - b->pic_num;
+
+	poca = builder->pocs[idxa];
+	pocb = builder->pocs[idxb];
+
+	/*
+	 * Short term pics with POC < cur POC first in POC descending order
+	 * followed by short term pics with POC > cur POC in POC ascending
+	 * order.
+	 */
+	if ((poca < builder->curpoc) != (pocb < builder->curpoc))
+		return poca - pocb;
+	else if (poca < builder->curpoc)
+		return pocb - poca;
+
+	return poca - pocb;
+}
+
+static int b1_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
+{
+	const struct hantro_h264_reflist_builder *builder = data;
+	const struct v4l2_h264_dpb_entry *a, *b;
+	s32 poca, pocb;
+	u8 idxa, idxb;
+
+	idxa = *((u8 *)ptra);
+	idxb = *((u8 *)ptrb);
+	a = &builder->dpb[idxa];
+	b = &builder->dpb[idxb];
+
+	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
+	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
+		/* Short term pics firt. */
+		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
+			return -1;
+		else
+			return 1;
+	}
+
+	/* Long term pics in ascending pic num order. */
+	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+		return a->pic_num - b->pic_num;
+
+	poca = builder->pocs[idxa];
+	pocb = builder->pocs[idxb];
+
+	/*
+	 * Short term pics with POC > cur POC first in POC ascending order
+	 * followed by short term pics with POC > cur POC in POC descending
+	 * order.
+	 */
+	if ((poca < builder->curpoc) != (pocb < builder->curpoc))
+		return pocb - poca;
+	else if (poca < builder->curpoc)
+		return pocb - poca;
+
+	return poca - pocb;
+}
+
+static void
+build_p_ref_list(const struct hantro_h264_reflist_builder *builder,
+		 u8 *reflist)
+{
+	memcpy(reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(reflist, builder->num_valid, sizeof(*reflist),
+	       p_ref_list_cmp, NULL, builder);
+}
+
+static void
+build_b_ref_lists(const struct hantro_h264_reflist_builder *builder,
+		  u8 *b0_reflist, u8 *b1_reflist)
+{
+	memcpy(b0_reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(b0_reflist, builder->num_valid, sizeof(*b0_reflist),
+	       b0_ref_list_cmp, NULL, builder);
+
+	memcpy(b1_reflist, builder->unordered_reflist,
+	       sizeof(builder->unordered_reflist));
+	sort_r(b1_reflist, builder->num_valid, sizeof(*b1_reflist),
+	       b1_ref_list_cmp, NULL, builder);
+}
+
+static bool dpb_entry_match(const struct v4l2_h264_dpb_entry *a,
+			    const struct v4l2_h264_dpb_entry *b)
+{
+	return a->top_field_order_cnt == b->top_field_order_cnt &&
+	       a->bottom_field_order_cnt == b->bottom_field_order_cnt;
+}
+
+static void update_dpb(struct hantro_ctx *ctx)
+{
+	const struct v4l2_ctrl_h264_decode_params *dec_param;
+	DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
+	unsigned int i, j;
+
+	dec_param = ctx->h264_dec.ctrls.decode;
+
+	/* Disable all entries by default. */
+	for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
+		ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
+
+	/* Try to match new DPB entries with existing ones by their POCs. */
+	for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+
+		if (!(ndpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+			continue;
+
+		/*
+		 * To cut off some comparisons, iterate only on target DPB
+		 * entries which are not used yet.
+		 */
+		for_each_clear_bit(j, used, ARRAY_SIZE(ctx->h264_dec.dpb)) {
+			struct v4l2_h264_dpb_entry *cdpb;
+
+			cdpb = &ctx->h264_dec.dpb[j];
+			if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
+			    !dpb_entry_match(cdpb, ndpb))
+				continue;
+
+			*cdpb = *ndpb;
+			set_bit(j, used);
+			break;
+		}
+
+		if (j == ARRAY_SIZE(ctx->h264_dec.dpb))
+			set_bit(i, new);
+	}
+
+	/* For entries that could not be matched, use remaining free slots. */
+	for_each_set_bit(i, new, ARRAY_SIZE(dec_param->dpb)) {
+		const struct v4l2_h264_dpb_entry *ndpb = &dec_param->dpb[i];
+		struct v4l2_h264_dpb_entry *cdpb;
+
+		/*
+		 * Both arrays are of the same sizes, so there is no way
+		 * we can end up with no space in target array, unless
+		 * something is buggy.
+		 */
+		j = find_first_zero_bit(used, ARRAY_SIZE(ctx->h264_dec.dpb));
+		if (WARN_ON(j >= ARRAY_SIZE(ctx->h264_dec.dpb)))
+			return;
+
+		cdpb = &ctx->h264_dec.dpb[j];
+		*cdpb = *ndpb;
+		set_bit(j, used);
+	}
+}
+
+struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
+					   unsigned int dpb_idx)
+{
+	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
+	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	struct vb2_buffer *buf;
+	int buf_idx = -1;
+
+	if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+		buf_idx = vb2_find_timestamp(cap_q,
+					     dpb[dpb_idx].reference_ts, 0);
+
+	if (buf_idx >= 0) {
+		buf = vb2_get_buffer(cap_q, buf_idx);
+	} else {
+		struct vb2_v4l2_buffer *dst_buf;
+
+		/*
+		 * If a DPB entry is unused or invalid, address of current
+		 * destination buffer is returned.
+		 */
+		dst_buf = hantro_get_dst_buf(ctx);
+		buf = &dst_buf->vb2_buf;
+	}
+
+	return buf;
+}
+
+int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
+{
+	struct hantro_h264_dec_hw_ctx *h264_ctx = &ctx->h264_dec;
+	struct hantro_h264_dec_ctrls *ctrls = &h264_ctx->ctrls;
+	struct hantro_h264_reflist_builder reflist_builder;
+
+	hantro_prepare_run(ctx);
+
+	ctrls->scaling = hantro_get_ctrl(ctx,
+				V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX);
+	if (WARN_ON(!ctrls->scaling))
+		return -EINVAL;
+
+	ctrls->decode = hantro_get_ctrl(ctx,
+				V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS);
+	if (WARN_ON(!ctrls->decode))
+		return -EINVAL;
+
+	ctrls->slices = hantro_get_ctrl(ctx,
+					V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS);
+	if (WARN_ON(!ctrls->slices))
+		return -EINVAL;
+
+	ctrls->sps = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_SPS);
+	if (WARN_ON(!ctrls->sps))
+		return -EINVAL;
+
+	ctrls->pps = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_H264_PPS);
+	if (WARN_ON(!ctrls->pps))
+		return -EINVAL;
+
+	/* Prepare data in memory. */
+	prepare_table(ctx);
+
+	/* Update the DPB with new refs. */
+	update_dpb(ctx);
+
+	/* Build the P/B{0,1} ref lists. */
+	init_reflist_builder(ctx, &reflist_builder);
+	build_p_ref_list(&reflist_builder, h264_ctx->reflists.p);
+	build_b_ref_lists(&reflist_builder, h264_ctx->reflists.b0,
+			  h264_ctx->reflists.b1);
+	return 0;
+}
+
+void hantro_h264_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_h264_dec_hw_ctx *h264_dec = &ctx->h264_dec;
+	struct hantro_aux_buf *priv = &h264_dec->priv;
+
+	dma_free_coherent(vpu->dev, priv->size, priv->cpu, priv->dma);
+}
+
+int hantro_h264_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_h264_dec_hw_ctx *h264_dec = &ctx->h264_dec;
+	struct hantro_aux_buf *priv = &h264_dec->priv;
+	struct hantro_h264_dec_priv_tbl *tbl;
+
+	priv->cpu = dma_alloc_coherent(vpu->dev, sizeof(*tbl), &priv->dma,
+				       GFP_KERNEL);
+	if (!priv->cpu)
+		return -ENOMEM;
+
+	priv->size = sizeof(*tbl);
+	tbl = priv->cpu;
+	memcpy(tbl->cabac_table, h264_cabac_table, sizeof(*tbl->cabac_table));
+
+	return 0;
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 34ef24e3a9ef..cc4fbeb9011f 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -11,6 +11,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/v4l2-controls.h>
+#include <media/h264-ctrls.h>
 #include <media/mpeg2-ctrls.h>
 #include <media/vp8-ctrls.h>
 #include <media/videobuf2-core.h>
@@ -40,6 +41,51 @@ struct hantro_jpeg_enc_hw_ctx {
 	struct hantro_aux_buf bounce_buffer;
 };
 
+/* Max. number of entries in the DPB (HW limitation). */
+#define HANTRO_H264_DPB_SIZE		16
+
+/**
+ * struct hantro_h264_dec_ctrls
+ * @decode:	Decode params
+ * @scaling:	Scaling info
+ * @slice:	Slice params
+ * @sps:	SPS info
+ * @pps:	PPS info
+ */
+struct hantro_h264_dec_ctrls {
+	const struct v4l2_ctrl_h264_decode_params *decode;
+	const struct v4l2_ctrl_h264_scaling_matrix *scaling;
+	const struct v4l2_ctrl_h264_slice_params *slices;
+	const struct v4l2_ctrl_h264_sps *sps;
+	const struct v4l2_ctrl_h264_pps *pps;
+};
+
+/**
+ * struct hantro_h264_dec_reflists
+ * @p:		P reflist
+ * @b0:		B0 reflist
+ * @b1:		B1 reflist
+ */
+struct hantro_h264_dec_reflists {
+	u8 p[HANTRO_H264_DPB_SIZE];
+	u8 b0[HANTRO_H264_DPB_SIZE];
+	u8 b1[HANTRO_H264_DPB_SIZE];
+};
+
+/**
+ * struct hantro_h264_dec_hw_ctx
+ * @priv:	Private auxiliary buffer for hardware.
+ * @dpb:	DPB
+ * @reflists:	P/B0/B1 reflists
+ * @ctrls:	V4L2 controls attached to a run
+ */
+struct hantro_h264_dec_hw_ctx {
+	struct hantro_aux_buf priv;
+	struct v4l2_h264_dpb_entry dpb[HANTRO_H264_DPB_SIZE];
+	struct hantro_h264_dec_reflists reflists;
+	struct hantro_h264_dec_ctrls ctrls;
+};
+
 /**
  * struct hantro_mpeg2_dec_hw_ctx
  * @qtable:		Quantization table
@@ -105,6 +151,12 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
 int hantro_jpeg_enc_init(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 
+struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
+					   unsigned int dpb_idx);
+int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
+int hantro_h264_dec_init(struct hantro_ctx *ctx);
+void hantro_h264_dec_exit(struct hantro_ctx *ctx);
+
 void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
 void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 507369c1d17d..30cf35241bde 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -345,6 +345,7 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
 		break;
 	case V4L2_PIX_FMT_MPEG2_SLICE:
 	case V4L2_PIX_FMT_VP8_FRAME:
+	case V4L2_PIX_FMT_H264_SLICE_RAW:
 		ctx->fh.m2m_ctx->out_q_ctx.q.requires_requests = true;
 		break;
 	default:
@@ -519,6 +520,7 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
 {
 	struct hantro_ctx *ctx = vb2_get_drv_priv(vq);
 	struct v4l2_pix_format_mplane *pixfmt;
+	unsigned int extra_size0 = 0;
 	int i;
 
 	switch (vq->type) {
@@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
 		return -EINVAL;
 	}
 
+	/* The H264 decoder needs extra size on the output buffer. */
+	if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
+		extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
+			      DIV_ROUND_UP(pixfmt->height, 16);
+
 	if (*num_planes) {
 		if (*num_planes != pixfmt->num_planes)
 			return -EINVAL;
-		for (i = 0; i < pixfmt->num_planes; ++i)
+		/*
+		 * The application is not aware of the extra size needed
+		 * for some codecs, so amend it without failing.
+		 */
+		if (sizes[0] < (pixfmt->plane_fmt[0].sizeimage + extra_size0))
+			sizes[0] = pixfmt->plane_fmt[0].sizeimage + extra_size0;
+		for (i = 1; i < pixfmt->num_planes; ++i)
 			if (sizes[i] < pixfmt->plane_fmt[i].sizeimage)
 				return -EINVAL;
 		return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 8/9] media: hantro: Add support for H264 decoding on G1
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (6 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 7/9] media: hantro: Add core bits to support H264 decoding Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:15 ` [PATCH 9/9] media: hantro: Enable H264 decoding on rk3288 Boris Brezillon
  2019-06-19 12:23 ` [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Hertz Wong, Boris Brezillon

From: Hertz Wong <hertz.wong@rock-chips.com>

Add the G1 specific bits to support H264 decoding.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/Makefile         |   1 +
 .../staging/media/hantro/hantro_g1_h264_dec.c | 295 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |   1 +
 3 files changed, 297 insertions(+)
 create mode 100644 drivers/staging/media/hantro/hantro_g1_h264_dec.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index d63e25682287..9d66f956f2f3 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -4,6 +4,7 @@ hantro-vpu-y += \
 		hantro_drv.o \
 		hantro_v4l2.o \
 		hantro_h1_jpeg_enc.o \
+		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
new file mode 100644
index 000000000000..3a5be123202b
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip RK3288 VPU codec driver
+ *
+ * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
+ *	Hertz Wong <hertz.wong@rock-chips.com>
+ *	Herman Chen <herman.chen@rock-chips.com>
+ *
+ * Copyright (C) 2014 Google, Inc.
+ *	Tomasz Figa <tfiga@chromium.org>
+ */
+
+#include <linux/types.h>
+#include <linux/sort.h>
+
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro_g1_regs.h"
+#include "hantro_hw.h"
+#include "hantro_v4l2.h"
+
+static void set_params(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
+	const struct v4l2_ctrl_h264_slice_params *slices = ctrls->slices;
+	const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_h264_pps *pps = ctrls->pps;
+	struct vb2_v4l2_buffer *src_buf = hantro_get_src_buf(ctx);
+	struct hantro_dev *vpu = ctx->dev;
+	u32 reg;
+
+	/* Decoder control register 0. */
+	reg = G1_REG_DEC_CTRL0_DEC_AXI_WR_ID(0x0);
+	if (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)
+		reg |= G1_REG_DEC_CTRL0_SEQ_MBAFF_E;
+	reg |= G1_REG_DEC_CTRL0_PICORD_COUNT_E;
+	if (dec_param->nal_ref_idc)
+		reg |= G1_REG_DEC_CTRL0_WRITE_MVS_E;
+
+	if (!(sps->flags & V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY) &&
+	    (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD ||
+	     slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC))
+		reg |= G1_REG_DEC_CTRL0_PIC_INTERLACE_E;
+	if (slices[0].flags & V4L2_H264_SLICE_FLAG_FIELD_PIC)
+		reg |= G1_REG_DEC_CTRL0_PIC_FIELDMODE_E;
+	if (!(slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD))
+		reg |= G1_REG_DEC_CTRL0_PIC_TOPFIELD_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
+
+	/* Decoder control register 1. */
+	reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(sps->pic_width_in_mbs_minus1 + 1) |
+	      G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(sps->pic_height_in_map_units_minus1 + 1) |
+	      G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
+
+	/* Decoder control register 2. */
+	reg = G1_REG_DEC_CTRL2_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
+	      G1_REG_DEC_CTRL2_CH_QP_OFFSET2(pps->second_chroma_qp_index_offset);
+
+	/* always use the matrix sent from userspace */
+	reg |= G1_REG_DEC_CTRL2_TYPE1_QUANT_E;
+
+	if (slices[0].flags &  V4L2_H264_SLICE_FLAG_FIELD_PIC)
+		reg |= G1_REG_DEC_CTRL2_FIELDPIC_FLAG_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL2);
+
+	/* Decoder control register 3. */
+	reg = G1_REG_DEC_CTRL3_START_CODE_E |
+	      G1_REG_DEC_CTRL3_INIT_QP(pps->pic_init_qp_minus26 + 26) |
+	      G1_REG_DEC_CTRL3_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL3);
+
+	/* Decoder control register 4. */
+	reg = G1_REG_DEC_CTRL4_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
+	      G1_REG_DEC_CTRL4_FRAMENUM(slices[0].frame_num) |
+	      G1_REG_DEC_CTRL4_WEIGHT_BIPR_IDC(pps->weighted_bipred_idc);
+	if (pps->flags & V4L2_H264_PPS_FLAG_ENTROPY_CODING_MODE)
+		reg |= G1_REG_DEC_CTRL4_CABAC_E;
+	if (sps->flags & V4L2_H264_SPS_FLAG_DIRECT_8X8_INFERENCE)
+		reg |= G1_REG_DEC_CTRL4_DIR_8X8_INFER_E;
+	if (sps->profile_idc >= 0 && sps->chroma_format_idc == 0)
+		reg |= G1_REG_DEC_CTRL4_BLACKWHITE_E;
+	if (pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED)
+		reg |= G1_REG_DEC_CTRL4_WEIGHT_PRED_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL4);
+
+	/* Decoder control register 5. */
+	reg = G1_REG_DEC_CTRL5_REFPIC_MK_LEN(slices[0].dec_ref_pic_marking_bit_size) |
+	      G1_REG_DEC_CTRL5_IDR_PIC_ID(slices[0].idr_pic_id);
+	if (pps->flags & V4L2_H264_PPS_FLAG_CONSTRAINED_INTRA_PRED)
+		reg |= G1_REG_DEC_CTRL5_CONST_INTRA_E;
+	if (pps->flags & V4L2_H264_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT)
+		reg |= G1_REG_DEC_CTRL5_FILT_CTRL_PRES;
+	if (pps->flags & V4L2_H264_PPS_FLAG_REDUNDANT_PIC_CNT_PRESENT)
+		reg |= G1_REG_DEC_CTRL5_RDPIC_CNT_PRES;
+	if (pps->flags & V4L2_H264_PPS_FLAG_TRANSFORM_8X8_MODE)
+		reg |= G1_REG_DEC_CTRL5_8X8TRANS_FLAG_E;
+	if (dec_param->flags & V4L2_H264_DECODE_PARAM_FLAG_IDR_PIC)
+		reg |= G1_REG_DEC_CTRL5_IDR_PIC_E;
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL5);
+
+	/* Decoder control register 6. */
+	reg = G1_REG_DEC_CTRL6_PPS_ID(slices[0].pic_parameter_set_id) |
+	      G1_REG_DEC_CTRL6_REFIDX0_ACTIVE(pps->num_ref_idx_l0_default_active_minus1 + 1) |
+	      G1_REG_DEC_CTRL6_REFIDX1_ACTIVE(pps->num_ref_idx_l1_default_active_minus1 + 1) |
+	      G1_REG_DEC_CTRL6_POC_LENGTH(slices[0].pic_order_cnt_bit_size);
+	vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL6);
+
+	/* Error concealment register. */
+	vdpu_write_relaxed(vpu, 0, G1_REG_ERR_CONC);
+
+	/* Prediction filter tap register. */
+	vdpu_write_relaxed(vpu,
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_0(1) |
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_1(-5 & 0x3ff) |
+			   G1_REG_PRED_FLT_PRED_BC_TAP_0_2(20),
+			   G1_REG_PRED_FLT);
+
+	/* Reference picture buffer control register. */
+	vdpu_write_relaxed(vpu, 0, G1_REG_REF_BUF_CTRL);
+
+	/* Reference picture buffer control register 2. */
+	vdpu_write_relaxed(vpu, G1_REG_REF_BUF_CTRL2_APF_THRESHOLD(8),
+			   G1_REG_REF_BUF_CTRL2);
+}
+
+static void set_ref(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *dst_buf = hantro_get_dst_buf(ctx);
+	struct v4l2_h264_dpb_entry *dpb = ctx->h264_dec.dpb;
+	const u8 *b0_reflist, *b1_reflist, *p_reflist;
+	struct hantro_dev *vpu = ctx->dev;
+	u32 dpb_longterm = 0;
+	u32 dpb_valid = 0;
+	int reg_num;
+	u32 reg;
+	int i;
+
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/*
+	 * Set up bit maps of valid and long term DPBs.
+	 * NOTE: The bits are reversed, i.e. MSb is DPB 0.
+	 */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; ++i) {
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+			dpb_valid |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			dpb_longterm |= BIT(HANTRO_H264_DPB_SIZE - 1 - i);
+	}
+	vdpu_write_relaxed(vpu, dpb_valid << 16, G1_REG_VALID_REF);
+	vdpu_write_relaxed(vpu, dpb_longterm << 16, G1_REG_LT_REF);
+
+	/*
+	 * Set up reference frame picture numbers.
+	 *
+	 * Each G1_REG_REF_PIC(x) register contains numbers of two
+	 * subsequential reference pictures.
+	 */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; i += 2) {
+		reg = 0;
+		if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].pic_num);
+		else
+			reg |= G1_REG_REF_PIC_REFER0_NBR(dpb[i].frame_num);
+
+		if (dpb[i + 1].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].pic_num);
+		else
+			reg |= G1_REG_REF_PIC_REFER1_NBR(dpb[i + 1].frame_num);
+
+		vdpu_write_relaxed(vpu, reg, G1_REG_REF_PIC(i / 2));
+	}
+
+	b0_reflist = ctx->h264_dec.reflists.b0;
+	b1_reflist = ctx->h264_dec.reflists.b1;
+	p_reflist = ctx->h264_dec.reflists.p;
+
+	/*
+	 * Each G1_REG_BD_REF_PIC(x) register contains three entries
+	 * of each forward and backward picture list.
+	 */
+	reg_num = 0;
+	for (i = 0; i < 15; i += 3) {
+		reg = G1_REG_BD_REF_PIC_BINIT_RLIST_F0(b0_reflist[i]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_F1(b0_reflist[i + 1]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_F2(b0_reflist[i + 2]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B0(b1_reflist[i]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B1(b1_reflist[i + 1]) |
+		      G1_REG_BD_REF_PIC_BINIT_RLIST_B2(b1_reflist[i + 2]);
+		vdpu_write_relaxed(vpu, reg, G1_REG_BD_REF_PIC(reg_num++));
+	}
+
+	/*
+	 * G1_REG_BD_P_REF_PIC register contains last entries (index 15)
+	 * of forward and backward reference picture lists and first 4 entries
+	 * of P forward picture list.
+	 */
+	reg = G1_REG_BD_P_REF_PIC_BINIT_RLIST_F15(b0_reflist[15]) |
+	      G1_REG_BD_P_REF_PIC_BINIT_RLIST_B15(b1_reflist[15]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F0(p_reflist[0]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F1(p_reflist[1]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F2(p_reflist[2]) |
+	      G1_REG_BD_P_REF_PIC_PINIT_RLIST_F3(p_reflist[3]);
+	vdpu_write_relaxed(vpu, reg, G1_REG_BD_P_REF_PIC);
+
+	/*
+	 * Each G1_REG_FWD_PIC(x) register contains six consecutive
+	 * entries of P forward picture list, starting from index 4.
+	 */
+	reg_num = 0;
+	for (i = 4; i < HANTRO_H264_DPB_SIZE; i += 6) {
+		reg = G1_REG_FWD_PIC_PINIT_RLIST_F0(p_reflist[i]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F1(p_reflist[i + 1]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F2(p_reflist[i + 2]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F3(p_reflist[i + 3]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F4(p_reflist[i + 4]) |
+		      G1_REG_FWD_PIC_PINIT_RLIST_F5(p_reflist[i + 5]);
+		vdpu_write_relaxed(vpu, reg, G1_REG_FWD_PIC(reg_num++));
+	}
+
+	/* Set up addresses of DPB buffers. */
+	for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
+		struct vb2_buffer *buf =  hantro_h264_get_ref_buf(ctx, i);
+
+		vdpu_write_relaxed(vpu, vb2_dma_contig_plane_dma_addr(buf, 0),
+				   G1_REG_ADDR_REF(i));
+	}
+}
+
+static void set_buffers(struct hantro_ctx *ctx)
+{
+	const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct hantro_dev *vpu = ctx->dev;
+	dma_addr_t src_dma, dst_dma;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/* Source (stream) buffer. */
+	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, src_dma, G1_REG_ADDR_STR);
+
+	/* Destination (decoded frame) buffer. */
+	dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
+	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
+
+	/* Higher profiles require DMV buffer appended to reference frames. */
+	if (ctrls->sps->profile_idc > 66) {
+		size_t sizeimage = ctx->dst_fmt.plane_fmt[0].sizeimage;
+		size_t mv_offset = round_up(sizeimage, 8);
+
+		if (ctrls->slices[0].flags & V4L2_H264_SLICE_FLAG_BOTTOM_FIELD)
+			mv_offset += 32 * H264_MB_WIDTH(ctx->dst_fmt.width);
+
+		vdpu_write_relaxed(vpu, dst_dma + mv_offset,
+				   G1_REG_ADDR_DIR_MV);
+	}
+
+	/* Auxiliary buffer prepared in hantro_g1_h264_dec_prepare_table(). */
+	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
+}
+
+void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	/* Prepare the H264 decoder context. */
+	if (hantro_h264_dec_prepare_run(ctx))
+		return;
+
+	/* Configure hardware registers. */
+	set_params(ctx);
+	set_ref(ctx);
+	set_buffers(ctx);
+
+	hantro_finish_run(ctx);
+
+	/* Start decoding! */
+	vdpu_write_relaxed(vpu,
+			   G1_REG_CONFIG_DEC_AXI_RD_ID(0xffu) |
+			   G1_REG_CONFIG_DEC_TIMEOUT_E |
+			   G1_REG_CONFIG_DEC_OUT_ENDIAN |
+			   G1_REG_CONFIG_DEC_STRENDIAN_E |
+			   G1_REG_CONFIG_DEC_MAX_BURST(16) |
+			   G1_REG_CONFIG_DEC_OUTSWAP32_E |
+			   G1_REG_CONFIG_DEC_INSWAP32_E |
+			   G1_REG_CONFIG_DEC_STRSWAP32_E |
+			   G1_REG_CONFIG_DEC_CLK_GATE_E,
+			   G1_REG_CONFIG);
+	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index cc4fbeb9011f..7a9051faf99f 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -154,6 +154,7 @@ void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 struct vb2_buffer *hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 					   unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
+void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 9/9] media: hantro: Enable H264 decoding on rk3288
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (7 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 8/9] media: hantro: Add support for H264 decoding on G1 Boris Brezillon
@ 2019-06-19 12:15 ` Boris Brezillon
  2019-06-19 12:23 ` [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel, Hertz Wong, Boris Brezillon

From: Hertz Wong <hertz.wong@rock-chips.com>

Now that the generic bits have been added, we can activate H264 decoding
on rk3288.

Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/staging/media/hantro/rk3288_vpu_hw.c | 21 +++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
index 470e803e25a6..82a7b82f725a 100644
--- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
@@ -61,6 +61,19 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
 		.fourcc = V4L2_PIX_FMT_NV12,
 		.codec_mode = HANTRO_MODE_NONE,
 	},
+	{
+		.fourcc = V4L2_PIX_FMT_H264_SLICE_RAW,
+		.codec_mode = HANTRO_MODE_H264_DEC,
+		.max_depth = 2,
+		.frmsize = {
+			.min_width = 48,
+			.max_width = 3840,
+			.step_width = H264_MB_DIM,
+			.min_height = 48,
+			.max_height = 2160,
+			.step_height = H264_MB_DIM,
+		},
+	},
 	{
 		.fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
 		.codec_mode = HANTRO_MODE_MPEG2_DEC,
@@ -162,6 +175,12 @@ static const struct hantro_codec_ops rk3288_vpu_codec_ops[] = {
 		.init = hantro_jpeg_enc_init,
 		.exit = hantro_jpeg_enc_exit,
 	},
+	[HANTRO_MODE_H264_DEC] = {
+		.run = hantro_g1_h264_dec_run,
+		.reset = rk3288_vpu_dec_reset,
+		.init = hantro_h264_dec_init,
+		.exit = hantro_h264_dec_exit,
+	},
 	[HANTRO_MODE_MPEG2_DEC] = {
 		.run = hantro_g1_mpeg2_dec_run,
 		.reset = rk3288_vpu_dec_reset,
@@ -197,7 +216,7 @@ const struct hantro_variant rk3288_vpu_variant = {
 	.dec_fmts = rk3288_vpu_dec_fmts,
 	.num_dec_fmts = ARRAY_SIZE(rk3288_vpu_dec_fmts),
 	.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
-		 HANTRO_VP8_DECODER,
+		 HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
 	.codec_ops = rk3288_vpu_codec_ops,
 	.irqs = rk3288_irqs,
 	.num_irqs = ARRAY_SIZE(rk3288_irqs),
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/9] media: hantro: Add support for H264 decoding
  2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
                   ` (8 preceding siblings ...)
  2019-06-19 12:15 ` [PATCH 9/9] media: hantro: Enable H264 decoding on rk3288 Boris Brezillon
@ 2019-06-19 12:23 ` Boris Brezillon
  9 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-06-19 12:23 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel

On Wed, 19 Jun 2019 14:15:31 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> Hello,
> 
> This patch series adds support H264 decoding support to the hantro
> driver and doing some consilidation cleanup in the driver along the
> way.
> 
> Some details about the patches forming this patchset:
> 
> * The first patch is adding support for the sort_r() variant and has
>   been posted separately by Rasmus. I put it back there because Andrew
>   told me to repost it with the patch series using this new variant.
>   As mentioned in the patch itself, I'd like this patch to be merged
>   as soon as possible to avoid the synchronisation burden that might
>   appear if we decide to delay it.
> 
> * Patch 2 is needed to properly propagate the output buf timestamp to
>   the capture buf one, which is required for intra-frame references.
> 
> * Patches 3 to 6 are consolidating the code by providing helpers that
>   can be used by all hantro backend and simplifying the ctrl
>   initialization logic. We also constify the controls array.
> 
> * Patches 7 to 8 are adding common H264 decoding bits and patch 9 is
>   enabling H264 decoding on rk3288
> 
> Now, a few words about the dependencies. Unfortunately there are a lot,
> and that'd be great to have some of them merged.
> 
> * This series is based on top of Ezequiel's VP8 work [1].
> * It depends on [2] which defines/described the H264 decoding mode
>   control.
> * Depends on [3] since I'm using vb2_get_buffer() to retrieve a
>   reference buffer
> * The final dep is a fix I sent this morning allowing me to simplify the
>   ctrl initialization logic [4]

And now the links :-).

[1]https://patchwork.kernel.org/project/linux-media/list/?series=131677
[2]https://patchwork.kernel.org/project/linux-media/list/?series=129567
[3]https://patchwork.kernel.org/project/linux-media/list/?series=129895
[4]https://patchwork.kernel.org/patch/11003737/

> 
> Regards,
> 
> Boris
> 
> Boris Brezillon (5):
>   media: hantro: Move copy_metadata() before doing a decode operation
>   media: hantro: Constify the control array
>   media: hantro: Simplify the controls creation logic
>   media: hantro: Add hantro_get_{src,dst}_buf() helpers
>   media: hantro: Add helpers to prepare/finish a run
> 
> Hertz Wong (3):
>   media: hantro: Add core bits to support H264 decoding
>   media: hantro: Add support for H264 decoding on G1
>   media: hantro: Enable H264 decoding on rk3288
> 
> Rasmus Villemoes (1):
>   lib/sort.c: implement sort() variant taking context argument
> 
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  24 +-
>  drivers/staging/media/hantro/hantro_drv.c     |  95 ++-
>  .../staging/media/hantro/hantro_g1_h264_dec.c | 295 ++++++++
>  .../media/hantro/hantro_g1_mpeg2_dec.c        |  14 +-
>  .../staging/media/hantro/hantro_g1_vp8_dec.c  |  17 +-
>  .../staging/media/hantro/hantro_h1_jpeg_enc.c |  11 +-
>  drivers/staging/media/hantro/hantro_h264.c    | 638 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  55 ++
>  drivers/staging/media/hantro/hantro_v4l2.c    |  15 +-
>  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  21 +-
>  .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  12 +-
>  .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  14 +-
>  include/linux/sort.h                          |   5 +
>  lib/sort.c                                    |  34 +-
>  15 files changed, 1175 insertions(+), 77 deletions(-)
>  create mode 100644 drivers/staging/media/hantro/hantro_g1_h264_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_h264.c
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/9] media: hantro: Constify the control array
  2019-06-19 12:15 ` [PATCH 3/9] media: hantro: Constify the control array Boris Brezillon
@ 2019-07-05 16:05   ` Ezequiel Garcia
  2019-07-05 17:17     ` Boris Brezillon
  0 siblings, 1 reply; 23+ messages in thread
From: Ezequiel Garcia @ 2019-07-05 16:05 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Jonas Karlman, linux-rockchip, Heiko Stuebner,
	Andrew Morton, Rasmus Villemoes, Philipp Zabel

On Wed, 2019-06-19 at 14:15 +0200, Boris Brezillon wrote:
> controls[] is not supposed to be modified at runtime, let's make it
> explicit by adding a const specifier.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/staging/media/hantro/hantro_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index 28b0fed89dcb..db49d643ddb7 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -264,7 +264,7 @@ static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>  	.s_ctrl = hantro_s_ctrl,
>  };
>  
> -static struct hantro_ctrl controls[] = {
> +static const struct hantro_ctrl controls[] = {
>  	{
>  		.id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
>  		.codec = HANTRO_JPEG_ENCODER,

This patch here breaks the build:

  CC [M]  drivers/staging/media/hantro/hantro_drv.o
/home/zeta/repos/linux/media_tree/drivers/staging/media/hantro/hantro_drv.c: In function ‘hantro_ctrls_setup’:
/home/zeta/repos/linux/media_tree/drivers/staging/media/hantro/hantro_drv.c:319:23: error: assignment of member ‘id’ in read-only object
    controls[i].cfg.id = controls[i].id;
                       ^
You can fix it by simply moving it after:

[PATCH 4/9] media: hantro: Simplify the controls creation logic

Regards,
Ezequiel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/9] media: hantro: Constify the control array
  2019-07-05 16:05   ` Ezequiel Garcia
@ 2019-07-05 17:17     ` Boris Brezillon
  0 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-07-05 17:17 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, linux-kernel, Tomasz Figa,
	Nicolas Dufresne, kernel, Paul Kocialkowski, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Rasmus Villemoes,
	Philipp Zabel

On Fri, 05 Jul 2019 13:05:10 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:

> On Wed, 2019-06-19 at 14:15 +0200, Boris Brezillon wrote:
> > controls[] is not supposed to be modified at runtime, let's make it
> > explicit by adding a const specifier.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> >  drivers/staging/media/hantro/hantro_drv.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> > index 28b0fed89dcb..db49d643ddb7 100644
> > --- a/drivers/staging/media/hantro/hantro_drv.c
> > +++ b/drivers/staging/media/hantro/hantro_drv.c
> > @@ -264,7 +264,7 @@ static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> >  	.s_ctrl = hantro_s_ctrl,
> >  };
> >  
> > -static struct hantro_ctrl controls[] = {
> > +static const struct hantro_ctrl controls[] = {
> >  	{
> >  		.id = V4L2_CID_JPEG_COMPRESSION_QUALITY,
> >  		.codec = HANTRO_JPEG_ENCODER,  
> 
> This patch here breaks the build:
> 
>   CC [M]  drivers/staging/media/hantro/hantro_drv.o
> /home/zeta/repos/linux/media_tree/drivers/staging/media/hantro/hantro_drv.c: In function ‘hantro_ctrls_setup’:
> /home/zeta/repos/linux/media_tree/drivers/staging/media/hantro/hantro_drv.c:319:23: error: assignment of member ‘id’ in read-only object
>     controls[i].cfg.id = controls[i].id;
>                        ^

Oops, didn't check bisectability.

> You can fix it by simply moving it after:
> 
> [PATCH 4/9] media: hantro: Simplify the controls creation logic

Yep, I'll do that.

Thanks,

Boris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument
  2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
@ 2019-07-25 12:40   ` Boris Brezillon
  2019-07-26  0:05   ` Andrew Morton
  1 sibling, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-07-25 12:40 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Andrew Morton
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Rasmus Villemoes, Philipp Zabel

Hi Andrew,

On Wed, 19 Jun 2019 14:15:32 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> 
> Our list_sort() utility has always supported a context argument that
> is passed through to the comparison routine. Now there's a use case
> for the similar thing for sort().
> 
> This implements sort_r by simply extending the existing sort function
> in the obvious way. To avoid code duplication, we want to implement
> sort() in terms of sort_r(). The naive way to do that is
> 
> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
> {
>   int (*real_cmp)(const void*, const void*) = ctx;
>   return real_cmp(a, b);
> }
> 
> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
> 
> but this would do two indirect calls for each comparison. Instead, do
> as is done for the default swap functions - that only adds a cost of a
> single easily predicted branch to each comparison call.
> 
> Aside from introducing support for the context argument, this also
> serves as preparation for patches that will eliminate the indirect
> comparison calls in common cases.
> 
> Requested-by: Boris Brezillon <boris.brezillon@collabora.com>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
> Hi all,
> 
> Andrew, you acked the first version of this patch, but Rasmus proposed
> a better solution and posted a v2. Can you review/ack this version.

Hans is planning to take that patch soon. Would you mind adding your
Ack back (assuming you're okay with the new version of course)?

Thanks,

Boris

> 
> Hans, Mauro, Andrew suggested to have this patch applied along with
> its first user (the H264 backend of the hantro codec), so here it is.
> Note that, if possible, I'd like to have this patch queued for the next
> release even if the H264 bits don't get accepted as is. The rationale
> here being that Rasmus told me he was planning to further improve the
> sort() logic after the next -rc1 is out, and I fear his changes will
> conflict with this patch, which might involve some kind synchronisation
> (a topic branch) between the media maintainers and Andrew.
> 
> Let me know how you want to proceed with that.
> 
> Regards,
> 
> Boris
> ---
>  include/linux/sort.h |  5 +++++
>  lib/sort.c           | 34 ++++++++++++++++++++++++++++------
>  2 files changed, 33 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/sort.h b/include/linux/sort.h
> index 2b99a5dd073d..61b96d0ebc44 100644
> --- a/include/linux/sort.h
> +++ b/include/linux/sort.h
> @@ -4,6 +4,11 @@
>  
>  #include <linux/types.h>
>  
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp)(const void *, const void *, const void *),
> +	    void (*swap)(void *, void *, int),
> +	    const void *priv);
> +
>  void sort(void *base, size_t num, size_t size,
>  	  int (*cmp)(const void *, const void *),
>  	  void (*swap)(void *, void *, int));
> diff --git a/lib/sort.c b/lib/sort.c
> index cf408aec3733..d54cf97e9548 100644
> --- a/lib/sort.c
> +++ b/lib/sort.c
> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>  		swap_func(a, b, (int)size);
>  }
>  
> +typedef int (*cmp_func_t)(const void *, const void *);
> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)
> +
> +static int do_cmp(const void *a, const void *b,
> +		  cmp_r_func_t cmp, const void *priv)
> +{
> +	if (cmp == _CMP_WRAPPER)
> +		return ((cmp_func_t)(priv))(a, b);
> +	return cmp(a, b, priv);
> +}
> +
>  /**
>   * parent - given the offset of the child, find the offset of the parent.
>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>  }
>  
>  /**
> - * sort - sort an array of elements
> + * sort_r - sort an array of elements
>   * @base: pointer to data to sort
>   * @num: number of elements
>   * @size: size of each element
>   * @cmp_func: pointer to comparison function
>   * @swap_func: pointer to swap function or NULL
> + * @priv: third argument passed to comparison function
>   *
>   * This function does a heapsort on the given array.  You may provide
>   * a swap_func function if you need to do something more than a memory
> @@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>   * O(n*n) worst-case behavior and extra memory requirements that make
>   * it less suitable for kernel use.
>   */
> -void sort(void *base, size_t num, size_t size,
> -	  int (*cmp_func)(const void *, const void *),
> -	  void (*swap_func)(void *, void *, int size))
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp_func)(const void *, const void *, const void *),
> +	    void (*swap_func)(void *, void *, int size),
> +	    const void *priv)
>  {
>  	/* pre-scale counters for performance */
>  	size_t n = num * size, a = (num/2) * size;
> @@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
>  		 * average, 3/4 worst-case.)
>  		 */
>  		for (b = a; c = 2*b + size, (d = c + size) < n;)
> -			b = cmp_func(base + c, base + d) >= 0 ? c : d;
> +			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
>  		if (d == n)	/* Special case last leaf with no sibling */
>  			b = c;
>  
>  		/* Now backtrack from "b" to the correct location for "a" */
> -		while (b != a && cmp_func(base + a, base + b) >= 0)
> +		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
>  			b = parent(b, lsbit, size);
>  		c = b;			/* Where "a" belongs */
>  		while (b != a) {	/* Shift it into place */
> @@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
>  		}
>  	}
>  }
> +EXPORT_SYMBOL(sort_r);
> +
> +void sort(void *base, size_t num, size_t size,
> +	  int (*cmp_func)(const void *, const void *),
> +	  void (*swap_func)(void *, void *, int size))
> +{
> +	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
> +}
>  EXPORT_SYMBOL(sort);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-06-19 12:15 ` [PATCH 7/9] media: hantro: Add core bits to support H264 decoding Boris Brezillon
@ 2019-07-25 13:31   ` Rasmus Villemoes
  2019-07-26 10:06     ` Boris Brezillon
  2019-08-01  4:06   ` Tomasz Figa
  1 sibling, 1 reply; 23+ messages in thread
From: Rasmus Villemoes @ 2019-07-25 13:31 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, linux-media
  Cc: linux-kernel, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	linux-rockchip, Heiko Stuebner, Andrew Morton, Philipp Zabel,
	Hertz Wong

On 19/06/2019 14.15, Boris Brezillon wrote:
> From: Hertz Wong <hertz.wong@rock-chips.com>
> 
> Add helpers and patch hantro_{drv,v4l2}.c to prepare addition of H264
> decoding support.
> 
> Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> +
> +	/*
> +	 * Short term pics in descending pic num order, long term ones in
> +	 * ascending order.
> +	 */
> +	if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> +		return b->frame_num - a->frame_num;
> +
> +	return a->pic_num - b->pic_num;
> +}

Pet peeve: This works because ->frame_num and ->pic_num are u16, so
their difference is guaranteed to fit in an int.

> +static int b0_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
> +{
> +	const struct hantro_h264_reflist_builder *builder = data;
> +	const struct v4l2_h264_dpb_entry *a, *b;
> +	s32 poca, pocb;
> +	u8 idxa, idxb;
> +
> +	idxa = *((u8 *)ptra);
> +	idxb = *((u8 *)ptrb);
> +	a = &builder->dpb[idxa];
> +	b = &builder->dpb[idxb];
> +
> +	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
> +	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
> +		/* Short term pics firt. */
> +		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> +			return -1;
> +		else
> +			return 1;
> +	}
> +
> +	/* Long term pics in ascending pic num order. */
> +	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> +		return a->pic_num - b->pic_num;
> +
> +	poca = builder->pocs[idxa];
> +	pocb = builder->pocs[idxb];
> +
> +	/*
> +	 * Short term pics with POC < cur POC first in POC descending order
> +	 * followed by short term pics with POC > cur POC in POC ascending
> +	 * order.
> +	 */
> +	if ((poca < builder->curpoc) != (pocb < builder->curpoc))
> +		return poca - pocb;
> +	else if (poca < builder->curpoc)
> +		return pocb - poca;
> +
> +	return poca - pocb;
> +}

Here, however, poca and pocb are ints. What guarantees that their values
are not more than 2^31 apart? I know absolutely nothing about this code
or what these numbers represent, so it may be obvious that they are
smallish.

Rasmus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument
  2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
  2019-07-25 12:40   ` Boris Brezillon
@ 2019-07-26  0:05   ` Andrew Morton
  2019-07-26  6:59     ` Rasmus Villemoes
  1 sibling, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2019-07-26  0:05 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, linux-kernel, Tomasz Figa,
	Nicolas Dufresne, kernel, Paul Kocialkowski, Ezequiel Garcia,
	Jonas Karlman, linux-rockchip, Heiko Stuebner, Rasmus Villemoes,
	Philipp Zabel

On Wed, 19 Jun 2019 14:15:32 +0200 Boris Brezillon <boris.brezillon@collabora.com> wrote:

> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> 
> Our list_sort() utility has always supported a context argument that
> is passed through to the comparison routine. Now there's a use case
> for the similar thing for sort().
> 
> This implements sort_r by simply extending the existing sort function
> in the obvious way. To avoid code duplication, we want to implement
> sort() in terms of sort_r(). The naive way to do that is
> 
> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
> {
>   int (*real_cmp)(const void*, const void*) = ctx;
>   return real_cmp(a, b);
> }
> 
> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
> 
> but this would do two indirect calls for each comparison. Instead, do
> as is done for the default swap functions - that only adds a cost of a
> single easily predicted branch to each comparison call.
> 
> Aside from introducing support for the context argument, this also
> serves as preparation for patches that will eliminate the indirect
> comparison calls in common cases.

Acked-by: Andrew Morton <akpm@linux-foundation.org>

> --- a/lib/sort.c
> +++ b/lib/sort.c
> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>  		swap_func(a, b, (int)size);
>  }
>  
> +typedef int (*cmp_func_t)(const void *, const void *);
> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)

Although I can't say I'm a fan of _CMP_WRAPPER.  I don't understand
what the name means.  Why not simply open-code NULL in the two sites?

> +static int do_cmp(const void *a, const void *b,
> +		  cmp_r_func_t cmp, const void *priv)
> +{
> +	if (cmp == _CMP_WRAPPER)
> +		return ((cmp_func_t)(priv))(a, b);
> +	return cmp(a, b, priv);
> +}
> +
>  /**
>   * parent - given the offset of the child, find the offset of the parent.
>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>  }
>  
>  /**
> - * sort - sort an array of elements
> + * sort_r - sort an array of elements
>   * @base: pointer to data to sort
>   * @num: number of elements
>   * @size: size of each element
>   * @cmp_func: pointer to comparison function
>   * @swap_func: pointer to swap function or NULL
> + * @priv: third argument passed to comparison function

Passing priv==NULLis part of the interface and should be documented?

>   *
>   * This function does a heapsort on the given array.  You may provide
>   * a swap_func function if you need to do something more than a memory
> @@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>   * O(n*n) worst-case behavior and extra memory requirements that make
>   * it less suitable for kernel use.
>   */
> -void sort(void *base, size_t num, size_t size,
> -	  int (*cmp_func)(const void *, const void *),
> -	  void (*swap_func)(void *, void *, int size))
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp_func)(const void *, const void *, const void *),
> +	    void (*swap_func)(void *, void *, int size),
> +	    const void *priv)
>  {
>  	/* pre-scale counters for performance */
>  	size_t n = num * size, a = (num/2) * size;
> @@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
>  		 * average, 3/4 worst-case.)
>  		 */
>  		for (b = a; c = 2*b + size, (d = c + size) < n;)
> -			b = cmp_func(base + c, base + d) >= 0 ? c : d;
> +			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
>  		if (d == n)	/* Special case last leaf with no sibling */
>  			b = c;
>  
>  		/* Now backtrack from "b" to the correct location for "a" */
> -		while (b != a && cmp_func(base + a, base + b) >= 0)
> +		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
>  			b = parent(b, lsbit, size);
>  		c = b;			/* Where "a" belongs */
>  		while (b != a) {	/* Shift it into place */
> @@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
>  		}
>  	}
>  }
> +EXPORT_SYMBOL(sort_r);
> +
> +void sort(void *base, size_t num, size_t size,
> +	  int (*cmp_func)(const void *, const void *),
> +	  void (*swap_func)(void *, void *, int size))
> +{
> +	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
> +}
>  EXPORT_SYMBOL(sort);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument
  2019-07-26  0:05   ` Andrew Morton
@ 2019-07-26  6:59     ` Rasmus Villemoes
  0 siblings, 0 replies; 23+ messages in thread
From: Rasmus Villemoes @ 2019-07-26  6:59 UTC (permalink / raw)
  To: Andrew Morton, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, linux-kernel, Tomasz Figa,
	Nicolas Dufresne, kernel, Paul Kocialkowski, Ezequiel Garcia,
	Jonas Karlman, linux-rockchip, Heiko Stuebner, Philipp Zabel

On 26/07/2019 02.05, Andrew Morton wrote:
> On Wed, 19 Jun 2019 14:15:32 +0200 Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
>> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>>
>> Our list_sort() utility has always supported a context argument that
>> is passed through to the comparison routine. Now there's a use case
>> for the similar thing for sort().
>>
>> This implements sort_r by simply extending the existing sort function
>> in the obvious way. To avoid code duplication, we want to implement
>> sort() in terms of sort_r(). The naive way to do that is
>>
>> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
>> {
>>   int (*real_cmp)(const void*, const void*) = ctx;
>>   return real_cmp(a, b);
>> }
>>
>> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
>>
>> but this would do two indirect calls for each comparison. Instead, do
>> as is done for the default swap functions - that only adds a cost of a
>> single easily predicted branch to each comparison call.
>>
>> Aside from introducing support for the context argument, this also
>> serves as preparation for patches that will eliminate the indirect
>> comparison calls in common cases.
> 
> Acked-by: Andrew Morton <akpm@linux-foundation.org>
> 
>> --- a/lib/sort.c
>> +++ b/lib/sort.c
>> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>>  		swap_func(a, b, (int)size);
>>  }
>>  
>> +typedef int (*cmp_func_t)(const void *, const void *);
>> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
>> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)
> 
> Although I can't say I'm a fan of _CMP_WRAPPER.  I don't understand
> what the name means.  Why not simply open-code NULL in the two sites?

That's the preparation part. Once I find time to tie up the loose ends,
there'll be a

  sort_by_key(base, num, swap, key)

where base must be a pointer to (array of) struct foobar, and key is the
name of an u32 or u64 (more can be added) member. Internally, that will
work by calling sort_r with a sentinel _CMP_SORT_U32 (or _CMP_SORT_U64,
...) as cmp function and offsetof(typeof(*base), key) as the priv argument.

In do_cmp, we then check whether the cmp function is a small
non-negative integer and then do the appropriate comparison directly
instead of doing an indirect call.

And this infrastructure will be shared with list_sort which will grow a
similar list_sort_by_key(). I think the actual value of _CMP_WRAPPER
will change to simplify that part, so for that reason alone I won't
hard-code NULL.


>> +static int do_cmp(const void *a, const void *b,
>> +		  cmp_r_func_t cmp, const void *priv)
>> +{
>> +	if (cmp == _CMP_WRAPPER)
>> +		return ((cmp_func_t)(priv))(a, b);
>> +	return cmp(a, b, priv);
>> +}
>> +

i.e., this becomes something like

if ((unsigned long)cmp <= ...) {
  if (cmp == CMP_WRAPPER)
     // called from sort(), priv is the original cmp_func pointer
     return ((cmp_func_t)(priv))(a, b);
  // must be called from sort_by_key, priv is the offset in each struct
  long offset = (long)priv;
  a += offset;
  b += offset;
  if (cmp == CMP_U32)
    return *(u32*)a > *(u32*)b;
  if (cmp == CMP_u64)
    return *(u64*)a > *(u64*)b;
  WARN_ONCE() // should be eliminated by smart enough compiler
  return 0;
}
return cmp(a, b, priv);

>>  /**
>>   * parent - given the offset of the child, find the offset of the parent.
>>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
>> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>>  }
>>  
>>  /**
>> - * sort - sort an array of elements
>> + * sort_r - sort an array of elements
>>   * @base: pointer to data to sort
>>   * @num: number of elements
>>   * @size: size of each element
>>   * @cmp_func: pointer to comparison function
>>   * @swap_func: pointer to swap function or NULL
>> + * @priv: third argument passed to comparison function
> 
> Passing priv==NULLis part of the interface and should be documented?

No, to sort_r() as a public function @priv is completely opaque, and the
user can pass anything he likes. Only when sort_r() is called
"internally" with a sentinel value of cmp_func (e.g. from sort() or
sort_by_key()) does the priv argument have any meaning, but that's
implementation details that should absolutely not be documented.

Rasmus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-07-25 13:31   ` Rasmus Villemoes
@ 2019-07-26 10:06     ` Boris Brezillon
  0 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-07-26 10:06 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, linux-kernel, Tomasz Figa,
	Nicolas Dufresne, kernel, Paul Kocialkowski, Ezequiel Garcia,
	Jonas Karlman, linux-rockchip, Heiko Stuebner, Andrew Morton,
	Philipp Zabel, Hertz Wong

On Thu, 25 Jul 2019 15:31:41 +0200
Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:

> On 19/06/2019 14.15, Boris Brezillon wrote:
> > From: Hertz Wong <hertz.wong@rock-chips.com>
> > 
> > Add helpers and patch hantro_{drv,v4l2}.c to prepare addition of H264
> > decoding support.
> > 
> > Signed-off-by: Hertz Wong <hertz.wong@rock-chips.com>
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> > +
> > +	/*
> > +	 * Short term pics in descending pic num order, long term ones in
> > +	 * ascending order.
> > +	 */
> > +	if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > +		return b->frame_num - a->frame_num;
> > +
> > +	return a->pic_num - b->pic_num;
> > +}  
> 
> Pet peeve: This works because ->frame_num and ->pic_num are u16, so
> their difference is guaranteed to fit in an int.
> 
> > +static int b0_ref_list_cmp(const void *ptra, const void *ptrb, const void *data)
> > +{
> > +	const struct hantro_h264_reflist_builder *builder = data;
> > +	const struct v4l2_h264_dpb_entry *a, *b;
> > +	s32 poca, pocb;
> > +	u8 idxa, idxb;
> > +
> > +	idxa = *((u8 *)ptra);
> > +	idxb = *((u8 *)ptrb);
> > +	a = &builder->dpb[idxa];
> > +	b = &builder->dpb[idxb];
> > +
> > +	if ((a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) !=
> > +	    (b->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)) {
> > +		/* Short term pics firt. */
> > +		if (!(a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM))
> > +			return -1;
> > +		else
> > +			return 1;
> > +	}
> > +
> > +	/* Long term pics in ascending pic num order. */
> > +	if (a->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
> > +		return a->pic_num - b->pic_num;
> > +
> > +	poca = builder->pocs[idxa];
> > +	pocb = builder->pocs[idxb];
> > +
> > +	/*
> > +	 * Short term pics with POC < cur POC first in POC descending order
> > +	 * followed by short term pics with POC > cur POC in POC ascending
> > +	 * order.
> > +	 */
> > +	if ((poca < builder->curpoc) != (pocb < builder->curpoc))
> > +		return poca - pocb;
> > +	else if (poca < builder->curpoc)
> > +		return pocb - poca;
> > +
> > +	return poca - pocb;
> > +}  
> 
> Here, however, poca and pocb are ints. What guarantees that their values
> are not more than 2^31 apart?

Good question. Both should normally be >= 0, which I guess prevents the
s32 overflow. This being said, it's something passed by userspace, and
I don't think we check the value (yet).

> I know absolutely nothing about this code
> or what these numbers represent, so it may be obvious that they are
> smallish.

Well, a safe approach would be to replace those subtraction by a
ternary operator.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-06-19 12:15 ` [PATCH 7/9] media: hantro: Add core bits to support H264 decoding Boris Brezillon
  2019-07-25 13:31   ` Rasmus Villemoes
@ 2019-08-01  4:06   ` Tomasz Figa
  2019-08-01  5:42     ` Boris Brezillon
                       ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Tomasz Figa @ 2019-08-01  4:06 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, Linux Media Mailing List,
	Linux Kernel Mailing List, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Andrew Morton, Rasmus Villemoes, Philipp Zabel,
	Hertz Wong

Hi Boris,

On Wed, Jun 19, 2019 at 9:15 PM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
[snip]
> @@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
>                 return -EINVAL;
>         }
>
> +       /* The H264 decoder needs extra size on the output buffer. */
> +       if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
> +               extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
> +                             DIV_ROUND_UP(pixfmt->height, 16);
> +

I wonder if this shouldn't be accounted for already in the sizeimage
returned by TRY_/S_FMT, so that the application can know the required
buffer size if it uses some external allocator and DMABUF memory type.
I know we had it like this in our downstream code, but it wasn't the
problem because we use minigbm, where we explicitly add the same
padding in the rockchip backend. Any thoughts?

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-08-01  4:06   ` Tomasz Figa
@ 2019-08-01  5:42     ` Boris Brezillon
  2019-08-01  7:04     ` Paul Kocialkowski
  2019-08-05 18:38     ` Ezequiel Garcia
  2 siblings, 0 replies; 23+ messages in thread
From: Boris Brezillon @ 2019-08-01  5:42 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, Linux Media Mailing List,
	Linux Kernel Mailing List, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Ezequiel Garcia, Jonas Karlman,
	open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Andrew Morton, Rasmus Villemoes, Philipp Zabel,
	Hertz Wong

On Thu, 1 Aug 2019 13:06:10 +0900
Tomasz Figa <tfiga@chromium.org> wrote:

> Hi Boris,
> 
> On Wed, Jun 19, 2019 at 9:15 PM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> [snip]
> > @@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
> >                 return -EINVAL;
> >         }
> >
> > +       /* The H264 decoder needs extra size on the output buffer. */
> > +       if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
> > +               extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
> > +                             DIV_ROUND_UP(pixfmt->height, 16);
> > +  
> 
> I wonder if this shouldn't be accounted for already in the sizeimage
> returned by TRY_/S_FMT, so that the application can know the required
> buffer size if it uses some external allocator and DMABUF memory type.
> I know we had it like this in our downstream code, but it wasn't the
> problem because we use minigbm, where we explicitly add the same
> padding in the rockchip backend. Any thoughts?

Actually, I was wondering why it was not counted in ->sizeimage. I
thought you had a good reason to not expose the extra size to userspace
so I kept it like that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-08-01  4:06   ` Tomasz Figa
  2019-08-01  5:42     ` Boris Brezillon
@ 2019-08-01  7:04     ` Paul Kocialkowski
  2019-08-01  7:12       ` Tomasz Figa
  2019-08-05 18:38     ` Ezequiel Garcia
  2 siblings, 1 reply; 23+ messages in thread
From: Paul Kocialkowski @ 2019-08-01  7:04 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, Linux Media Mailing List,
	Linux Kernel Mailing List, Nicolas Dufresne, kernel,
	Ezequiel Garcia, Jonas Karlman, open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Andrew Morton, Rasmus Villemoes, Philipp Zabel,
	Hertz Wong

[-- Attachment #1: Type: text/plain, Size: 1359 bytes --]

Hi,

On Thu 01 Aug 19, 13:06, Tomasz Figa wrote:
> Hi Boris,
> 
> On Wed, Jun 19, 2019 at 9:15 PM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> [snip]
> > @@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
> >                 return -EINVAL;
> >         }
> >
> > +       /* The H264 decoder needs extra size on the output buffer. */
> > +       if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
> > +               extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
> > +                             DIV_ROUND_UP(pixfmt->height, 16);
> > +
> 
> I wonder if this shouldn't be accounted for already in the sizeimage
> returned by TRY_/S_FMT, so that the application can know the required
> buffer size if it uses some external allocator and DMABUF memory type.
> I know we had it like this in our downstream code, but it wasn't the
> problem because we use minigbm, where we explicitly add the same
> padding in the rockchip backend. Any thoughts?

Does the extra size have to be allocated along with the buffer?

On cedrus, we have a need for a similar side-buffer but give it a dedicated CMA
allocation, which should allow dma-buf-imported buffers.

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-08-01  7:04     ` Paul Kocialkowski
@ 2019-08-01  7:12       ` Tomasz Figa
  0 siblings, 0 replies; 23+ messages in thread
From: Tomasz Figa @ 2019-08-01  7:12 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, Linux Media Mailing List,
	Linux Kernel Mailing List, Nicolas Dufresne, kernel,
	Ezequiel Garcia, Jonas Karlman, open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Andrew Morton, Rasmus Villemoes, Philipp Zabel,
	Hertz Wong

On Thu, Aug 1, 2019 at 4:04 PM Paul Kocialkowski
<paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
> On Thu 01 Aug 19, 13:06, Tomasz Figa wrote:
> > Hi Boris,
> >
> > On Wed, Jun 19, 2019 at 9:15 PM Boris Brezillon
> > <boris.brezillon@collabora.com> wrote:
> > [snip]
> > > @@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
> > >                 return -EINVAL;
> > >         }
> > >
> > > +       /* The H264 decoder needs extra size on the output buffer. */
> > > +       if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
> > > +               extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
> > > +                             DIV_ROUND_UP(pixfmt->height, 16);
> > > +
> >
> > I wonder if this shouldn't be accounted for already in the sizeimage
> > returned by TRY_/S_FMT, so that the application can know the required
> > buffer size if it uses some external allocator and DMABUF memory type.
> > I know we had it like this in our downstream code, but it wasn't the
> > problem because we use minigbm, where we explicitly add the same
> > padding in the rockchip backend. Any thoughts?
>
> Does the extra size have to be allocated along with the buffer?
>
> On cedrus, we have a need for a similar side-buffer but give it a dedicated CMA
> allocation, which should allow dma-buf-imported buffers.

Yes, the decoder stores motion vectors (IIRC) after the image data.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 7/9] media: hantro: Add core bits to support H264 decoding
  2019-08-01  4:06   ` Tomasz Figa
  2019-08-01  5:42     ` Boris Brezillon
  2019-08-01  7:04     ` Paul Kocialkowski
@ 2019-08-05 18:38     ` Ezequiel Garcia
  2 siblings, 0 replies; 23+ messages in thread
From: Ezequiel Garcia @ 2019-08-05 18:38 UTC (permalink / raw)
  To: Tomasz Figa, Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, Linux Media Mailing List,
	Linux Kernel Mailing List, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Jonas Karlman, open list:ARM/Rockchip SoC...,
	Heiko Stuebner, Andrew Morton, Rasmus Villemoes, Philipp Zabel,
	Hertz Wong

Hey Tomasz,

On Thu, 2019-08-01 at 13:06 +0900, Tomasz Figa wrote:
> Hi Boris,
> 
> On Wed, Jun 19, 2019 at 9:15 PM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> [snip]
> > @@ -533,10 +535,21 @@ hantro_queue_setup(struct vb2_queue *vq, unsigned int *num_buffers,
> >                 return -EINVAL;
> >         }
> > 
> > +       /* The H264 decoder needs extra size on the output buffer. */
> > +       if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE_RAW)
> > +               extra_size0 = 128 * DIV_ROUND_UP(pixfmt->width, 16) *
> > +                             DIV_ROUND_UP(pixfmt->height, 16);
> > +
> 
> I wonder if this shouldn't be accounted for already in the sizeimage
> returned by TRY_/S_FMT, so that the application can know the required
> buffer size if it uses some external allocator and DMABUF memory type.
> I know we had it like this in our downstream code, but it wasn't the
> problem because we use minigbm, where we explicitly add the same
> padding in the rockchip backend. Any thoughts?
> 

Nice catch. This should be fixed and accounted in TRY_FMT as you suggest.

Thanks,
Eze 


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-08-05 18:39 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-19 12:15 [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon
2019-06-19 12:15 ` [PATCH 1/9] lib/sort.c: implement sort() variant taking context argument Boris Brezillon
2019-07-25 12:40   ` Boris Brezillon
2019-07-26  0:05   ` Andrew Morton
2019-07-26  6:59     ` Rasmus Villemoes
2019-06-19 12:15 ` [PATCH 2/9] media: hantro: Move copy_metadata() before doing a decode operation Boris Brezillon
2019-06-19 12:15 ` [PATCH 3/9] media: hantro: Constify the control array Boris Brezillon
2019-07-05 16:05   ` Ezequiel Garcia
2019-07-05 17:17     ` Boris Brezillon
2019-06-19 12:15 ` [PATCH 4/9] media: hantro: Simplify the controls creation logic Boris Brezillon
2019-06-19 12:15 ` [PATCH 5/9] media: hantro: Add hantro_get_{src,dst}_buf() helpers Boris Brezillon
2019-06-19 12:15 ` [PATCH 6/9] media: hantro: Add helpers to prepare/finish a run Boris Brezillon
2019-06-19 12:15 ` [PATCH 7/9] media: hantro: Add core bits to support H264 decoding Boris Brezillon
2019-07-25 13:31   ` Rasmus Villemoes
2019-07-26 10:06     ` Boris Brezillon
2019-08-01  4:06   ` Tomasz Figa
2019-08-01  5:42     ` Boris Brezillon
2019-08-01  7:04     ` Paul Kocialkowski
2019-08-01  7:12       ` Tomasz Figa
2019-08-05 18:38     ` Ezequiel Garcia
2019-06-19 12:15 ` [PATCH 8/9] media: hantro: Add support for H264 decoding on G1 Boris Brezillon
2019-06-19 12:15 ` [PATCH 9/9] media: hantro: Enable H264 decoding on rk3288 Boris Brezillon
2019-06-19 12:23 ` [PATCH 0/9] media: hantro: Add support for H264 decoding Boris Brezillon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).