* [PATCH v4 00/11]  Add HANTRO G2/HEVC decoder support for IMX8MQ
@ 2021-03-03 11:39 ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

The IMX8MQ has two VPUs, but until now only G1 has been enabled.
This series aims to add the second VPU (aka G2) and provide basic
HEVC decoding support.

Decoding HEVC requires adding or updating some of the structures in
the uapi. In addition to them, one HANTRO-dedicated control is
required to inform the driver of the number of bits to skip at the
beginning of the slice header.
The hardware requires a few auxiliary buffers to be allocated to store
the reference frame or tile size data.

The driver has been tested with streams from the fluster test suite.
For example, with this command: ./fluster.py run -ts JCT-VC-HEVC_V1 -d GStreamer-H.265-V4L2SL-Gst1.0

This series depends on the reset rework posted here: https://www.spinics.net/lists/arm-kernel/msg878440.html

Finally, both VPUs will have a node in the device tree and be
independent from the v4l2 point of view.

A branch with all the development work is available here:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/upstream_g2_v4

version 4:
- Split the changes in hevc controls in 2 commits to make them easier to
  review.
- Change hantro_codec_ops run() prototype to return errors   
- Hantro v4l2 dedicated control is now only an integer
- Rebase on top of the VPU reset changes posted here:
  https://www.spinics.net/lists/arm-kernel/msg878440.html
- Various fixes from previous remarks
- Limit the modifications in API to what the driver needs
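
As an illustration of the run() prototype change listed above, here is a
minimal sketch of a codec operation that returns an error code its caller
can act on. All names (demo_ctx, demo_codec_ops, demo_hevc_run,
demo_device_run) are hypothetical and simplified; they are not the
driver's real types or functions.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a decoding context. */
struct demo_ctx {
	int have_ctrls;	/* non-zero when the required controls were found */
};

/* Hypothetical ops table: run() returns int instead of void. */
struct demo_codec_ops {
	int (*run)(struct demo_ctx *ctx);
};

/* A codec hook that reports a failure instead of silently returning. */
static int demo_hevc_run(struct demo_ctx *ctx)
{
	if (!ctx->have_ctrls)
		return -EINVAL;	/* nothing was programmed into the hardware */
	return 0;
}

/* The caller sees the error and can end the job immediately. */
static int demo_device_run(const struct demo_codec_ops *ops,
			   struct demo_ctx *ctx)
{
	int ret = ops->run(ctx);

	if (ret)
		fprintf(stderr, "run failed: %d\n", ret);
	return ret;
}
```

With a void prototype the caller has no way to tell that nothing was
started; with an int return it can fail the job right away.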

version 3:
- Fix typo in Hantro v4l2 dedicated control
- Add documentation for the new structures and fields
- Rebased on top of media_tree for-linus-5.12-rc1 tag

version 2:
- remove all changes related to scaling
- squash commits into a coherent split
- be more verbose about the added fields
- fix Ezequiel's comments about dma_alloc_coherent usage
- fix Dan's comments about control copy, reverse the test logic
  in tile_buffer_reallocate, rework some goto and return cases.
- be more verbose about why I change the bindings
- remove all sign-offs except mine since it is confusing
- remove useless clocks in VPU nodes

Benjamin

Benjamin Gaignard (11):
  media: hevc: Add fields and flags for hevc PPS
  media: hevc: Add decode params control
  media: hantro: change hantro_codec_ops run prototype to return errors
  media: hantro: Define HEVC codec profiles and supported features
  media: hantro: Add a field to distinguish the hardware versions
  media: uapi: Add a control for HANTRO driver
  media: hantro: Introduce G2/HEVC decoder
  media: hantro: handle V4L2_PIX_FMT_HEVC_SLICE control
  media: hantro: IMX8M: add variant for G2/HEVC codec
  dt-bindings: media: nxp,imx8mq-vpu: Update bindings
  arm64: dts: imx8mq: Add node to G2 hardware

 .../bindings/media/nxp,imx8mq-vpu.yaml        |  46 +-
 .../userspace-api/media/drivers/hantro.rst    |  10 +
 .../userspace-api/media/drivers/index.rst     |   1 +
 .../media/v4l/ext-ctrls-codec.rst             | 108 +++-
 .../media/v4l/vidioc-queryctrl.rst            |   6 +
 arch/arm64/boot/dts/freescale/imx8mq.dtsi     |  41 +-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +-
 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  34 +-
 drivers/staging/media/hantro/hantro_drv.c     | 118 +++-
 .../staging/media/hantro/hantro_g1_h264_dec.c |   6 +-
 .../media/hantro/hantro_g1_mpeg2_dec.c        |   4 +-
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |   6 +-
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
 .../staging/media/hantro/hantro_h1_jpeg_enc.c |   4 +-
 drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  69 +-
 .../staging/media/hantro/hantro_postproc.c    |  17 +
 drivers/staging/media/hantro/hantro_v4l2.c    |   1 +
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |  95 ++-
 .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |   4 +-
 .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |   4 +-
 .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |   6 +-
 drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
 drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
 .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
 .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
 include/media/hevc-ctrls.h                    |  33 +-
 include/uapi/linux/v4l2-controls.h            |   5 +
 30 files changed, 1675 insertions(+), 92 deletions(-)
 create mode 100644 Documentation/userspace-api/media/drivers/hantro.rst
 create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
 create mode 100644 drivers/staging/media/hantro/hantro_hevc.c

-- 
2.25.1


* [PATCH v4 01/11] media: hevc: Add fields and flags for hevc PPS
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Add fields and flags as they are defined in section 7.4.3.3.1
"General picture parameter set RBSP semantics" of the
H.265 ITU specification.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
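For illustration only, a minimal sketch of how a consumer of the PPS
control might read the new fields and flags. The two flag values are
copied from this patch; struct demo_hevc_pps and the demo_* helpers are
hypothetical and mirror only the members needed here, not the full
struct v4l2_ctrl_hevc_pps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from this patch. */
#define V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT	(1ULL << 19)
#define V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING			(1ULL << 20)

/* Hypothetical cut-down mirror of the PPS control. */
struct demo_hevc_pps {
	uint8_t num_ref_idx_l0_default_active_minus1;
	uint8_t num_ref_idx_l1_default_active_minus1;
	uint64_t flags;
};

/* Active L0 references when the slice header does not override the count. */
static unsigned int demo_default_l0_refs(const struct demo_hevc_pps *pps)
{
	return pps->num_ref_idx_l0_default_active_minus1 + 1u;
}

/* True when tile boundaries are spread uniformly across the picture. */
static bool demo_tiles_uniform(const struct demo_hevc_pps *pps)
{
	return !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
}

/* True when the PPS carries deblocking filter control syntax elements. */
static bool demo_deblocking_ctrl_present(const struct demo_hevc_pps *pps)
{
	return !!(pps->flags &
		  V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT);
}
```

The "+ 1" follows the usual "minus1" naming of the specification: the
field stores the count minus one.
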
 .../userspace-api/media/v4l/ext-ctrls-codec.rst    | 14 ++++++++++++++
 include/media/hevc-ctrls.h                         |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
index 00944e97d638..d62e8e423f3b 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
@@ -3234,6 +3234,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     * - __u8
       - ``num_extra_slice_header_bits``
       -
+    * - __u8
+      - ``num_ref_idx_l0_default_active_minus1``
+      - Specifies the inferred value of num_ref_idx_l0_active_minus1
+    * - __u8
+      - ``num_ref_idx_l1_default_active_minus1``
+      - Specifies the inferred value of num_ref_idx_l1_active_minus1
     * - __s8
       - ``init_qp_minus26``
       -
@@ -3342,6 +3348,14 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
       - 0x00040000
       -
+    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
+      - 0x00080000
+      - Specifies the presence of deblocking filter control syntax elements in
+        the PPS
+    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
+      - 0x00100000
+      - Specifies that tile column boundaries and likewise tile row boundaries
+        are distributed uniformly across the picture
 
 ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
     Specifies various slice-specific parameters, especially from the NAL unit
diff --git a/include/media/hevc-ctrls.h b/include/media/hevc-ctrls.h
index b4cb2ef02f17..003f819ecb26 100644
--- a/include/media/hevc-ctrls.h
+++ b/include/media/hevc-ctrls.h
@@ -100,10 +100,14 @@ struct v4l2_ctrl_hevc_sps {
 #define V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER	(1ULL << 16)
 #define V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT		(1ULL << 17)
 #define V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT (1ULL << 18)
+#define V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT	(1ULL << 19)
+#define V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING			(1ULL << 20)
 
 struct v4l2_ctrl_hevc_pps {
 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture parameter set */
 	__u8	num_extra_slice_header_bits;
+	__u8	num_ref_idx_l0_default_active_minus1;
+	__u8	num_ref_idx_l1_default_active_minus1;
 	__s8	init_qp_minus26;
 	__u8	diff_cu_qp_delta_depth;
 	__s8	pps_cb_qp_offset;
-- 
2.25.1


* [PATCH v4 02/11] media: hevc: Add decode params control
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Add the decode params control and its associated structure to group
all the information that is needed to decode a reference frame, as
described in ITU-T Rec. H.265 section 8.3.2 "Decoding process for
reference picture set".

Adapt the Cedrus driver to these changes.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
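For illustration only, a minimal sketch of a sanity check that a driver
or application might run on the new decode params. It rests on two
assumptions that this patch does not state: that
V4L2_HEVC_DPB_ENTRIES_NUM_MAX is 16, and that the __u8 entries of the
poc_* arrays are indices into dpb[]. struct demo_hevc_decode_params is a
hypothetical mirror of the documented members (dpb[] itself is omitted),
and the demo_* helpers are not part of the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of V4L2_HEVC_DPB_ENTRIES_NUM_MAX. */
#define DEMO_DPB_ENTRIES_NUM_MAX 16

/* Hypothetical mirror of the documented members; dpb[] is left out. */
struct demo_hevc_decode_params {
	int32_t pic_order_cnt_val;
	uint8_t num_active_dpb_entries;
	uint8_t num_poc_st_curr_before;
	uint8_t num_poc_st_curr_after;
	uint8_t num_poc_lt_curr;
	uint8_t poc_st_curr_before[DEMO_DPB_ENTRIES_NUM_MAX];
	uint8_t poc_st_curr_after[DEMO_DPB_ENTRIES_NUM_MAX];
	uint8_t poc_lt_curr[DEMO_DPB_ENTRIES_NUM_MAX];
	uint64_t flags;
};

/* Assumption: each list entry is an index into the active part of dpb[]. */
static bool demo_list_ok(const uint8_t *list, uint8_t count, uint8_t num_dpb)
{
	size_t i;

	if (count > DEMO_DPB_ENTRIES_NUM_MAX)
		return false;
	for (i = 0; i < count; i++)
		if (list[i] >= num_dpb)
			return false;
	return true;
}

/* Reject counts or indices that would read past the arrays. */
static bool demo_decode_params_ok(const struct demo_hevc_decode_params *p)
{
	uint8_t n = p->num_active_dpb_entries;

	if (n > DEMO_DPB_ENTRIES_NUM_MAX)
		return false;
	return demo_list_ok(p->poc_st_curr_before,
			    p->num_poc_st_curr_before, n) &&
	       demo_list_ok(p->poc_st_curr_after,
			    p->num_poc_st_curr_after, n) &&
	       demo_list_ok(p->poc_lt_curr, p->num_poc_lt_curr, n);
}
```

Checking the counts before walking the arrays keeps a malformed control
from causing out-of-bounds reads.
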
 .../media/v4l/ext-ctrls-codec.rst             | 94 +++++++++++++++----
 .../media/v4l/vidioc-queryctrl.rst            |  6 ++
 drivers/media/v4l2-core/v4l2-ctrls.c          | 26 +++--
 drivers/staging/media/sunxi/cedrus/cedrus.c   |  6 ++
 drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
 .../staging/media/sunxi/cedrus/cedrus_dec.c   |  2 +
 .../staging/media/sunxi/cedrus/cedrus_h265.c  |  6 +-
 include/media/hevc-ctrls.h                    | 29 ++++--
 8 files changed, 134 insertions(+), 36 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
index d62e8e423f3b..8a6d45cb437e 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
@@ -3436,9 +3436,6 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     * - __u8
       - ``pic_struct``
       -
-    * - __u8
-      - ``num_active_dpb_entries``
-      - The number of entries in ``dpb``.
     * - __u8
       - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
       - The list of L0 reference elements as indices in the DPB.
@@ -3446,22 +3443,8 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
       - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
       - The list of L1 reference elements as indices in the DPB.
     * - __u8
-      - ``num_rps_poc_st_curr_before``
-      - The number of reference pictures in the short-term set that come before
-        the current frame.
-    * - __u8
-      - ``num_rps_poc_st_curr_after``
-      - The number of reference pictures in the short-term set that come after
-        the current frame.
-    * - __u8
-      - ``num_rps_poc_lt_curr``
-      - The number of reference pictures in the long-term set.
-    * - __u8
-      - ``padding[7]``
+      - ``padding``
       - Applications and drivers must set this to zero.
-    * - struct :c:type:`v4l2_hevc_dpb_entry`
-      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
-      - The decoded picture buffer, for meta-data about reference frames.
     * - struct :c:type:`v4l2_hevc_pred_weight_table`
       - ``pred_weight_table``
       - The prediction weight coefficients for inter-picture prediction.
@@ -3660,3 +3643,78 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     so this has to come from client.
     This is applicable to H264 and valid Range is from 0 to 63.
     Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
+
+``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
+    Specifies various decode parameters, especially the references picture order
+    count (POC) for all the lists (short, long, before, current, after) and the
+    number of entries for each of them.
+    These parameters are defined according to :ref:`hevc`.
+    They are described in section 8.3 "Slice decoding process" of the
+    specification.
+
+.. c:type:: v4l2_ctrl_hevc_decode_params
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_ctrl_hevc_decode_params
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __s32
+      - ``pic_order_cnt_val``
+      - PicOrderCntVal as described in section 8.3.1 "Decoding process
+        for picture order count" of the specification.
+    * - __u8
+      - ``num_active_dpb_entries``
+      - The number of entries in ``dpb``.
+    * - struct :c:type:`v4l2_hevc_dpb_entry`
+      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - The decoded picture buffer, for meta-data about reference frames.
+    * - __u8
+      - ``num_poc_st_curr_before``
+      - The number of reference pictures in the short-term set that come before
+        the current frame.
+    * - __u8
+      - ``num_poc_st_curr_after``
+      - The number of reference pictures in the short-term set that come after
+        the current frame.
+    * - __u8
+      - ``num_poc_lt_curr``
+      - The number of reference pictures in the long-term set.
+    * - __u8
+      - ``poc_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocStCurrBefore as described in section 8.3.2 "Decoding process for reference
+        picture set.
+    * - __u8
+      - ``poc_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocStCurrAfter as described in section 8.3.2 "Decoding process for reference
+        picture set.
+    * - __u8
+      - ``poc_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocLtCurr as described in section 8.3.2 "Decoding process for reference
+        picture set.
+    * - __u64
+      - ``flags``
+      - See :ref:`Decode Parameters Flags <hevc_decode_params_flags>`
+
+.. _hevc_decode_params_flags:
+
+``Decode Parameters Flags``
+
+.. cssclass:: longtable
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC``
+      - 0x00000001
+      -
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC``
+      - 0x00000002
+      -
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_NO_OUTPUT_OF_PRIOR``
+      - 0x00000004
+      -
diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
index 82f61f1e2fb8..d84ae255bc79 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
@@ -486,6 +486,12 @@ See also the examples in :ref:`control`.
       - n/a
       - A struct :c:type:`v4l2_ctrl_hevc_slice_params`, containing HEVC
 	slice parameters for stateless video decoders.
+    * - ``V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS``
+      - n/a
+      - n/a
+      - n/a
+      - A struct :c:type:`v4l2_ctrl_hevc_decode_params`, containing HEVC
+	decoding parameters for stateless video decoders.
 
 .. tabularcolumns:: |p{6.6cm}|p{2.2cm}|p{8.7cm}|
 
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index 016cf6204cbb..4060b5bcc3c0 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -1028,6 +1028,7 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_MPEG_VIDEO_HEVC_SPS:			return "HEVC Sequence Parameter Set";
 	case V4L2_CID_MPEG_VIDEO_HEVC_PPS:			return "HEVC Picture Parameter Set";
 	case V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS:		return "HEVC Slice Parameters";
+	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS:		return "HEVC Decode Parameters";
 	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE:		return "HEVC Decode Mode";
 	case V4L2_CID_MPEG_VIDEO_HEVC_START_CODE:		return "HEVC Start Code";
 
@@ -1482,6 +1483,9 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS:
 		*type = V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS;
 		break;
+	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS:
+		*type = V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS;
+		break;
 	case V4L2_CID_UNIT_CELL_SIZE:
 		*type = V4L2_CTRL_TYPE_AREA;
 		*flags |= V4L2_CTRL_FLAG_READ_ONLY;
@@ -1833,6 +1837,7 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
 	struct v4l2_ctrl_hevc_sps *p_hevc_sps;
 	struct v4l2_ctrl_hevc_pps *p_hevc_pps;
 	struct v4l2_ctrl_hevc_slice_params *p_hevc_slice_params;
+	struct v4l2_ctrl_hevc_decode_params *p_hevc_decode_params;
 	struct v4l2_area *area;
 	void *p = ptr.p + idx * ctrl->elem_size;
 	unsigned int i;
@@ -2108,23 +2113,27 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
 		zero_padding(*p_hevc_pps);
 		break;
 
-	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
-		p_hevc_slice_params = p;
+	case V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS:
+		p_hevc_decode_params = p;
 
-		if (p_hevc_slice_params->num_active_dpb_entries >
+		if (p_hevc_decode_params->num_active_dpb_entries >
 		    V4L2_HEVC_DPB_ENTRIES_NUM_MAX)
 			return -EINVAL;
 
-		zero_padding(p_hevc_slice_params->pred_weight_table);
-
-		for (i = 0; i < p_hevc_slice_params->num_active_dpb_entries;
+		for (i = 0; i < p_hevc_decode_params->num_active_dpb_entries;
 		     i++) {
 			struct v4l2_hevc_dpb_entry *dpb_entry =
-				&p_hevc_slice_params->dpb[i];
+				&p_hevc_decode_params->dpb[i];
 
 			zero_padding(*dpb_entry);
 		}
 
+		break;
+
+	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
+		p_hevc_slice_params = p;
+
+		zero_padding(p_hevc_slice_params->pred_weight_table);
 		zero_padding(*p_hevc_slice_params);
 		break;
 
@@ -2821,6 +2830,9 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
 		elem_size = sizeof(struct v4l2_ctrl_hevc_slice_params);
 		break;
+	case V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS:
+		elem_size = sizeof(struct v4l2_ctrl_hevc_decode_params);
+		break;
 	case V4L2_CTRL_TYPE_AREA:
 		elem_size = sizeof(struct v4l2_area);
 		break;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.c b/drivers/staging/media/sunxi/cedrus/cedrus.c
index 7bd9291c8d5f..4cd3cab1a257 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.c
@@ -151,6 +151,12 @@ static const struct cedrus_control cedrus_controls[] = {
 		},
 		.codec		= CEDRUS_CODEC_VP8,
 	},
+	{
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
+		},
+		.codec		= CEDRUS_CODEC_H265,
+	},
 };
 
 #define CEDRUS_CONTROLS_COUNT	ARRAY_SIZE(cedrus_controls)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.h b/drivers/staging/media/sunxi/cedrus/cedrus.h
index 251a6a660351..2ca33ac38b9a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.h
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.h
@@ -76,6 +76,7 @@ struct cedrus_h265_run {
 	const struct v4l2_ctrl_hevc_sps			*sps;
 	const struct v4l2_ctrl_hevc_pps			*pps;
 	const struct v4l2_ctrl_hevc_slice_params	*slice_params;
+	const struct v4l2_ctrl_hevc_decode_params	*decode_params;
 };
 
 struct cedrus_vp8_run {
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index a9090daf626a..cd821f417a14 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -68,6 +68,8 @@ void cedrus_device_run(void *priv)
 			V4L2_CID_MPEG_VIDEO_HEVC_PPS);
 		run.h265.slice_params = cedrus_find_control_data(ctx,
 			V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS);
+		run.h265.decode_params = cedrus_find_control_data(ctx,
+			V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS);
 		break;
 
 	case V4L2_PIX_FMT_VP8_FRAME:
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
index ce497d0197df..dce5db6be13a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -245,6 +245,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 	const struct v4l2_ctrl_hevc_sps *sps;
 	const struct v4l2_ctrl_hevc_pps *pps;
 	const struct v4l2_ctrl_hevc_slice_params *slice_params;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params;
 	const struct v4l2_hevc_pred_weight_table *pred_weight_table;
 	dma_addr_t src_buf_addr;
 	dma_addr_t src_buf_end_addr;
@@ -256,6 +257,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 	sps = run->h265.sps;
 	pps = run->h265.pps;
 	slice_params = run->h265.slice_params;
+	decode_params = run->h265.decode_params;
 	pred_weight_table = &slice_params->pred_weight_table;
 
 	/* MV column buffer size and allocation. */
@@ -487,7 +489,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 
 	reg = VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(slice_params->slice_tc_offset_div2) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(slice_params->slice_beta_offset_div2) |
-	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(slice_params->num_rps_poc_st_curr_after == 0) |
+	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(decode_params->num_poc_st_curr_after == 0) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(slice_params->slice_cr_qp_offset) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(slice_params->slice_cb_qp_offset) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_QP_DELTA(slice_params->slice_qp_delta);
@@ -528,7 +530,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 
 	/* Write decoded picture buffer in pic list. */
-	cedrus_h265_frame_info_write_dpb(ctx, slice_params->dpb,
-					 slice_params->num_active_dpb_entries);
+	cedrus_h265_frame_info_write_dpb(ctx, decode_params->dpb,
+					 decode_params->num_active_dpb_entries);
 
 	/* Output frame. */
 
diff --git a/include/media/hevc-ctrls.h b/include/media/hevc-ctrls.h
index 003f819ecb26..8e0109eea454 100644
--- a/include/media/hevc-ctrls.h
+++ b/include/media/hevc-ctrls.h
@@ -19,6 +19,7 @@
 #define V4L2_CID_MPEG_VIDEO_HEVC_SPS		(V4L2_CID_CODEC_BASE + 1008)
 #define V4L2_CID_MPEG_VIDEO_HEVC_PPS		(V4L2_CID_CODEC_BASE + 1009)
 #define V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS	(V4L2_CID_CODEC_BASE + 1010)
+#define V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS	(V4L2_CID_CODEC_BASE + 1012)
 #define V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE	(V4L2_CID_CODEC_BASE + 1015)
 #define V4L2_CID_MPEG_VIDEO_HEVC_START_CODE	(V4L2_CID_CODEC_BASE + 1016)
 
@@ -26,6 +27,7 @@
 #define V4L2_CTRL_TYPE_HEVC_SPS 0x0120
 #define V4L2_CTRL_TYPE_HEVC_PPS 0x0121
 #define V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS 0x0122
+#define V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS 0x0124
 
 enum v4l2_mpeg_video_hevc_decode_mode {
 	V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_SLICE_BASED,
@@ -194,18 +196,10 @@ struct v4l2_ctrl_hevc_slice_params {
 	__u8	pic_struct;
 
 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
-	__u8	num_active_dpb_entries;
 	__u8	ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
 	__u8	ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
 
-	__u8	num_rps_poc_st_curr_before;
-	__u8	num_rps_poc_st_curr_after;
-	__u8	num_rps_poc_lt_curr;
-
-	__u8	padding;
-
-	/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
-	struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	padding[5];
 
 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
 	struct v4l2_hevc_pred_weight_table pred_weight_table;
@@ -213,4 +207,21 @@ struct v4l2_ctrl_hevc_slice_params {
 	__u64	flags;
 };
 
+#define V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC		0x1
+#define V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC		0x2
+#define V4L2_HEVC_DECODE_PARAM_FLAG_NO_OUTPUT_OF_PRIOR  0x4
+
+struct v4l2_ctrl_hevc_decode_params {
+	__s32	pic_order_cnt_val;
+	__u8	num_active_dpb_entries;
+	struct	v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	num_poc_st_curr_before;
+	__u8	num_poc_st_curr_after;
+	__u8	num_poc_lt_curr;
+	__u8	poc_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	poc_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	poc_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u64	flags;
+};
+
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 02/11] media: hevc: Add decode params control
@ 2021-03-03 11:39   ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Add decode params control and its associated structure to group
all the information that is needed to decode a reference frame as
it is described in ITU-T Rec. H.265 section "8.3.2 Decoding process
for reference picture set".

Adapt Cedrus driver to these changes.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
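Reader's note (illustrative only, not part of the patch): the new
V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS case in std_validate_compound() rejects
a control whose num_active_dpb_entries exceeds
V4L2_HEVC_DPB_ENTRIES_NUM_MAX, then clears the padding of each active DPB
entry. A minimal user-space sketch of that logic is given below. The
sketch_* names, the value 16 for the maximum, and the simplified entry
layout are assumptions made for illustration; the real definitions live
in include/media/hevc-ctrls.h.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stands in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX (assumed to be 16). */
#define SKETCH_DPB_ENTRIES_NUM_MAX 16

/* Hypothetical, simplified stand-in for struct v4l2_hevc_dpb_entry. */
struct sketch_dpb_entry {
	uint64_t timestamp;
	uint8_t rps;
	uint8_t padding[7];
};

/* Hypothetical stand-in holding only the fields the check needs. */
struct sketch_decode_params {
	int32_t pic_order_cnt_val;
	uint8_t num_active_dpb_entries;
	struct sketch_dpb_entry dpb[SKETCH_DPB_ENTRIES_NUM_MAX];
};

/*
 * Mirrors the shape of the new validation case: bound-check the entry
 * count first, then zero the padding of every active entry.
 * Returns 0 on success or -EINVAL when the count is out of range.
 */
int sketch_validate_decode_params(struct sketch_decode_params *p)
{
	unsigned int i;

	if (p->num_active_dpb_entries > SKETCH_DPB_ENTRIES_NUM_MAX)
		return -EINVAL;

	for (i = 0; i < p->num_active_dpb_entries; i++)
		memset(p->dpb[i].padding, 0, sizeof(p->dpb[i].padding));

	return 0;
}
```

With num_active_dpb_entries set to 2 the sketch returns 0 and only
touches dpb[0] and dpb[1]; with 17 it returns -EINVAL before looking at
any entry, which is why the bound check has to come before the loop.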
 .../media/v4l/ext-ctrls-codec.rst             | 94 +++++++++++++++----
 .../media/v4l/vidioc-queryctrl.rst            |  6 ++
 drivers/media/v4l2-core/v4l2-ctrls.c          | 26 +++--
 drivers/staging/media/sunxi/cedrus/cedrus.c   |  6 ++
 drivers/staging/media/sunxi/cedrus/cedrus.h   |  1 +
 .../staging/media/sunxi/cedrus/cedrus_dec.c   |  2 +
 .../staging/media/sunxi/cedrus/cedrus_h265.c  |  6 +-
 include/media/hevc-ctrls.h                    | 29 ++++--
 8 files changed, 134 insertions(+), 36 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
index d62e8e423f3b..8a6d45cb437e 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
@@ -3436,9 +3436,6 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     * - __u8
       - ``pic_struct``
       -
-    * - __u8
-      - ``num_active_dpb_entries``
-      - The number of entries in ``dpb``.
     * - __u8
       - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
       - The list of L0 reference elements as indices in the DPB.
@@ -3446,22 +3443,8 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
       - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
       - The list of L1 reference elements as indices in the DPB.
     * - __u8
-      - ``num_rps_poc_st_curr_before``
-      - The number of reference pictures in the short-term set that come before
-        the current frame.
-    * - __u8
-      - ``num_rps_poc_st_curr_after``
-      - The number of reference pictures in the short-term set that come after
-        the current frame.
-    * - __u8
-      - ``num_rps_poc_lt_curr``
-      - The number of reference pictures in the long-term set.
-    * - __u8
-      - ``padding[7]``
+      - ``padding[5]``
       - Applications and drivers must set this to zero.
-    * - struct :c:type:`v4l2_hevc_dpb_entry`
-      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
-      - The decoded picture buffer, for meta-data about reference frames.
     * - struct :c:type:`v4l2_hevc_pred_weight_table`
       - ``pred_weight_table``
       - The prediction weight coefficients for inter-picture prediction.
@@ -3660,3 +3643,78 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
     so this has to come from client.
     This is applicable to H264 and valid Range is from 0 to 63.
     Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
+
+``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
+    Specifies various decode parameters, especially the reference picture order
+    count (POC) for all the lists (short, long, before, current, after) and the
+    number of entries for each of them.
+    These parameters are defined according to :ref:`hevc`.
+    They are described in section 8.3 "Slice decoding process" of the
+    specification.
+
+.. c:type:: v4l2_ctrl_hevc_decode_params
+
+.. cssclass:: longtable
+
+.. flat-table:: struct v4l2_ctrl_hevc_decode_params
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - __s32
+      - ``pic_order_cnt_val``
+      - PicOrderCntVal as described in section 8.3.1 "Decoding process
+        for picture order count" of the specification.
+    * - __u8
+      - ``num_active_dpb_entries``
+      - The number of entries in ``dpb``.
+    * - struct :c:type:`v4l2_hevc_dpb_entry`
+      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - The decoded picture buffer, for meta-data about reference frames.
+    * - __u8
+      - ``num_poc_st_curr_before``
+      - The number of reference pictures in the short-term set that come before
+        the current frame.
+    * - __u8
+      - ``num_poc_st_curr_after``
+      - The number of reference pictures in the short-term set that come after
+        the current frame.
+    * - __u8
+      - ``num_poc_lt_curr``
+      - The number of reference pictures in the long-term set.
+    * - __u8
+      - ``poc_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocStCurrBefore as described in section 8.3.2 "Decoding process for reference
+        picture set" of the specification.
+    * - __u8
+      - ``poc_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocStCurrAfter as described in section 8.3.2 "Decoding process for reference
+        picture set" of the specification.
+    * - __u8
+      - ``poc_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
+      - PocLtCurr as described in section 8.3.2 "Decoding process for reference
+        picture set" of the specification.
+    * - __u64
+      - ``flags``
+      - See :ref:`Decode Parameters Flags <hevc_decode_params_flags>`
+
+.. _hevc_decode_params_flags:
+
+``Decode Parameters Flags``
+
+.. cssclass:: longtable
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC``
+      - 0x00000001
+      -
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC``
+      - 0x00000002
+      -
+    * - ``V4L2_HEVC_DECODE_PARAM_FLAG_NO_OUTPUT_OF_PRIOR``
+      - 0x00000004
+      -
diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
index 82f61f1e2fb8..d84ae255bc79 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
@@ -486,6 +486,12 @@ See also the examples in :ref:`control`.
       - n/a
       - A struct :c:type:`v4l2_ctrl_hevc_slice_params`, containing HEVC
 	slice parameters for stateless video decoders.
+    * - ``V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS``
+      - n/a
+      - n/a
+      - n/a
+      - A struct :c:type:`v4l2_ctrl_hevc_decode_params`, containing HEVC
+	decoding parameters for stateless video decoders.
 
 .. tabularcolumns:: |p{6.6cm}|p{2.2cm}|p{8.7cm}|
 
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index 016cf6204cbb..4060b5bcc3c0 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -1028,6 +1028,7 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_MPEG_VIDEO_HEVC_SPS:			return "HEVC Sequence Parameter Set";
 	case V4L2_CID_MPEG_VIDEO_HEVC_PPS:			return "HEVC Picture Parameter Set";
 	case V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS:		return "HEVC Slice Parameters";
+	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS:		return "HEVC Decode Parameters";
 	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE:		return "HEVC Decode Mode";
 	case V4L2_CID_MPEG_VIDEO_HEVC_START_CODE:		return "HEVC Start Code";
 
@@ -1482,6 +1483,9 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS:
 		*type = V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS;
 		break;
+	case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS:
+		*type = V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS;
+		break;
 	case V4L2_CID_UNIT_CELL_SIZE:
 		*type = V4L2_CTRL_TYPE_AREA;
 		*flags |= V4L2_CTRL_FLAG_READ_ONLY;
@@ -1833,6 +1837,7 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
 	struct v4l2_ctrl_hevc_sps *p_hevc_sps;
 	struct v4l2_ctrl_hevc_pps *p_hevc_pps;
 	struct v4l2_ctrl_hevc_slice_params *p_hevc_slice_params;
+	struct v4l2_ctrl_hevc_decode_params *p_hevc_decode_params;
 	struct v4l2_area *area;
 	void *p = ptr.p + idx * ctrl->elem_size;
 	unsigned int i;
@@ -2108,23 +2113,27 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
 		zero_padding(*p_hevc_pps);
 		break;
 
-	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
-		p_hevc_slice_params = p;
+	case V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS:
+		p_hevc_decode_params = p;
 
-		if (p_hevc_slice_params->num_active_dpb_entries >
+		if (p_hevc_decode_params->num_active_dpb_entries >
 		    V4L2_HEVC_DPB_ENTRIES_NUM_MAX)
 			return -EINVAL;
 
-		zero_padding(p_hevc_slice_params->pred_weight_table);
-
-		for (i = 0; i < p_hevc_slice_params->num_active_dpb_entries;
+		for (i = 0; i < p_hevc_decode_params->num_active_dpb_entries;
 		     i++) {
 			struct v4l2_hevc_dpb_entry *dpb_entry =
-				&p_hevc_slice_params->dpb[i];
+				&p_hevc_decode_params->dpb[i];
 
 			zero_padding(*dpb_entry);
 		}
 
+		break;
+
+	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
+		p_hevc_slice_params = p;
+
+		zero_padding(p_hevc_slice_params->pred_weight_table);
 		zero_padding(*p_hevc_slice_params);
 		break;
 
@@ -2821,6 +2830,9 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl,
 	case V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS:
 		elem_size = sizeof(struct v4l2_ctrl_hevc_slice_params);
 		break;
+	case V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS:
+		elem_size = sizeof(struct v4l2_ctrl_hevc_decode_params);
+		break;
 	case V4L2_CTRL_TYPE_AREA:
 		elem_size = sizeof(struct v4l2_area);
 		break;
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.c b/drivers/staging/media/sunxi/cedrus/cedrus.c
index 7bd9291c8d5f..4cd3cab1a257 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.c
@@ -151,6 +151,12 @@ static const struct cedrus_control cedrus_controls[] = {
 		},
 		.codec		= CEDRUS_CODEC_VP8,
 	},
+	{
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
+		},
+		.codec		= CEDRUS_CODEC_H265,
+	},
 };
 
 #define CEDRUS_CONTROLS_COUNT	ARRAY_SIZE(cedrus_controls)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus.h b/drivers/staging/media/sunxi/cedrus/cedrus.h
index 251a6a660351..2ca33ac38b9a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus.h
+++ b/drivers/staging/media/sunxi/cedrus/cedrus.h
@@ -76,6 +76,7 @@ struct cedrus_h265_run {
 	const struct v4l2_ctrl_hevc_sps			*sps;
 	const struct v4l2_ctrl_hevc_pps			*pps;
 	const struct v4l2_ctrl_hevc_slice_params	*slice_params;
+	const struct v4l2_ctrl_hevc_decode_params	*decode_params;
 };
 
 struct cedrus_vp8_run {
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
index a9090daf626a..cd821f417a14 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_dec.c
@@ -68,6 +68,8 @@ void cedrus_device_run(void *priv)
 			V4L2_CID_MPEG_VIDEO_HEVC_PPS);
 		run.h265.slice_params = cedrus_find_control_data(ctx,
 			V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS);
+		run.h265.decode_params = cedrus_find_control_data(ctx,
+			V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS);
 		break;
 
 	case V4L2_PIX_FMT_VP8_FRAME:
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
index ce497d0197df..dce5db6be13a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -245,6 +245,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 	const struct v4l2_ctrl_hevc_sps *sps;
 	const struct v4l2_ctrl_hevc_pps *pps;
 	const struct v4l2_ctrl_hevc_slice_params *slice_params;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params;
 	const struct v4l2_hevc_pred_weight_table *pred_weight_table;
 	dma_addr_t src_buf_addr;
 	dma_addr_t src_buf_end_addr;
@@ -256,6 +257,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 	sps = run->h265.sps;
 	pps = run->h265.pps;
 	slice_params = run->h265.slice_params;
+	decode_params = run->h265.decode_params;
 	pred_weight_table = &slice_params->pred_weight_table;
 
 	/* MV column buffer size and allocation. */
@@ -487,7 +489,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 
 	reg = VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(slice_params->slice_tc_offset_div2) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(slice_params->slice_beta_offset_div2) |
-	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(slice_params->num_rps_poc_st_curr_after == 0) |
+	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(decode_params->num_poc_st_curr_after == 0) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(slice_params->slice_cr_qp_offset) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(slice_params->slice_cb_qp_offset) |
 	      VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_QP_DELTA(slice_params->slice_qp_delta);
@@ -528,7 +530,7 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
 
 	/* Write decoded picture buffer in pic list. */
-	cedrus_h265_frame_info_write_dpb(ctx, slice_params->dpb,
-					 slice_params->num_active_dpb_entries);
+	cedrus_h265_frame_info_write_dpb(ctx, decode_params->dpb,
+					 decode_params->num_active_dpb_entries);
 
 	/* Output frame. */
 
diff --git a/include/media/hevc-ctrls.h b/include/media/hevc-ctrls.h
index 003f819ecb26..8e0109eea454 100644
--- a/include/media/hevc-ctrls.h
+++ b/include/media/hevc-ctrls.h
@@ -19,6 +19,7 @@
 #define V4L2_CID_MPEG_VIDEO_HEVC_SPS		(V4L2_CID_CODEC_BASE + 1008)
 #define V4L2_CID_MPEG_VIDEO_HEVC_PPS		(V4L2_CID_CODEC_BASE + 1009)
 #define V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS	(V4L2_CID_CODEC_BASE + 1010)
+#define V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS	(V4L2_CID_CODEC_BASE + 1012)
 #define V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE	(V4L2_CID_CODEC_BASE + 1015)
 #define V4L2_CID_MPEG_VIDEO_HEVC_START_CODE	(V4L2_CID_CODEC_BASE + 1016)
 
@@ -26,6 +27,7 @@
 #define V4L2_CTRL_TYPE_HEVC_SPS 0x0120
 #define V4L2_CTRL_TYPE_HEVC_PPS 0x0121
 #define V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS 0x0122
+#define V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS 0x0124
 
 enum v4l2_mpeg_video_hevc_decode_mode {
 	V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_SLICE_BASED,
@@ -194,18 +196,10 @@ struct v4l2_ctrl_hevc_slice_params {
 	__u8	pic_struct;
 
 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
-	__u8	num_active_dpb_entries;
 	__u8	ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
 	__u8	ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
 
-	__u8	num_rps_poc_st_curr_before;
-	__u8	num_rps_poc_st_curr_after;
-	__u8	num_rps_poc_lt_curr;
-
-	__u8	padding;
-
-	/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
-	struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	padding[5];
 
 	/* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
 	struct v4l2_hevc_pred_weight_table pred_weight_table;
@@ -213,4 +207,21 @@ struct v4l2_ctrl_hevc_slice_params {
 	__u64	flags;
 };
 
+#define V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC		0x1
+#define V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC		0x2
+#define V4L2_HEVC_DECODE_PARAM_FLAG_NO_OUTPUT_OF_PRIOR  0x4
+
+struct v4l2_ctrl_hevc_decode_params {
+	__s32	pic_order_cnt_val;
+	__u8	num_active_dpb_entries;
+	struct	v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	num_poc_st_curr_before;
+	__u8	num_poc_st_curr_after;
+	__u8	num_poc_lt_curr;
+	__u8	poc_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	poc_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u8	poc_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
+	__u64	flags;
+};
+
 #endif
-- 
2.25.1
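
For readers following the std_validate_compound() hunk above, here is a
minimal userspace sketch of the same check: reject an out-of-range
num_active_dpb_entries, then clear the padding of each active DPB entry.
The struct layouts, the DPB_ENTRIES_MAX value, and the function name are
trimmed-down stand-ins chosen for illustration, not the uapi definitions.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX. */
#define DPB_ENTRIES_MAX 16

/* Hypothetical, trimmed-down mirror of struct v4l2_hevc_dpb_entry. */
struct dpb_entry {
	uint64_t timestamp;
	uint8_t rps;
	uint8_t padding[7];
};

/* Hypothetical, trimmed-down mirror of struct v4l2_ctrl_hevc_decode_params. */
struct decode_params {
	int32_t pic_order_cnt_val;
	uint8_t num_active_dpb_entries;
	struct dpb_entry dpb[DPB_ENTRIES_MAX];
};

/*
 * Reject a count larger than the array, then zero the padding of the
 * entries that are in use. Entries past the count are left untouched.
 */
static int validate_decode_params(struct decode_params *p)
{
	unsigned int i;

	if (p->num_active_dpb_entries > DPB_ENTRIES_MAX)
		return -EINVAL;

	for (i = 0; i < p->num_active_dpb_entries; i++)
		memset(p->dpb[i].padding, 0, sizeof(p->dpb[i].padding));

	return 0;
}
```

Doing the bounds check before the loop is what keeps the loop from
walking past the end of dpb[] when userspace passes a bogus count.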


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Change the hantro_codec_ops run() prototype from 'void' to 'int'.
This allows the job to be cancelled if an error occurs while
configuring the hardware.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro_drv.c     |  4 +++-
 .../staging/media/hantro/hantro_g1_h264_dec.c |  6 ++++--
 .../media/hantro/hantro_g1_mpeg2_dec.c        |  4 +++-
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |  6 ++++--
 .../staging/media/hantro/hantro_h1_jpeg_enc.c |  4 +++-
 drivers/staging/media/hantro/hantro_hw.h      | 19 ++++++++++---------
 .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  4 +++-
 .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  4 +++-
 .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |  6 ++++--
 9 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e5f200e64993..ac1429f00b33 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -161,7 +161,9 @@ static void device_run(void *priv)
 
 	v4l2_m2m_buf_copy_metadata(src, dst, true);
 
-	ctx->codec_ops->run(ctx);
+	if (ctx->codec_ops->run(ctx))
+		goto err_cancel_job;
+
 	return;
 
 err_cancel_job:
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 845bef73d218..fcd4db13c9fe 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -273,13 +273,13 @@ static void set_buffers(struct hantro_ctx *ctx)
 	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
 }
 
-void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 
 	/* Prepare the H264 decoder context. */
 	if (hantro_h264_dec_prepare_run(ctx))
-		return;
+		return -EINVAL;
 
 	/* Configure hardware registers. */
 	set_params(ctx);
@@ -301,4 +301,6 @@ void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 			   G1_REG_CONFIG_DEC_CLK_GATE_E,
 			   G1_REG_CONFIG);
 	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
index 6386a3989bfe..5e8943d31dc5 100644
--- a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
@@ -155,7 +155,7 @@ hantro_g1_mpeg2_dec_set_buffers(struct hantro_dev *vpu, struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, backward_addr, G1_REG_REFER3_BASE);
 }
 
-void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -248,4 +248,6 @@ void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 
 	reg = G1_REG_DEC_E(1);
 	vdpu_write(vpu, reg, G1_SWREG(1));
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
index a5cdf150cd16..d665df026546 100644
--- a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
@@ -426,7 +426,7 @@ static void cfg_buffers(struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
 }
 
-void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 {
 	const struct v4l2_ctrl_vp8_frame_header *hdr;
 	struct hantro_dev *vpu = ctx->dev;
@@ -439,7 +439,7 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 
 	hdr = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER);
 	if (WARN_ON(!hdr))
-		return;
+		return -EINVAL;
 
 	/* Reset segment_map buffer in keyframe */
 	if (VP8_FRAME_IS_KEY_FRAME(hdr) && ctx->vp8_dec.segment_map.cpu)
@@ -499,4 +499,6 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
index b88dc4ed06db..56cf261a8e95 100644
--- a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
+++ b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
@@ -88,7 +88,7 @@ hantro_h1_jpeg_enc_set_qtable(struct hantro_dev *vpu,
 	}
 }
 
-void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
+int hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -136,6 +136,8 @@ void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	vepu_write(vpu, reg, H1_REG_ENC_CTRL);
+
+	return 0;
 }
 
 void hantro_jpeg_enc_done(struct hantro_ctx *ctx)
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 34c9e4649a25..4e2e7a5ed283 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -126,14 +126,15 @@ struct hantro_postproc_ctx {
  *		Optional and called from process context.
  * @run:	Start single {en,de)coding job. Called from atomic context
  *		to indicate that a pair of buffers is ready and the hardware
- *		should be programmed and started.
+ *		should be programmed and started. Returns zero if OK, a
+ *		negative value in error cases.
  * @done:	Read back processing results and additional data from hardware.
  * @reset:	Reset the hardware in case of a timeout.
  */
 struct hantro_codec_ops {
 	int (*init)(struct hantro_ctx *ctx);
 	void (*exit)(struct hantro_ctx *ctx);
-	void (*run)(struct hantro_ctx *ctx);
+	int (*run)(struct hantro_ctx *ctx);
 	void (*done)(struct hantro_ctx *ctx);
 	void (*reset)(struct hantro_ctx *ctx);
 };
@@ -164,8 +165,8 @@ void hantro_irq_done(struct hantro_dev *vpu,
 void hantro_start_prepare_run(struct hantro_ctx *ctx);
 void hantro_end_prepare_run(struct hantro_ctx *ctx);
 
-void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx);
-void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
+int hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx);
+int rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
 int hantro_jpeg_enc_init(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_done(struct hantro_ctx *ctx);
@@ -173,7 +174,7 @@ void hantro_jpeg_enc_done(struct hantro_ctx *ctx);
 dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 				   unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
-void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
@@ -204,15 +205,15 @@ hantro_h264_mv_size(unsigned int width, unsigned int height)
 	return 64 * MB_WIDTH(width) * MB_WIDTH(height) + 32;
 }
 
-void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
-void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
+int rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
 	const struct v4l2_ctrl_mpeg2_quantization *ctrl);
 int hantro_mpeg2_dec_init(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_exit(struct hantro_ctx *ctx);
 
-void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx);
-void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx);
+int rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx);
 int hantro_vp8_dec_init(struct hantro_ctx *ctx);
 void hantro_vp8_dec_exit(struct hantro_ctx *ctx);
 void hantro_vp8_prob_update(struct hantro_ctx *ctx,
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
index 3498e6124acd..3a27ebef4f38 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
@@ -118,7 +118,7 @@ rk3399_vpu_jpeg_enc_set_qtable(struct hantro_dev *vpu,
 	}
 }
 
-void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
+int rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -168,4 +168,6 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 	/* Kick the watchdog and start encoding */
 	hantro_end_prepare_run(ctx);
 	vepu_write(vpu, reg, VEPU_REG_ENCODE_START);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
index f610fa5b4335..4bd3080abbc1 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
@@ -157,7 +157,7 @@ rk3399_vpu_mpeg2_dec_set_buffers(struct hantro_dev *vpu,
 	vdpu_write_relaxed(vpu, backward_addr, VDPU_REG_REFER3_BASE);
 }
 
-void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
+int rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -254,4 +254,6 @@ void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 
 	reg = vdpu_read(vpu, VDPU_SWREG(57)) | VDPU_REG_DEC_E(1);
 	vdpu_write(vpu, reg, VDPU_SWREG(57));
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
index a4a792f00b11..755571e16fcd 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
@@ -504,7 +504,7 @@ static void cfg_buffers(struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, dst_dma, VDPU_REG_ADDR_DST);
 }
 
-void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
+int rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 {
 	const struct v4l2_ctrl_vp8_frame_header *hdr;
 	struct hantro_dev *vpu = ctx->dev;
@@ -517,7 +517,7 @@ void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 
 	hdr = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER);
 	if (WARN_ON(!hdr))
-		return;
+		return -EINVAL;
 
 	/* Reset segment_map buffer in keyframe */
 	if (VP8_FRAME_IS_KEY_FRAME(hdr) && ctx->vp8_dec.segment_map.cpu)
@@ -590,4 +590,6 @@ void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	hantro_reg_write(vpu, &vp8_dec_start_dec, 1);
+
+	return 0;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread


* [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors
@ 2021-03-03 11:39   ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Change the hantro_codec_ops run() prototype from 'void' to 'int'.
This allows the job to be cancelled if an error occurs while
configuring the hardware.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro_drv.c     |  4 +++-
 .../staging/media/hantro/hantro_g1_h264_dec.c |  6 ++++--
 .../media/hantro/hantro_g1_mpeg2_dec.c        |  4 +++-
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |  6 ++++--
 .../staging/media/hantro/hantro_h1_jpeg_enc.c |  4 +++-
 drivers/staging/media/hantro/hantro_hw.h      | 19 ++++++++++---------
 .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  4 +++-
 .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  4 +++-
 .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |  6 ++++--
 9 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e5f200e64993..ac1429f00b33 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -161,7 +161,9 @@ static void device_run(void *priv)
 
 	v4l2_m2m_buf_copy_metadata(src, dst, true);
 
-	ctx->codec_ops->run(ctx);
+	if (ctx->codec_ops->run(ctx))
+		goto err_cancel_job;
+
 	return;
 
 err_cancel_job:
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 845bef73d218..fcd4db13c9fe 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -273,13 +273,13 @@ static void set_buffers(struct hantro_ctx *ctx)
 	vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
 }
 
-void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 
 	/* Prepare the H264 decoder context. */
 	if (hantro_h264_dec_prepare_run(ctx))
-		return;
+		return -EINVAL;
 
 	/* Configure hardware registers. */
 	set_params(ctx);
@@ -301,4 +301,6 @@ void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
 			   G1_REG_CONFIG_DEC_CLK_GATE_E,
 			   G1_REG_CONFIG);
 	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
index 6386a3989bfe..5e8943d31dc5 100644
--- a/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_mpeg2_dec.c
@@ -155,7 +155,7 @@ hantro_g1_mpeg2_dec_set_buffers(struct hantro_dev *vpu, struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, backward_addr, G1_REG_REFER3_BASE);
 }
 
-void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -248,4 +248,6 @@ void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx)
 
 	reg = G1_REG_DEC_E(1);
 	vdpu_write(vpu, reg, G1_SWREG(1));
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
index a5cdf150cd16..d665df026546 100644
--- a/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_vp8_dec.c
@@ -426,7 +426,7 @@ static void cfg_buffers(struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
 }
 
-void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
+int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 {
 	const struct v4l2_ctrl_vp8_frame_header *hdr;
 	struct hantro_dev *vpu = ctx->dev;
@@ -439,7 +439,7 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 
 	hdr = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER);
 	if (WARN_ON(!hdr))
-		return;
+		return -EINVAL;
 
 	/* Reset segment_map buffer in keyframe */
 	if (VP8_FRAME_IS_KEY_FRAME(hdr) && ctx->vp8_dec.segment_map.cpu)
@@ -499,4 +499,6 @@ void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
index b88dc4ed06db..56cf261a8e95 100644
--- a/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
+++ b/drivers/staging/media/hantro/hantro_h1_jpeg_enc.c
@@ -88,7 +88,7 @@ hantro_h1_jpeg_enc_set_qtable(struct hantro_dev *vpu,
 	}
 }
 
-void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
+int hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -136,6 +136,8 @@ void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	vepu_write(vpu, reg, H1_REG_ENC_CTRL);
+
+	return 0;
 }
 
 void hantro_jpeg_enc_done(struct hantro_ctx *ctx)
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 34c9e4649a25..4e2e7a5ed283 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -126,14 +126,15 @@ struct hantro_postproc_ctx {
  *		Optional and called from process context.
  * @run:	Start single {en,de)coding job. Called from atomic context
  *		to indicate that a pair of buffers is ready and the hardware
- *		should be programmed and started.
+ *		should be programmed and started. Returns zero if OK, a
+ *		negative value in error cases.
  * @done:	Read back processing results and additional data from hardware.
  * @reset:	Reset the hardware in case of a timeout.
  */
 struct hantro_codec_ops {
 	int (*init)(struct hantro_ctx *ctx);
 	void (*exit)(struct hantro_ctx *ctx);
-	void (*run)(struct hantro_ctx *ctx);
+	int (*run)(struct hantro_ctx *ctx);
 	void (*done)(struct hantro_ctx *ctx);
 	void (*reset)(struct hantro_ctx *ctx);
 };
@@ -164,8 +165,8 @@ void hantro_irq_done(struct hantro_dev *vpu,
 void hantro_start_prepare_run(struct hantro_ctx *ctx);
 void hantro_end_prepare_run(struct hantro_ctx *ctx);
 
-void hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx);
-void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
+int hantro_h1_jpeg_enc_run(struct hantro_ctx *ctx);
+int rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx);
 int hantro_jpeg_enc_init(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_exit(struct hantro_ctx *ctx);
 void hantro_jpeg_enc_done(struct hantro_ctx *ctx);
@@ -173,7 +174,7 @@ void hantro_jpeg_enc_done(struct hantro_ctx *ctx);
 dma_addr_t hantro_h264_get_ref_buf(struct hantro_ctx *ctx,
 				   unsigned int dpb_idx);
 int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx);
-void hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
@@ -204,15 +205,15 @@ hantro_h264_mv_size(unsigned int width, unsigned int height)
 	return 64 * MB_WIDTH(width) * MB_WIDTH(height) + 32;
 }
 
-void hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
-void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
+int rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_copy_qtable(u8 *qtable,
 	const struct v4l2_ctrl_mpeg2_quantization *ctrl);
 int hantro_mpeg2_dec_init(struct hantro_ctx *ctx);
 void hantro_mpeg2_dec_exit(struct hantro_ctx *ctx);
 
-void hantro_g1_vp8_dec_run(struct hantro_ctx *ctx);
-void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx);
+int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx);
+int rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx);
 int hantro_vp8_dec_init(struct hantro_ctx *ctx);
 void hantro_vp8_dec_exit(struct hantro_ctx *ctx);
 void hantro_vp8_prob_update(struct hantro_ctx *ctx,
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
index 3498e6124acd..3a27ebef4f38 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_jpeg_enc.c
@@ -118,7 +118,7 @@ rk3399_vpu_jpeg_enc_set_qtable(struct hantro_dev *vpu,
 	}
 }
 
-void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
+int rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -168,4 +168,6 @@ void rk3399_vpu_jpeg_enc_run(struct hantro_ctx *ctx)
 	/* Kick the watchdog and start encoding */
 	hantro_end_prepare_run(ctx);
 	vepu_write(vpu, reg, VEPU_REG_ENCODE_START);
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
index f610fa5b4335..4bd3080abbc1 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_mpeg2_dec.c
@@ -157,7 +157,7 @@ rk3399_vpu_mpeg2_dec_set_buffers(struct hantro_dev *vpu,
 	vdpu_write_relaxed(vpu, backward_addr, VDPU_REG_REFER3_BASE);
 }
 
-void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
+int rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *src_buf, *dst_buf;
@@ -254,4 +254,6 @@ void rk3399_vpu_mpeg2_dec_run(struct hantro_ctx *ctx)
 
 	reg = vdpu_read(vpu, VDPU_SWREG(57)) | VDPU_REG_DEC_E(1);
 	vdpu_write(vpu, reg, VDPU_SWREG(57));
+
+	return 0;
 }
diff --git a/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c b/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
index a4a792f00b11..755571e16fcd 100644
--- a/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
+++ b/drivers/staging/media/hantro/rk3399_vpu_hw_vp8_dec.c
@@ -504,7 +504,7 @@ static void cfg_buffers(struct hantro_ctx *ctx,
 	vdpu_write_relaxed(vpu, dst_dma, VDPU_REG_ADDR_DST);
 }
 
-void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
+int rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 {
 	const struct v4l2_ctrl_vp8_frame_header *hdr;
 	struct hantro_dev *vpu = ctx->dev;
@@ -517,7 +517,7 @@ void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 
 	hdr = hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER);
 	if (WARN_ON(!hdr))
-		return;
+		return -EINVAL;
 
 	/* Reset segment_map buffer in keyframe */
 	if (VP8_FRAME_IS_KEY_FRAME(hdr) && ctx->vp8_dec.segment_map.cpu)
@@ -590,4 +590,6 @@ void rk3399_vpu_vp8_dec_run(struct hantro_ctx *ctx)
 	hantro_end_prepare_run(ctx);
 
 	hantro_reg_write(vpu, &vp8_dec_start_dec, 1);
+
+	return 0;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread
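
The calling convention introduced by patch 03/11 can be sketched outside the
kernel as follows. This is only a minimal user-space illustration under stated
assumptions, not the driver code: mock_ctx, mock_dec_run(), mock_device_run()
and the have_hdr/hw_started/cancelled fields are hypothetical stand-ins for
struct hantro_ctx, a per-codec run() hook, device_run() and its
err_cancel_job path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct hantro_ctx. */
struct mock_ctx {
	int (*run)(struct mock_ctx *ctx);	/* plays the role of codec_ops->run */
	bool have_hdr;				/* false simulates a missing control */
	bool hw_started;
	bool cancelled;
};

/* A run() hook that reports failure instead of returning silently. */
static int mock_dec_run(struct mock_ctx *ctx)
{
	if (!ctx->have_hdr)
		return -EINVAL;

	ctx->hw_started = true;
	return 0;
}

/* Caller side: any non-zero return value takes the cancel path. */
static void mock_device_run(struct mock_ctx *ctx)
{
	if (ctx->run(ctx))
		goto err_cancel_job;

	return;

err_cancel_job:
	ctx->cancelled = true;
}

/* Exercise both outcomes. */
void mock_run_demo(void)
{
	struct mock_ctx ok = { .run = mock_dec_run, .have_hdr = true };
	struct mock_ctx bad = { .run = mock_dec_run, .have_hdr = false };

	mock_device_run(&ok);
	mock_device_run(&bad);

	assert(ok.hw_started && !ok.cancelled);
	assert(!bad.hw_started && bad.cancelled);
}
```

With the previous 'void' prototype the second case could not be told apart
from the first by the caller, so the job would have been left pending even
though the hardware was never started.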

* [PATCH v4 04/11] media: hantro: Define HEVC codec profiles and supported features
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Define which HEVC profiles and levels (up to level 5.1) are supported
by the driver, and reject unsupported features (scaling lists, 10-bit
content).

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro.h     |  3 ++
 drivers/staging/media/hantro/hantro_drv.c | 58 +++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 65f9f7ea7dcf..a76a0d79db9f 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -34,6 +34,7 @@ struct hantro_codec_ops;
 #define HANTRO_MPEG2_DECODER	BIT(16)
 #define HANTRO_VP8_DECODER	BIT(17)
 #define HANTRO_H264_DECODER	BIT(18)
+#define HANTRO_HEVC_DECODER	BIT(19)
 #define HANTRO_DECODERS		0xffff0000
 
 /**
@@ -99,6 +100,7 @@ struct hantro_variant {
  * @HANTRO_MODE_H264_DEC: H264 decoder.
  * @HANTRO_MODE_MPEG2_DEC: MPEG-2 decoder.
  * @HANTRO_MODE_VP8_DEC: VP8 decoder.
+ * @HANTRO_MODE_HEVC_DEC: HEVC decoder.
  */
 enum hantro_codec_mode {
 	HANTRO_MODE_NONE = -1,
@@ -106,6 +108,7 @@ enum hantro_codec_mode {
 	HANTRO_MODE_H264_DEC,
 	HANTRO_MODE_MPEG2_DEC,
 	HANTRO_MODE_VP8_DEC,
+	HANTRO_MODE_HEVC_DEC,
 };
 
 /*
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index ac1429f00b33..f0b68e16fcc0 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -245,6 +245,18 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
 		if (sps->bit_depth_luma_minus8 != 0)
 			/* Only 8-bit is supported */
 			return -EINVAL;
+	} else if (ctrl->id == V4L2_CID_MPEG_VIDEO_HEVC_SPS) {
+		const struct v4l2_ctrl_hevc_sps *sps = ctrl->p_new.p_hevc_sps;
+
+		if (sps->bit_depth_luma_minus8 != sps->bit_depth_chroma_minus8)
+			/* Luma and chroma bit depth mismatch */
+			return -EINVAL;
+		if (sps->bit_depth_luma_minus8 != 0)
+			/* Only 8-bit is supported */
+			return -EINVAL;
+		if (sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED)
+			/* No scaling support */
+			return -EINVAL;
 	}
 	return 0;
 }
@@ -351,6 +363,52 @@ static const struct hantro_ctrl controls[] = {
 			.def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
 		}
 	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE,
+			.min = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
+			.max = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
+			.def = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_START_CODE,
+			.min = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
+			.max = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
+			.def = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_PROFILE,
+			.min = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
+			.max = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN_10,
+			.def = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_LEVEL,
+			.min = V4L2_MPEG_VIDEO_HEVC_LEVEL_1,
+			.max = V4L2_MPEG_VIDEO_HEVC_LEVEL_5_1,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_SPS,
+			.ops = &hantro_ctrl_ops,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_PPS,
+		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
+		},
 	},
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread
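
The SPS checks added to hantro_try_ctrl() in patch 04/11 can be tried in
isolation with the sketch below. It is a user-space approximation only:
struct mock_hevc_sps, MOCK_SPS_FLAG_SCALING_LIST_ENABLED and
mock_try_hevc_sps() are hypothetical names, and the flag value is an
assumption; the real structure and flag come from the V4L2 HEVC uAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag value; the driver tests V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED. */
#define MOCK_SPS_FLAG_SCALING_LIST_ENABLED	(1ULL << 0)

/* Hypothetical subset of struct v4l2_ctrl_hevc_sps. */
struct mock_hevc_sps {
	uint8_t bit_depth_luma_minus8;
	uint8_t bit_depth_chroma_minus8;
	uint64_t flags;
};

/* Applies the same three rejections, in the same order, as the patch. */
int mock_try_hevc_sps(const struct mock_hevc_sps *sps)
{
	if (sps->bit_depth_luma_minus8 != sps->bit_depth_chroma_minus8)
		return -EINVAL;	/* luma and chroma bit depth mismatch */
	if (sps->bit_depth_luma_minus8 != 0)
		return -EINVAL;	/* only 8-bit is supported */
	if (sps->flags & MOCK_SPS_FLAG_SCALING_LIST_ENABLED)
		return -EINVAL;	/* no scaling list support */
	return 0;
}

/* Exercise one accepted and three rejected parameter sets. */
void mock_sps_demo(void)
{
	struct mock_hevc_sps main8 = { 0, 0, 0 };
	struct mock_hevc_sps main10 = { 2, 2, 0 };
	struct mock_hevc_sps mixed = { 0, 2, 0 };
	struct mock_hevc_sps scaled = { 0, 0, MOCK_SPS_FLAG_SCALING_LIST_ENABLED };

	assert(mock_try_hevc_sps(&main8) == 0);
	assert(mock_try_hevc_sps(&main10) == -EINVAL);
	assert(mock_try_hevc_sps(&mixed) == -EINVAL);
	assert(mock_try_hevc_sps(&scaled) == -EINVAL);
}
```

Note that, as posted, the profile control range extends to
V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN_10 while the SPS check refuses any bit depth
other than 8, so a 10-bit stream would be rejected when its SPS is set.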

* [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Decoder hardware blocks can exist in multiple versions: add
a field to distinguish them at runtime.
The G2 hardware block doesn't have a postprocessor, so the
hantro_needs_postproc() function should always return false for
this hardware.
Since hantro_needs_postproc() is becoming too complex to stay
inline in the .h file, move it to the .c file.

Keep G1 hardware as the default behavior.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro.h          | 13 +++++++------
 drivers/staging/media/hantro/hantro_drv.c      |  2 ++
 drivers/staging/media/hantro/hantro_postproc.c | 17 +++++++++++++++++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index a76a0d79db9f..05876e426419 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -37,6 +37,9 @@ struct hantro_codec_ops;
 #define HANTRO_HEVC_DECODER	BIT(19)
 #define HANTRO_DECODERS		0xffff0000
 
+#define HANTRO_G1_REV		0x6731
+#define HANTRO_G2_REV		0x6732
+
 /**
  * struct hantro_irq - irq handler and name
  *
@@ -171,6 +174,7 @@ hantro_vdev_to_func(struct video_device *vdev)
  * @enc_base:		Mapped address of VPU encoder register for convenience.
  * @dec_base:		Mapped address of VPU decoder register for convenience.
  * @ctrl_base:		Mapped address of VPU control block.
+ * @core_hw_dec_rev:	Runtime detected HW decoder core revision
  * @vpu_mutex:		Mutex to synchronize V4L2 calls.
  * @irqlock:		Spinlock to synchronize access to data structures
  *			shared with interrupt handlers.
@@ -190,6 +194,7 @@ struct hantro_dev {
 	void __iomem *enc_base;
 	void __iomem *dec_base;
 	void __iomem *ctrl_base;
+	u32 core_hw_dec_rev;
 
 	struct mutex vpu_mutex;	/* video_device lock */
 	spinlock_t irqlock;
@@ -412,12 +417,8 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
 	return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
 }
 
-static inline bool
-hantro_needs_postproc(const struct hantro_ctx *ctx,
-		      const struct hantro_fmt *fmt)
-{
-	return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
-}
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+			   const struct hantro_fmt *fmt);
 
 static inline dma_addr_t
 hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index f0b68e16fcc0..e3e6df28f470 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -836,6 +836,8 @@ static int hantro_probe(struct platform_device *pdev)
 	}
 	vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
 	vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
+	/* by default decoder is G1 */
+	vpu->core_hw_dec_rev = HANTRO_G1_REV;
 
 	ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
 	if (ret) {
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 6d2a8f2a8f0b..050880f720d6 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -50,6 +50,23 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
 	.display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
 };
 
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+			   const struct hantro_fmt *fmt)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	if (ctx->is_encoder)
+		return false;
+
+	if (vpu->core_hw_dec_rev == HANTRO_G1_REV)
+		return fmt->fourcc != V4L2_PIX_FMT_NV12;
+
+	if (vpu->core_hw_dec_rev == HANTRO_G2_REV)
+		return false;
+
+	return false;
+}
+
 void hantro_postproc_enable(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread
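
The decision made by the out-of-line hantro_needs_postproc() in patch 05/11
can be summarised with the user-space sketch below. The revision values are
the ones defined by the patch; everything else (struct mock_pp_ctx,
enum mock_fourcc, mock_needs_postproc()) is a hypothetical stand-in, and the
two explicit "return false" branches of the patch are folded into one here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as HANTRO_G1_REV and HANTRO_G2_REV in the patch. */
#define MOCK_G1_REV	0x6731
#define MOCK_G2_REV	0x6732

/* Hypothetical placeholders; the driver compares fourcc with V4L2_PIX_FMT_NV12. */
enum mock_fourcc { MOCK_FMT_NV12, MOCK_FMT_YUYV };

/* Hypothetical subset of struct hantro_ctx plus hantro_dev.core_hw_dec_rev. */
struct mock_pp_ctx {
	bool is_encoder;
	uint32_t core_hw_dec_rev;
};

bool mock_needs_postproc(const struct mock_pp_ctx *ctx, enum mock_fourcc fourcc)
{
	/* Encoders never use the postprocessor. */
	if (ctx->is_encoder)
		return false;

	/* G1: postprocess whenever the destination format is not NV12. */
	if (ctx->core_hw_dec_rev == MOCK_G1_REV)
		return fourcc != MOCK_FMT_NV12;

	/* G2 has no postprocessor; unknown revisions are treated the same way. */
	return false;
}

/* Walk through the combinations covered by the patch. */
void mock_postproc_demo(void)
{
	struct mock_pp_ctx g1 = { .is_encoder = false, .core_hw_dec_rev = MOCK_G1_REV };
	struct mock_pp_ctx g2 = { .is_encoder = false, .core_hw_dec_rev = MOCK_G2_REV };
	struct mock_pp_ctx enc = { .is_encoder = true, .core_hw_dec_rev = MOCK_G1_REV };

	assert(!mock_needs_postproc(&g1, MOCK_FMT_NV12));
	assert(mock_needs_postproc(&g1, MOCK_FMT_YUYV));
	assert(!mock_needs_postproc(&g2, MOCK_FMT_YUYV));
	assert(!mock_needs_postproc(&enc, MOCK_FMT_YUYV));
}
```

Because hantro_probe() initialises core_hw_dec_rev to HANTRO_G1_REV, existing
G1 platforms keep the behaviour of the former inline helper.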

* [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions
@ 2021-03-03 11:39   ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Decoders hardware blocks could exist in multiple versions: add
a field to distinguish them at runtime.
G2 hardware block doesn't have postprocessor hantro_needs_postproc
function should always returns false in for this hardware.
hantro_needs_postproc function becoming to much complex to
stay inline in .h file move it to .c file.

Keep the default behavoir to be G1 hardware.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro.h          | 13 +++++++------
 drivers/staging/media/hantro/hantro_drv.c      |  2 ++
 drivers/staging/media/hantro/hantro_postproc.c | 17 +++++++++++++++++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index a76a0d79db9f..05876e426419 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -37,6 +37,9 @@ struct hantro_codec_ops;
 #define HANTRO_HEVC_DECODER	BIT(19)
 #define HANTRO_DECODERS		0xffff0000
 
+#define HANTRO_G1_REV		0x6731
+#define HANTRO_G2_REV		0x6732
+
 /**
  * struct hantro_irq - irq handler and name
  *
@@ -171,6 +174,7 @@ hantro_vdev_to_func(struct video_device *vdev)
  * @enc_base:		Mapped address of VPU encoder register for convenience.
  * @dec_base:		Mapped address of VPU decoder register for convenience.
  * @ctrl_base:		Mapped address of VPU control block.
+ * @core_hw_dec_rev	Runtime detected HW decoder core revision
  * @vpu_mutex:		Mutex to synchronize V4L2 calls.
  * @irqlock:		Spinlock to synchronize access to data structures
  *			shared with interrupt handlers.
@@ -190,6 +194,7 @@ struct hantro_dev {
 	void __iomem *enc_base;
 	void __iomem *dec_base;
 	void __iomem *ctrl_base;
+	u32 core_hw_dec_rev;
 
 	struct mutex vpu_mutex;	/* video_device lock */
 	spinlock_t irqlock;
@@ -412,12 +417,8 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
 	return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
 }
 
-static inline bool
-hantro_needs_postproc(const struct hantro_ctx *ctx,
-		      const struct hantro_fmt *fmt)
-{
-	return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
-}
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+			   const struct hantro_fmt *fmt);
 
 static inline dma_addr_t
 hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index f0b68e16fcc0..e3e6df28f470 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -836,6 +836,8 @@ static int hantro_probe(struct platform_device *pdev)
 	}
 	vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
 	vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
+	/* by default decoder is G1 */
+	vpu->core_hw_dec_rev = HANTRO_G1_REV;
 
 	ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
 	if (ret) {
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 6d2a8f2a8f0b..050880f720d6 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -50,6 +50,23 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
 	.display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
 };
 
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+			   const struct hantro_fmt *fmt)
+{
+	struct hantro_dev *vpu = ctx->dev;
+
+	if (ctx->is_encoder)
+		return false;
+
+	if (vpu->core_hw_dec_rev == HANTRO_G1_REV)
+		return fmt->fourcc != V4L2_PIX_FMT_NV12;
+
+	if (vpu->core_hw_dec_rev == HANTRO_G2_REV)
+		return false;
+
+	return false;
+}
+
 void hantro_postproc_enable(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
-- 
2.25.1


_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply related	[flat|nested] 66+ messages in thread
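
For readers following along, the decision made by hantro_needs_postproc()
in patch 05 can be exercised outside the kernel. The sketch below is only
an illustration: the sketch_* names and SKETCH_* macros are hypothetical
stand-ins for the driver's context fields and for v4l2_fourcc(), and only
the two revision values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Revision values copied from the patch (HANTRO_G1_REV / HANTRO_G2_REV). */
#define SKETCH_G1_REV	0x6731u
#define SKETCH_G2_REV	0x6732u

/* Hypothetical helper using the same byte packing as v4l2_fourcc(). */
#define SKETCH_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define SKETCH_PIX_FMT_NV12	SKETCH_FOURCC('N', 'V', '1', '2')
#define SKETCH_PIX_FMT_YUYV	SKETCH_FOURCC('Y', 'U', 'Y', 'V')

static_assert(SKETCH_G1_REV != SKETCH_G2_REV,
	      "the two revisions must be distinguishable");

/*
 * Stand-alone restatement of the patch logic: only a G1 decoder that is
 * asked for something other than NV12 goes through the post-processor.
 */
static bool sketch_needs_postproc(bool is_encoder, uint32_t dec_rev,
				  uint32_t fourcc)
{
	if (is_encoder)
		return false;

	if (dec_rev == SKETCH_G1_REV)
		return fourcc != SKETCH_PIX_FMT_NV12;

	/* G2, and any revision not known to the driver, has no post-processor. */
	return false;
}
```

With these definitions, a G1 decoder producing YUYV is the only combination
that returns true; G1 with NV12, G2 with any format, and every encoder
context return false.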

* [PATCH v4 06/11] media: uapi: Add a control for HANTRO driver
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

The Hantro HEVC driver needs to know the number of bits to skip at
the beginning of the slice header.
That is a hardware-specific requirement, so create a dedicated control
for this purpose.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- The control is now an integer, which is enough to provide the number
  of bits to skip.
version 3:
- Fix typo in field name

 Documentation/userspace-api/media/drivers/hantro.rst | 10 ++++++++++
 Documentation/userspace-api/media/drivers/index.rst  |  1 +
 include/uapi/linux/v4l2-controls.h                   |  5 +++++
 3 files changed, 16 insertions(+)
 create mode 100644 Documentation/userspace-api/media/drivers/hantro.rst

diff --git a/Documentation/userspace-api/media/drivers/hantro.rst b/Documentation/userspace-api/media/drivers/hantro.rst
new file mode 100644
index 000000000000..655b0c5f5d5c
--- /dev/null
+++ b/Documentation/userspace-api/media/drivers/hantro.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Hantro video decoder driver
+===========================
+
+The Hantro video decoder driver implements the following driver-specific controls:
+
+``V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (integer)``
+    Specifies to the Hantro HEVC video decoder driver the number of bits to
+    skip in the slice segment header syntax after the 'slice type' token.
diff --git a/Documentation/userspace-api/media/drivers/index.rst b/Documentation/userspace-api/media/drivers/index.rst
index 1a9038f5f9fa..12e3c512d718 100644
--- a/Documentation/userspace-api/media/drivers/index.rst
+++ b/Documentation/userspace-api/media/drivers/index.rst
@@ -33,6 +33,7 @@ For more details see the file COPYING in the source distribution of Linux.
 
 	ccs
 	cx2341x-uapi
+        hantro
 	imx-uapi
 	max2175
 	meye-uapi
diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
index 039c0d7add1b..ced7486c7f46 100644
--- a/include/uapi/linux/v4l2-controls.h
+++ b/include/uapi/linux/v4l2-controls.h
@@ -209,6 +209,11 @@ enum v4l2_colorfx {
  * We reserve 128 controls for this driver.
  */
 #define V4L2_CID_USER_CCS_BASE			(V4L2_CID_USER_BASE + 0x10f0)
+/*
+ * The base for HANTRO driver controls.
+ * We reserve 32 controls for this driver.
+ */
+#define V4L2_CID_USER_HANTRO_BASE		(V4L2_CID_USER_BASE + 0x1170)
 
 /* MPEG-class control IDs */
 /* The MPEG controls are applicable to all codec controls
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread
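
As a quick sanity check of the ID range reserved in patch 06, the
arithmetic can be written out as below. This is a sketch, not driver code:
the SKETCH_* macros and sketch_* helpers are local stand-ins, the two
offsets (0x10f0, 0x1170) and the reserved counts (128, 32) come from the
hunk above, and the class/base values are what I understand
<linux/v4l2-controls.h> to define. Real code should include that header
instead.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror V4L2_CTRL_CLASS_USER and V4L2_CID_USER_BASE. */
#define SKETCH_CTRL_CLASS_USER		0x00980000u
#define SKETCH_CID_USER_BASE		(SKETCH_CTRL_CLASS_USER | 0x900u)

/* Offsets and reserved counts as they appear in the patch. */
#define SKETCH_CID_USER_CCS_BASE	(SKETCH_CID_USER_BASE + 0x10f0u)
#define SKETCH_CCS_RESERVED		128u
#define SKETCH_CID_USER_HANTRO_BASE	(SKETCH_CID_USER_BASE + 0x1170u)
#define SKETCH_HANTRO_RESERVED		32u

/* The CCS range must end at or before the Hantro base. */
static_assert(SKETCH_CID_USER_CCS_BASE + SKETCH_CCS_RESERVED <=
	      SKETCH_CID_USER_HANTRO_BASE,
	      "CCS and Hantro control ranges overlap");

/* First control ID past a driver's reserved range. */
static inline uint32_t sketch_range_end(uint32_t base, uint32_t count)
{
	return base + count;
}

/* ID of the slice-header-skip control, defined as base + 0 in patch 07. */
static inline uint32_t sketch_hevc_slice_header_skip_id(void)
{
	return SKETCH_CID_USER_HANTRO_BASE + 0;
}
```

Under these assumptions the 128 CCS controls end exactly at offset 0x1170,
so the Hantro range starts right after them, and the new control ends up
with the ID 0x00981a70.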

* [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Implement all the logic to get the G2 hardware decoding HEVC frames.
It supports HEVC streams up to level 5.1.
It does not yet support 10-bit formats or the scaling feature.

Add a HANTRO HEVC dedicated control to skip some bits at the beginning
of the slice header. That is very specific to this hardware so it can't
go into the uapi structures. Computing the needed value is complex and
requires information from the stream that only userland knows, so let it
provide the correct value to the driver.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- address Ezequiel's comments
- use dedicated control as an integer
- change hantro_g2_hevc_dec_run prototype to return errors

version 2:
- squash multiple commits into this one.
- address Ezequiel's comments about dma_alloc_coherent usage
- address Dan's comments about control copy, reverse the test logic
  in tile_buffer_reallocate, rework some goto and return cases.
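
The tile geometry computed by prepare_tile_info_buffer() in this patch
may be easier to review with the arithmetic isolated. The following is a
minimal user-space sketch, assuming one dimension at a time; the sketch_*
function names are hypothetical and do not exist in the driver. It
restates the round-up to whole CTBs and the uniform_spacing boundaries at
(i + 1) * size / num_tiles.

```c
#include <assert.h>
#include <stdint.h>

/* Number of CTBs needed to cover `samples` luma samples (rounds up). */
static unsigned int sketch_size_in_ctbs(unsigned int samples,
					unsigned int log2_ctb_size)
{
	return (samples + (1u << log2_ctb_size) - 1) >> log2_ctb_size;
}

/*
 * Fill out[0..num_tiles-1] with uniformly spaced tile sizes, in CTBs,
 * along one dimension. Tile i ends at (i + 1) * size_in_ctbs / num_tiles,
 * which matches the uniform_spacing branch of the patch.
 */
static void sketch_uniform_tile_sizes(unsigned int size_in_ctbs,
				      unsigned int num_tiles,
				      uint16_t *out)
{
	unsigned int i, prev = 0;

	assert(num_tiles > 0);

	for (i = 0; i < num_tiles; i++) {
		unsigned int boundary = (i + 1) * size_in_ctbs / num_tiles;

		out[i] = (uint16_t)(boundary - prev);
		prev = boundary;
	}
}
```

For a 1920x1080 picture with 64x64 CTBs (log2 size 6) this gives 30x17
CTBs, and splitting the 30 CTB columns into four uniform tiles yields
widths of 7, 8, 7 and 8 CTBs.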

 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  18 +
 drivers/staging/media/hantro/hantro_drv.c     |  53 ++
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
 drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  49 ++
 7 files changed, 1228 insertions(+)
 create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
 create mode 100644 drivers/staging/media/hantro/hantro_hevc.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 743ce08eb184..0357f1772267 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -9,12 +9,14 @@ hantro-vpu-y += \
 		hantro_h1_jpeg_enc.o \
 		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
+		hantro_g2_hevc_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		rk3399_vpu_hw_vp8_dec.o \
 		hantro_jpeg.o \
 		hantro_h264.o \
+		hantro_hevc.o \
 		hantro_mpeg2.o \
 		hantro_vp8.o
 
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 05876e426419..a9b80b2c9124 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -225,6 +225,7 @@ struct hantro_dev {
  * @jpeg_enc:		JPEG-encoding context.
  * @mpeg2_dec:		MPEG-2-decoding context.
  * @vp8_dec:		VP8-decoding context.
+ * @hevc_dec:		HEVC-decoding context.
  */
 struct hantro_ctx {
 	struct hantro_dev *dev;
@@ -251,6 +252,7 @@ struct hantro_ctx {
 		struct hantro_jpeg_enc_hw_ctx jpeg_enc;
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
+		struct hantro_hevc_dec_hw_ctx hevc_dec;
 	};
 };
 
@@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
 	return vb2_dma_contig_plane_dma_addr(vb, 0);
 }
 
+static inline size_t
+hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].size;
+	return vb2_plane_size(vb, 0);
+}
+
+static inline void *
+hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].cpu;
+	return vb2_plane_vaddr(vb, 0);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx);
 void hantro_postproc_enable(struct hantro_ctx *ctx);
 void hantro_postproc_free(struct hantro_ctx *ctx);
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e3e6df28f470..bc90a52f4d3d 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -30,6 +30,13 @@
 
 #define DRIVER_NAME "hantro-vpu"
 
+/*
+ * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
+ * the number of bits to skip in the
+ * slice segment header syntax after the 'slice type' token
+ */
+#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP	(V4L2_CID_USER_HANTRO_BASE + 0)
+
 int hantro_debug;
 module_param_named(debug, hantro_debug, int, 0644);
 MODULE_PARM_DESC(debug,
@@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
 	return 0;
 }
 
+static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+	struct hantro_ctx *ctx;
+
+	ctx = container_of(ctrl->handler,
+			   struct hantro_ctx, ctrl_handler);
+
+	vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
+
+	switch (ctrl->id) {
+	case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
+		ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
 	.try_ctrl = hantro_try_ctrl,
 };
@@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
 	.s_ctrl = hantro_jpeg_s_ctrl,
 };
 
+static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
+	.s_ctrl = hantro_hevc_s_ctrl,
+};
+
 static const struct hantro_ctrl controls[] = {
 	{
 		.codec = HANTRO_JPEG_ENCODER,
@@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
 		.cfg = {
 			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
 		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
+			.name = "Hantro HEVC slice header skip bits",
+			.type = V4L2_CTRL_TYPE_INTEGER,
+			.min = 0,
+			.def = 0,
+			.max = 0x7fffffff,
+			.step = 1,
+			.ops = &hantro_hevc_ctrl_ops,
+		},
+	}, {
+		.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
+			 HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
+			 HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_USER_CLASS,
+			.name = "HANTRO controls",
+			.type = V4L2_CTRL_TYPE_CTRL_CLASS,
+			.flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
+		},
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
new file mode 100644
index 000000000000..5d75b36bc40c
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -0,0 +1,587 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include "hantro_hw.h"
+#include "hantro_g2_regs.h"
+
+#define HEVC_DEC_MODE	0xC
+
+#define BUS_WIDTH_32		0
+#define BUS_WIDTH_64		1
+#define BUS_WIDTH_128		2
+#define BUS_WIDTH_256		3
+
+static inline void hantro_write_addr(struct hantro_dev *vpu,
+				     unsigned long offset,
+				     dma_addr_t addr)
+{
+	vdpu_write(vpu, addr & 0xffffffff, offset);
+}
+
+static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
+	unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
+	unsigned int max_log2_ctb_size, ctb_size;
+	bool tiles_enabled, uniform_spacing;
+	u32 no_chroma = 0;
+
+	tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
+	uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
+
+	hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
+
+	max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
+			    sps->log2_diff_max_min_luma_coding_block_size;
+	pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
+			    (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
+	pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
+			     >> max_log2_ctb_size;
+	ctb_size = 1 << max_log2_ctb_size;
+
+	vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
+		  pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
+
+	if (tiles_enabled) {
+		unsigned int i, j, h;
+
+		vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
+
+		hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
+		hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
+
+		/* write width + height for each tile in pic */
+		if (!uniform_spacing) {
+			u32 tmp_w = 0, tmp_h = 0;
+
+			for (i = 0; i < num_tile_rows; i++) {
+				if (i == num_tile_rows - 1)
+					h = pic_height_in_ctbs - tmp_h;
+				else
+					h = pps->row_height_minus1[i] + 1;
+				tmp_h += h;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
+					tmp_w += pps->column_width_minus1[j] + 1;
+					*p++ = pps->column_width_minus1[j] + 1;
+					*p++ = h;
+					if (i == 0 && h == 1 && ctb_size == 16)
+						no_chroma = 1;
+				}
+				/* last column */
+				*p++ = pic_width_in_ctbs - tmp_w;
+				*p++ = h;
+			}
+		} else { /* uniform spacing */
+			u32 tmp, prev_h, prev_w;
+
+			for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
+				tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
+				h = tmp - prev_h;
+				prev_h = tmp;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
+					tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
+					*p++ = tmp - prev_w;
+					*p++ = h;
+					if (j == 0 &&
+					    (pps->column_width_minus1[0] + 1) == 1 &&
+					    ctb_size == 16)
+						no_chroma = 1;
+					prev_w = tmp;
+				}
+			}
+		}
+	} else {
+		hantro_reg_write(vpu, hevc_num_tile_rows, 1);
+		hantro_reg_write(vpu, hevc_num_tile_cols, 1);
+
+		/* There's one tile, with dimensions equal to pic size. */
+		p[0] = pic_width_in_ctbs;
+		p[1] = pic_height_in_ctbs;
+	}
+
+	if (no_chroma)
+		vpu_debug(1, "%s: no chroma!\n", __func__);
+}
+
+static void set_params(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	struct hantro_dev *vpu = ctx->dev;
+	u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
+	u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
+	u32 pic_width_aligned, pic_height_aligned;
+	u32 partial_ctb_x, partial_ctb_y;
+
+	hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
+	hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
+
+	hantro_reg_write(vpu, hevc_output_8_bits, 0);
+
+	hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
+
+	min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
+	max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
+
+	hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
+	hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
+
+	min_cb_size = 1 << min_log2_cb_size;
+	max_ctb_size = 1 << max_log2_ctb_size;
+
+	pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
+	pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
+	pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
+	pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
+
+	partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
+	partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
+
+	hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
+	hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
+
+	hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
+	hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
+
+	hantro_reg_write(vpu, hevc_pic_width_4x4,
+			 (pic_width_in_min_cbs * min_cb_size) / 4);
+	hantro_reg_write(vpu, hevc_pic_height_4x4,
+			 (pic_height_in_min_cbs * min_cb_size) / 4);
+
+	hantro_reg_write(vpu, hevc_max_inter_hierdepth,
+			 sps->max_transform_hierarchy_depth_inter);
+	hantro_reg_write(vpu, hevc_max_intra_hierdepth,
+			 sps->max_transform_hierarchy_depth_intra);
+	hantro_reg_write(vpu, hevc_min_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2);
+	hantro_reg_write(vpu, hevc_max_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2 +
+			 sps->log2_diff_max_min_luma_transform_block_size);
+
+	hantro_reg_write(vpu, hevc_tempor_mvp_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
+			 !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
+	hantro_reg_write(vpu, hevc_strong_smooth_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
+	hantro_reg_write(vpu, hevc_asym_pred_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
+	hantro_reg_write(vpu, hevc_sao_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
+	hantro_reg_write(vpu, hevc_sign_data_hide,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
+	} else {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
+	}
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
+	} else {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
+	hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
+	hantro_reg_write(vpu, hevc_slice_chqp_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
+	hantro_reg_write(vpu, hevc_weight_bipr_idc,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
+	hantro_reg_write(vpu, hevc_transq_bypass,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
+	hantro_reg_write(vpu, hevc_list_mod_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
+	hantro_reg_write(vpu, hevc_entropy_sync_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_idr_pic_e,
+			 !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
+	hantro_reg_write(vpu, hevc_parallel_merge,
+			 pps->log2_parallel_merge_level_minus2 + 2);
+	hantro_reg_write(vpu, hevc_pcm_filt_d,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
+	hantro_reg_write(vpu, hevc_pcm_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
+	if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
+		hantro_reg_write(vpu, hevc_max_pcm_size,
+				 sps->log2_diff_max_min_pcm_luma_coding_block_size +
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_min_pcm_size,
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
+				 sps->pcm_sample_bit_depth_luma_minus1 + 1);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
+				 sps->pcm_sample_bit_depth_chroma_minus1 + 1);
+	} else {
+		hantro_reg_write(vpu, hevc_max_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_min_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_start_code_e, 1);
+	hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
+	hantro_reg_write(vpu, hevc_weight_pred_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_const_intra_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+	hantro_reg_write(vpu, hevc_transform_skip,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
+	hantro_reg_write(vpu, hevc_out_filtering_dis,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
+	hantro_reg_write(vpu, hevc_filt_ctrl_pres,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
+	hantro_reg_write(vpu, hevc_dependent_slice,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
+	hantro_reg_write(vpu, hevc_filter_override,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
+	hantro_reg_write(vpu, hevc_refidx0_active,
+			 pps->num_ref_idx_l0_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_refidx1_active,
+			 pps->num_ref_idx_l1_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_apf_threshold, 8);
+}
+
+static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
+{
+	int i;
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
+			return i;
+	}
+
+	return 0x0;
+}
+
+static void set_ref_pic_list(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	const struct hantro_reg *ref_pic_regs0[] = {
+		hevc_rlist_f0,
+		hevc_rlist_f1,
+		hevc_rlist_f2,
+		hevc_rlist_f3,
+		hevc_rlist_f4,
+		hevc_rlist_f5,
+		hevc_rlist_f6,
+		hevc_rlist_f7,
+		hevc_rlist_f8,
+		hevc_rlist_f9,
+		hevc_rlist_f10,
+		hevc_rlist_f11,
+		hevc_rlist_f12,
+		hevc_rlist_f13,
+		hevc_rlist_f14,
+		hevc_rlist_f15,
+	};
+	const struct hantro_reg *ref_pic_regs1[] = {
+		hevc_rlist_b0,
+		hevc_rlist_b1,
+		hevc_rlist_b2,
+		hevc_rlist_b3,
+		hevc_rlist_b4,
+		hevc_rlist_b5,
+		hevc_rlist_b6,
+		hevc_rlist_b7,
+		hevc_rlist_b8,
+		hevc_rlist_b9,
+		hevc_rlist_b10,
+		hevc_rlist_b11,
+		hevc_rlist_b12,
+		hevc_rlist_b13,
+		hevc_rlist_b14,
+		hevc_rlist_b15,
+	};
+	unsigned int i, j;
+
+	/* List 0 contains: short term before, short term after and long term */
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	/* Fill the list, copying over and over */
+	i = 0;
+	while (j < ARRAY_SIZE(list0))
+		list0[j++] = list0[i++];
+
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	i = 0;
+	while (j < ARRAY_SIZE(list1))
+		list1[j++] = list1[i++];
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
+		hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
+	}
+}
+
+static int set_ref(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
+	struct hantro_dev *vpu = ctx->dev;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
+	u32 max_ref_frames;
+	u16 dpb_longterm_e;
+
+	const struct hantro_reg *cur_poc[] = {
+		hevc_cur_poc_00,
+		hevc_cur_poc_01,
+		hevc_cur_poc_02,
+		hevc_cur_poc_03,
+		hevc_cur_poc_04,
+		hevc_cur_poc_05,
+		hevc_cur_poc_06,
+		hevc_cur_poc_07,
+		hevc_cur_poc_08,
+		hevc_cur_poc_09,
+		hevc_cur_poc_10,
+		hevc_cur_poc_11,
+		hevc_cur_poc_12,
+		hevc_cur_poc_13,
+		hevc_cur_poc_14,
+		hevc_cur_poc_15,
+	};
+	unsigned int i;
+
+	max_ref_frames = decode_params->num_poc_lt_curr +
+		decode_params->num_poc_st_curr_before +
+		decode_params->num_poc_st_curr_after;
+	/*
+	 * Set max_ref_frames to non-zero to avoid HW hang when decoding
+	 * badly marked I-frames.
+	 */
+	max_ref_frames = max_ref_frames ? max_ref_frames : 1;
+	hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
+	hantro_reg_write(vpu, hevc_filter_over_slices,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
+	hantro_reg_write(vpu, hevc_filter_over_tiles,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+
+	/*
+	 * Write POC count diff from current pic. For frame decoding only compute
+	 * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
+	 */
+	for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
+		char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
+
+		hantro_reg_write(vpu, cur_poc[i], poc_diff);
+	}
+
+	if (i < ARRAY_SIZE(cur_poc)) {
+		/*
+		 * After the references, fill one entry pointing to itself,
+		 * i.e. difference is zero.
+		 */
+		hantro_reg_write(vpu, cur_poc[i], 0);
+		i++;
+	}
+
+	/* Fill the rest with the current picture */
+	for (; i < ARRAY_SIZE(cur_poc); i++)
+		hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
+
+	set_ref_pic_list(ctx);
+
+	/* Only keep the reference pictures that are still in use */
+	ctx->hevc_dec.ref_bufs_used = 0;
+
+	/* Set up addresses of DPB buffers */
+	dpb_longterm_e = 0;
+	for (i = 0; i < decode_params->num_active_dpb_entries &&
+	     i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
+		luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
+		if (!luma_addr)
+			return -ENOMEM;
+
+		chroma_addr = luma_addr + cr_offset;
+		mv_addr = luma_addr + mv_offset;
+
+		if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
+			dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
+
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
+	}
+
+	luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
+	if (!luma_addr)
+		return -ENOMEM;
+
+	chroma_addr = luma_addr + cr_offset;
+	mv_addr = luma_addr + mv_offset;
+
+	hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+	hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+	hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
+
+	hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
+
+	hantro_hevc_ref_remove_unused(ctx);
+
+	for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
+	}
+
+	hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
+
+	return 0;
+}
+
+static void set_buffers(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	dma_addr_t src_dma, dst_dma;
+	u32 src_len, src_buf_len;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/* Source (stream) buffer. */
+	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
+	src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
+
+	hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
+	hantro_reg_write(vpu, hevc_stream_len, src_len);
+	hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
+	hantro_reg_write(vpu, hevc_strm_start_offset, 0);
+	hantro_reg_write(vpu, hevc_write_mvs_e, 1);
+
+	/* Destination (decoded frame) buffer. */
+	dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
+
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
+	hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
+	hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
+	hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
+	hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
+}
+
+void hantro_g2_check_idle(struct hantro_dev *vpu)
+{
+	int i;
+
+	for (i = 0; i < 3; i++) {
+		u32 status;
+
+		/* Make sure the VPU is idle */
+		status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
+		if (status & HEVC_REG_INTERRUPT_DEC_E) {
+			pr_warn("%s: still enabled!!! resetting.\n", __func__);
+			status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
+			vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
+		}
+	}
+}
+
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	int ret;
+
+	hantro_g2_check_idle(vpu);
+
+	/* Prepare HEVC decoder context. */
+	ret = hantro_hevc_dec_prepare_run(ctx);
+	if (ret)
+		return ret;
+
+	/* Configure hardware registers. */
+	set_params(ctx);
+
+	/* set reference pictures */
+	ret = set_ref(ctx);
+	if (ret)
+		return ret;
+
+	set_buffers(ctx);
+	prepare_tile_info_buffer(ctx);
+
+	hantro_end_prepare_run(ctx);
+
+	hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
+	hantro_reg_write(vpu, hevc_clk_gate_e, 1);
+
+	/* Don't disable output */
+	hantro_reg_write(vpu, hevc_out_dis, 0);
+
+	/* Don't compress buffers */
+	hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
+
+	/* use NV12 as output format */
+	hantro_reg_write(vpu, hevc_out_rs_e, 1);
+
+	/* Bus width and max burst */
+	hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
+	hantro_reg_write(vpu, hevc_max_burst, 16);
+
+	/* Swap */
+	hantro_reg_write(vpu, hevc_strm_swap, 0xf);
+	hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
+	hantro_reg_write(vpu, hevc_compress_swap, 0xf);
+
+	/* Start decoding! */
+	vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
+
+	return 0;
+}
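
For readers following set_ref_pic_list() above, the list construction can be modelled in user space. This is only a minimal sketch, not driver code: LIST_LEN stands in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX, and append() and build_ref_list() are hypothetical names introduced here for illustration. It concatenates three groups of DPB indices and then repeats the entries cyclically until every slot is filled, as the loops above do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LIST_LEN 16 /* assumption: mirrors V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/* Append up to n entries from src, never writing past LIST_LEN. */
static size_t append(uint32_t *list, size_t pos, const uint32_t *src, size_t n)
{
	size_t i;

	assert(pos <= LIST_LEN);
	for (i = 0; i < n && pos < LIST_LEN; i++)
		list[pos++] = src[i];

	return pos;
}

/*
 * Build one reference list from three groups of DPB indices, then repeat
 * the entries from the start until all LIST_LEN slots are filled.
 * An empty input leaves the list zeroed, like the zero-initialised arrays
 * in the driver.
 */
static void build_ref_list(uint32_t *list,
			   const uint32_t *first, size_t n_first,
			   const uint32_t *second, size_t n_second,
			   const uint32_t *lt, size_t n_lt)
{
	size_t filled = 0, i = 0;

	memset(list, 0, LIST_LEN * sizeof(*list));

	filled = append(list, filled, first, n_first);
	filled = append(list, filled, second, n_second);
	filled = append(list, filled, lt, n_lt);

	/* Fill the remaining slots by copying from the start of the list. */
	while (filled < LIST_LEN)
		list[filled++] = list[i++];
}
```

For list 0 the caller would pass the "before" group first and the "after" group second; for list 1 the two short-term groups are swapped, which matches the order of the loops in the patch.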
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
new file mode 100644
index 000000000000..a361c9ba911d
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -0,0 +1,198 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021, Collabora
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
+ */
+
+#ifndef HANTRO_G2_REGS_H_
+#define HANTRO_G2_REGS_H_
+
+#include "hantro.h"
+
+#define G2_SWREG(nr)	((nr) * 4)
+
+#define HEVC_DEC_REG(name, base, shift, mask) \
+	static const struct hantro_reg _hevc_##name[] = { \
+		{ G2_SWREG(base), (shift), (mask) } \
+	}; \
+	static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
+
+#define HEVC_REG_VERSION		G2_SWREG(0)
+
+#define HEVC_REG_INTERRUPT		G2_SWREG(1)
+#define HEVC_REG_INTERRUPT_DEC_RDY_INT	BIT(12)
+#define HEVC_REG_INTERRUPT_DEC_ABORT_E	BIT(5)
+#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS	BIT(4)
+#define HEVC_REG_INTERRUPT_DEC_E	BIT(0)
+
+HEVC_DEC_REG(strm_swap,		2, 28,	0xf)
+HEVC_DEC_REG(dirmv_swap,	2, 20,	0xf)
+
+HEVC_DEC_REG(mode,		  3, 27, 0x1f)
+HEVC_DEC_REG(compress_swap,	  3, 20, 0xf)
+HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
+HEVC_DEC_REG(out_rs_e,		  3, 16, 0x1)
+HEVC_DEC_REG(out_dis,		  3, 15, 0x1)
+HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
+HEVC_DEC_REG(write_mvs_e,	  3, 12, 0x1)
+
+HEVC_DEC_REG(pic_width_in_cbs,	4, 19,	0x1ff)
+HEVC_DEC_REG(pic_height_in_cbs,	4, 6,	0x1ff)
+HEVC_DEC_REG(num_ref_frames,	4, 0,	0x1f)
+
+HEVC_DEC_REG(scaling_list_e,	5, 24,	0x1)
+HEVC_DEC_REG(cb_qp_offset,	5, 19,	0x1f)
+HEVC_DEC_REG(cr_qp_offset,	5, 14,	0x1f)
+HEVC_DEC_REG(sign_data_hide,	5, 12,	0x1)
+HEVC_DEC_REG(tempor_mvp_e,	5, 11,	0x1)
+HEVC_DEC_REG(max_cu_qpd_depth,	5, 5,	0x3f)
+HEVC_DEC_REG(cu_qpd_e,		5, 4,	0x1)
+
+HEVC_DEC_REG(stream_len,	6, 0,	0xffffffff)
+
+HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
+HEVC_DEC_REG(weight_pred_e,	 7, 28, 0x1)
+HEVC_DEC_REG(weight_bipr_idc,	 7, 26, 0x3)
+HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
+HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
+HEVC_DEC_REG(asym_pred_e,	 7, 23, 0x1)
+HEVC_DEC_REG(sao_e,		 7, 22, 0x1)
+HEVC_DEC_REG(pcm_filt_d,	 7, 21, 0x1)
+HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
+HEVC_DEC_REG(dependent_slice,	 7, 19, 0x1)
+HEVC_DEC_REG(filter_override,	 7, 18, 0x1)
+HEVC_DEC_REG(strong_smooth_e,	 7, 17, 0x1)
+HEVC_DEC_REG(filt_offset_beta,	 7, 12, 0x1f)
+HEVC_DEC_REG(filt_offset_tc,	 7, 7,  0x1f)
+HEVC_DEC_REG(slice_hdr_ext_e,	 7, 6,	0x1)
+HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3,	0x7)
+
+HEVC_DEC_REG(const_intra_e,	 8, 31, 0x1)
+HEVC_DEC_REG(filt_ctrl_pres,	 8, 30, 0x1)
+HEVC_DEC_REG(idr_pic_e,		 8, 16, 0x1)
+HEVC_DEC_REG(bit_depth_pcm_y,	 8, 12, 0xf)
+HEVC_DEC_REG(bit_depth_pcm_c,	 8, 8,  0xf)
+HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
+HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
+HEVC_DEC_REG(output_8_bits,	 8, 3,  0x1)
+
+HEVC_DEC_REG(refidx1_active,	9, 19,	0x1f)
+HEVC_DEC_REG(refidx0_active,	9, 14,	0x1f)
+HEVC_DEC_REG(hdr_skip_length,	9, 0,	0x3fff)
+
+HEVC_DEC_REG(start_code_e,	10, 31, 0x1)
+HEVC_DEC_REG(init_qp,		10, 24, 0x3f)
+HEVC_DEC_REG(num_tile_cols,	10, 19, 0x1f)
+HEVC_DEC_REG(num_tile_rows,	10, 14, 0x1f)
+HEVC_DEC_REG(tile_e,		10, 1,	0x1)
+HEVC_DEC_REG(entropy_sync_e,	10, 0,	0x1)
+
+HEVC_DEC_REG(refer_lterm_e,	12, 16, 0xffff)
+HEVC_DEC_REG(min_cb_size,	12, 13, 0x7)
+HEVC_DEC_REG(max_cb_size,	12, 10, 0x7)
+HEVC_DEC_REG(min_pcm_size,	12, 7,  0x7)
+HEVC_DEC_REG(max_pcm_size,	12, 4,  0x7)
+HEVC_DEC_REG(pcm_e,		12, 3,  0x1)
+HEVC_DEC_REG(transform_skip,	12, 2,	0x1)
+HEVC_DEC_REG(transq_bypass,	12, 1,	0x1)
+HEVC_DEC_REG(list_mod_e,	12, 0,	0x1)
+
+HEVC_DEC_REG(min_trb_size,	  13, 13, 0x7)
+HEVC_DEC_REG(max_trb_size,	  13, 10, 0x7)
+HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
+HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
+HEVC_DEC_REG(parallel_merge,	  13, 0,  0xf)
+
+HEVC_DEC_REG(rlist_f0,		14, 0,	0x1f)
+HEVC_DEC_REG(rlist_f1,		14, 10,	0x1f)
+HEVC_DEC_REG(rlist_f2,		14, 20,	0x1f)
+HEVC_DEC_REG(rlist_b0,		14, 5,	0x1f)
+HEVC_DEC_REG(rlist_b1,		14, 15, 0x1f)
+HEVC_DEC_REG(rlist_b2,		14, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f3,		15, 0,	0x1f)
+HEVC_DEC_REG(rlist_f4,		15, 10, 0x1f)
+HEVC_DEC_REG(rlist_f5,		15, 20, 0x1f)
+HEVC_DEC_REG(rlist_b3,		15, 5,	0x1f)
+HEVC_DEC_REG(rlist_b4,		15, 15, 0x1f)
+HEVC_DEC_REG(rlist_b5,		15, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f6,		16, 0,	0x1f)
+HEVC_DEC_REG(rlist_f7,		16, 10, 0x1f)
+HEVC_DEC_REG(rlist_f8,		16, 20, 0x1f)
+HEVC_DEC_REG(rlist_b6,		16, 5,	0x1f)
+HEVC_DEC_REG(rlist_b7,		16, 15, 0x1f)
+HEVC_DEC_REG(rlist_b8,		16, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f9,		17, 0,	0x1f)
+HEVC_DEC_REG(rlist_f10,		17, 10, 0x1f)
+HEVC_DEC_REG(rlist_f11,		17, 20, 0x1f)
+HEVC_DEC_REG(rlist_b9,		17, 5,	0x1f)
+HEVC_DEC_REG(rlist_b10,		17, 15, 0x1f)
+HEVC_DEC_REG(rlist_b11,		17, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f12,		18, 0,	0x1f)
+HEVC_DEC_REG(rlist_f13,		18, 10, 0x1f)
+HEVC_DEC_REG(rlist_f14,		18, 20, 0x1f)
+HEVC_DEC_REG(rlist_b12,		18, 5,	0x1f)
+HEVC_DEC_REG(rlist_b13,		18, 15, 0x1f)
+HEVC_DEC_REG(rlist_b14,		18, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f15,		19, 0,	0x1f)
+HEVC_DEC_REG(rlist_b15,		19, 5,	0x1f)
+
+HEVC_DEC_REG(partial_ctb_x,	20, 31, 0x1)
+HEVC_DEC_REG(partial_ctb_y,	20, 30, 0x1)
+HEVC_DEC_REG(pic_width_4x4,	20, 16, 0xfff)
+HEVC_DEC_REG(pic_height_4x4,	20, 0,  0xfff)
+
+HEVC_DEC_REG(cur_poc_00,	46, 24,	0xff)
+HEVC_DEC_REG(cur_poc_01,	46, 16,	0xff)
+HEVC_DEC_REG(cur_poc_02,	46, 8,	0xff)
+HEVC_DEC_REG(cur_poc_03,	46, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_04,	47, 24,	0xff)
+HEVC_DEC_REG(cur_poc_05,	47, 16,	0xff)
+HEVC_DEC_REG(cur_poc_06,	47, 8,	0xff)
+HEVC_DEC_REG(cur_poc_07,	47, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_08,	48, 24,	0xff)
+HEVC_DEC_REG(cur_poc_09,	48, 16,	0xff)
+HEVC_DEC_REG(cur_poc_10,	48, 8,	0xff)
+HEVC_DEC_REG(cur_poc_11,	48, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_12,	49, 24, 0xff)
+HEVC_DEC_REG(cur_poc_13,	49, 16, 0xff)
+HEVC_DEC_REG(cur_poc_14,	49, 8,	0xff)
+HEVC_DEC_REG(cur_poc_15,	49, 0,	0xff)
+
+HEVC_DEC_REG(apf_threshold,	55, 0,	0xffff)
+
+HEVC_DEC_REG(clk_gate_e,	58, 16,	0x1)
+HEVC_DEC_REG(buswidth,		58, 8,	0x7)
+HEVC_DEC_REG(max_burst,		58, 0,	0xff)
+
+#define HEVC_REG_CONFIG				G2_SWREG(58)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_E		BIT(16)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E	BIT(17)
+
+#define HEVC_ADDR_DST		(G2_SWREG(65))
+#define HEVC_REG_ADDR_REF(i)	(G2_SWREG(67)  + ((i) * 0x8))
+#define HEVC_ADDR_DST_CHR	(G2_SWREG(99))
+#define HEVC_REG_CHR_REF(i)	(G2_SWREG(101) + ((i) * 0x8))
+#define HEVC_ADDR_DST_MV	(G2_SWREG(133))
+#define HEVC_REG_DMV_REF(i)	(G2_SWREG(135) + ((i) * 0x8))
+#define HEVC_ADDR_TILE_SIZE	(G2_SWREG(167))
+#define HEVC_ADDR_STR		(G2_SWREG(169))
+#define HEVC_SCALING_LIST	(G2_SWREG(171))
+#define HEVC_RASTER_SCAN	(G2_SWREG(175))
+#define HEVC_RASTER_SCAN_CHR	(G2_SWREG(177))
+#define HEVC_TILE_FILTER	(G2_SWREG(179))
+#define HEVC_TILE_SAO		(G2_SWREG(181))
+#define HEVC_TILE_BSD		(G2_SWREG(183))
+
+HEVC_DEC_REG(strm_buffer_len,	258, 0,	0xffffffff)
+HEVC_DEC_REG(strm_start_offset,	259, 0,	0xffffffff)
+
+#endif
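
Each HEVC_DEC_REG() entry above describes a bit field as a (register offset, shift, mask) triple. hantro_reg_write() is defined elsewhere in the driver and is not part of this patch, so the following is only a sketch under the assumption that a field write is a read-modify-write on the containing 32-bit register. struct reg_field, field_write() and the plain array standing in for the register file are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G2_SWREG(nr)	((nr) * 4)	/* byte offset of 32-bit register nr */

/* Hypothetical stand-in for struct hantro_reg: offset, shift, mask. */
struct reg_field {
	uint32_t base;
	uint32_t shift;
	uint32_t mask;
};

/*
 * Update one field of a fake register file (one uint32_t per SWREG),
 * leaving the other fields of the same register untouched.
 */
static void field_write(uint32_t *regs, size_t nregs,
			const struct reg_field *f, uint32_t val)
{
	size_t idx = f->base / 4;

	assert(f->base % 4 == 0 && idx < nregs);

	regs[idx] &= ~(f->mask << f->shift);		/* clear the field */
	regs[idx] |= (val & f->mask) << f->shift;	/* set the new value */
}
```

With the start_code_e (bit 31) and init_qp (bits 24..29) fields of SWREG 10 from the table above, writing 1 and then 26 would leave 0x9a000000 in that register in this model; values wider than the mask are truncated.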
diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
new file mode 100644
index 000000000000..8e319a837ff3
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_hevc.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include <linux/types.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+
+#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
+/*
+ * BSD control data of current picture at tile border
+ * 128 bits per 4x4 tile = 128/(8*4) bytes per row
+ */
+#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
+/* tile border coefficients of filter */
+#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
+
+#define MAX_TILE_COLS 20
+#define MAX_TILE_ROWS 22
+
+#define UNUSED_REF	-1
+
+#define G2_ALIGN		16
+#define MC_WORD_SIZE		32
+
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
+
+	return sps->pic_width_in_luma_samples *
+		sps->pic_height_in_luma_samples * bytes_per_pixel;
+}
+
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+
+	return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
+}
+
+static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
+	u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
+	size_t mv_size;
+
+	mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
+		  (1 << (2 * (8 - 4))) * 16) + 32;
+
+	vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
+		  pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
+
+	return mv_size;
+}
+
+static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+
+	return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
+}
+
+static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	struct hantro_dev *vpu = ctx->dev;
+	int i;
+
+	/* Zero and free every allocated reference buffer */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs[i].cpu) {
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
+			dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
+					  hevc_dec->ref_bufs[i].cpu,
+					  hevc_dec->ref_bufs[i].dma);
+		}
+	}
+}
+
+static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	for (i = 0;  i < NUM_REF_PICTURES; i++)
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+}
+
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
+				   int poc)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Look for the reference buffer among the already known ones */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == poc) {
+			hevc_dec->ref_bufs_used |= 1 << i;
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	/* Allocate a new reference buffer */
+	for (i = 0; i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
+			if (!hevc_dec->ref_bufs[i].cpu) {
+				struct hantro_dev *vpu = ctx->dev;
+
+				hevc_dec->ref_bufs[i].cpu =
+					dma_alloc_coherent(vpu->dev,
+							   hantro_hevc_ref_size(ctx),
+							   &hevc_dec->ref_bufs[i].dma,
+							   GFP_KERNEL);
+				if (!hevc_dec->ref_bufs[i].cpu)
+					return 0;
+
+				hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
+			}
+			hevc_dec->ref_bufs_used |= 1 << i;
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
+			hevc_dec->ref_bufs_poc[i] = poc;
+
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	return 0;
+}
+
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Just tag buffer as unused, do not free them */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF)
+			continue;
+
+		if (hevc_dec->ref_bufs_used & (1 << i))
+			continue;
+
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+	}
+}
+
+static int tile_buffer_reallocate(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int height64 = (sps->pic_height_in_luma_samples + 63) & ~63;
+	unsigned int size;
+
+	if (num_tile_cols <= 1 ||
+	    num_tile_cols <= hevc_dec->num_tile_cols_allocated)
+		return 0;
+
+	/* Need to reallocate due to tiles passed via PPS */
+	if (hevc_dec->tile_filter.size)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+
+	size = VERT_FILTER_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_filter.cpu = dma_alloc_coherent(vpu->dev, size,
+						       &hevc_dec->tile_filter.dma,
+						       GFP_KERNEL);
+	if (!hevc_dec->tile_filter.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_filter.size = size;
+
+	size = VERT_SAO_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_sao.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_sao.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_sao.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_sao.size = size;
+
+	size = BSD_CTRL_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_bsd.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_bsd.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_bsd.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_bsd.size = size;
+
+	hevc_dec->num_tile_cols_allocated = num_tile_cols;
+
+	return 0;
+
+err_free_tile_buffers:
+	if (hevc_dec->tile_filter.size)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = 0;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = 0;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = 0;
+
+	return -ENOMEM;
+}
+
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_ctx = &ctx->hevc_dec;
+	struct hantro_hevc_dec_ctrls *ctrls = &hevc_ctx->ctrls;
+	int ret;
+
+	hantro_start_prepare_run(ctx);
+
+	ctrls->decode_params =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS);
+	if (WARN_ON(!ctrls->decode_params))
+		return -EINVAL;
+
+	ctrls->sps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_SPS);
+	if (WARN_ON(!ctrls->sps))
+		return -EINVAL;
+
+	ctrls->pps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_PPS);
+	if (WARN_ON(!ctrls->pps))
+		return -EINVAL;
+
+	ret = tile_buffer_reallocate(ctx);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+
+	if (hevc_dec->tile_sizes.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sizes.size,
+				  hevc_dec->tile_sizes.cpu,
+				  hevc_dec->tile_sizes.dma);
+	hevc_dec->tile_sizes.cpu = 0;
+
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = 0;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = 0;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = 0;
+
+	hantro_hevc_ref_free(ctx);
+}
+
+int hantro_hevc_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	unsigned int size;
+
+	memset(hevc_dec, 0, sizeof(*hevc_dec));
+
+	/*
+	 * Maximum number of tiles times width and height (2 bytes each),
+	 * rounding up to next 16 bytes boundary + one extra 16 byte
+	 * chunk (HW guys wanted to have this).
+	 */
+	size = round_up(MAX_TILE_COLS * MAX_TILE_ROWS * 4 * sizeof(u16) + 16, 16);
+	hevc_dec->tile_sizes.cpu = dma_alloc_coherent(vpu->dev, size,
+						      &hevc_dec->tile_sizes.dma,
+						      GFP_KERNEL);
+	if (!hevc_dec->tile_sizes.cpu)
+		return -ENOMEM;
+
+	hevc_dec->tile_sizes.size = size;
+
+	hantro_hevc_ref_init(ctx);
+
+	return 0;
+}
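
The reference buffer layout computed by hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and hantro_hevc_mv_size() above can be checked with plain arithmetic. The sketch below restates those formulas in user-space C; the function names, the ALIGN_UP() macro and the explicit bit_depth parameter are illustrative assumptions, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN	16
#define MC_WORD_SIZE	32
#define ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))

/* Luma plane size, which is also the offset of the chroma plane. */
static size_t chroma_offset(uint32_t width, uint32_t height, unsigned int bit_depth)
{
	assert(bit_depth >= 8);

	size_t bytes_per_pixel = bit_depth == 8 ? 1 : 2;

	return (size_t)width * height * bytes_per_pixel;
}

/* Motion vectors start after luma + chroma (3/2 of luma), aligned, plus one word. */
static size_t mv_offset(uint32_t width, uint32_t height, unsigned int bit_depth)
{
	size_t cr_offset = chroma_offset(width, height, bit_depth);

	return ALIGN_UP(cr_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Same expression as the patch: dimensions rounded up in units of 256 samples. */
static size_t mv_size(uint32_t width, uint32_t height)
{
	size_t w_units = ((size_t)width + (1 << 8) - 1) >> 8;
	size_t h_units = ((size_t)height + (1 << 8) - 1) >> 8;

	return w_units * h_units * (1 << (2 * (8 - 4))) * 16 + 32;
}

/* Total size of one reference buffer: pixels, then motion vectors. */
static size_t ref_buf_size(uint32_t width, uint32_t height, unsigned int bit_depth)
{
	return mv_offset(width, height, bit_depth) + mv_size(width, height);
}
```

For an 8-bit 1920x1080 stream these formulas give a chroma offset of 2073600 bytes, a motion vector offset of 3110432 bytes and 163872 bytes of motion vector data, so each reference buffer would be 3274304 bytes.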
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 4e2e7a5ed283..dade3b0769c1 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -20,6 +20,8 @@
 #define MB_WIDTH(w)		DIV_ROUND_UP(w, MB_DIM)
 #define MB_HEIGHT(h)		DIV_ROUND_UP(h, MB_DIM)
 
+#define NUM_REF_PICTURES	(V4L2_HEVC_DPB_ENTRIES_NUM_MAX + 1)
+
 struct hantro_dev;
 struct hantro_ctx;
 struct hantro_buf;
@@ -90,6 +92,44 @@ struct hantro_h264_dec_hw_ctx {
 	struct hantro_h264_dec_ctrls ctrls;
 };
 
+/**
+ * struct hantro_hevc_dec_ctrls
+ * @decode_params: Decode params
+ * @sps:	SPS info
+ * @pps:	PPS info
+ * @hevc_hdr_skip_length: the number of bits to skip in the
+ *			  slice segment header syntax after the 'slice type'
+ *			  token
+ */
+struct hantro_hevc_dec_ctrls {
+	const struct v4l2_ctrl_hevc_decode_params *decode_params;
+	const struct v4l2_ctrl_hevc_sps *sps;
+	const struct v4l2_ctrl_hevc_pps *pps;
+	u32 hevc_hdr_skip_length;
+};
+
+/**
+ * struct hantro_hevc_dec_hw_ctx
+ * @tile_sizes:		Tile sizes buffer
+ * @tile_filter:	Tile vertical filter buffer
+ * @tile_sao:		Tile SAO buffer
+ * @tile_bsd:		Tile BSD control buffer
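+ * @ref_bufs:	Reference picture buffers (see also ref_bufs_poc, ref_bufs_used)
+ * @num_tile_cols_allocated: Tile column count the tile buffers are sized for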
+ * @ctrls:	V4L2 controls attached to a run
+ */
+struct hantro_hevc_dec_hw_ctx {
+	struct hantro_aux_buf tile_sizes;
+	struct hantro_aux_buf tile_filter;
+	struct hantro_aux_buf tile_sao;
+	struct hantro_aux_buf tile_bsd;
+	struct hantro_aux_buf ref_bufs[NUM_REF_PICTURES];
+	int ref_bufs_poc[NUM_REF_PICTURES];
+	u32 ref_bufs_used;
+	struct hantro_hevc_dec_ctrls ctrls;
+	unsigned int num_tile_cols_allocated;
+};
+
 /**
  * struct hantro_mpeg2_dec_hw_ctx
  * @qtable:		Quantization table
@@ -178,6 +218,15 @@ int hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
+int hantro_hevc_dec_init(struct hantro_ctx *ctx);
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx);
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc);
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
+
 static inline size_t
 hantro_h264_mv_size(unsigned int width, unsigned int height)
 {
-- 
2.25.1



* [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-03 11:39   ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Implement all the logic needed for the G2 hardware to decode HEVC frames.
It supports HEVC streams up to level 5.1.
It does not yet support 10-bit formats or the scaling feature.

Add a HANTRO HEVC dedicated control to skip some bits at the beginning
of the slice header. That is very specific to this hardware, so it can't
go into the uapi structures. Computing the needed value is complex and
requires information from the stream that only userland knows, so let
userland provide the correct value to the driver.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- address Ezequiel's comments
- use dedicated control as an integer
- change hantro_g2_hevc_dec_run prototype to return errors

version 2:
- squash multiple commits into this one.
- address Ezequiel's comments about dma_alloc_coherent usage
- address Dan's comments about control copy, reverse the test logic
in tile_buffer_reallocate, rework some goto and return cases.

 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  18 +
 drivers/staging/media/hantro/hantro_drv.c     |  53 ++
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
 drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  49 ++
 7 files changed, 1228 insertions(+)
 create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
 create mode 100644 drivers/staging/media/hantro/hantro_hevc.c

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 743ce08eb184..0357f1772267 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -9,12 +9,14 @@ hantro-vpu-y += \
 		hantro_h1_jpeg_enc.o \
 		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
+		hantro_g2_hevc_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		rk3399_vpu_hw_vp8_dec.o \
 		hantro_jpeg.o \
 		hantro_h264.o \
+		hantro_hevc.o \
 		hantro_mpeg2.o \
 		hantro_vp8.o
 
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 05876e426419..a9b80b2c9124 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -225,6 +225,7 @@ struct hantro_dev {
  * @jpeg_enc:		JPEG-encoding context.
  * @mpeg2_dec:		MPEG-2-decoding context.
  * @vp8_dec:		VP8-decoding context.
+ * @hevc_dec:		HEVC-decoding context.
  */
 struct hantro_ctx {
 	struct hantro_dev *dev;
@@ -251,6 +252,7 @@ struct hantro_ctx {
 		struct hantro_jpeg_enc_hw_ctx jpeg_enc;
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
+		struct hantro_hevc_dec_hw_ctx hevc_dec;
 	};
 };
 
@@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
 	return vb2_dma_contig_plane_dma_addr(vb, 0);
 }
 
+static inline size_t
+hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].size;
+	return vb2_plane_size(vb, 0);
+}
+
+static inline void *
+hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].cpu;
+	return vb2_plane_vaddr(vb, 0);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx);
 void hantro_postproc_enable(struct hantro_ctx *ctx);
 void hantro_postproc_free(struct hantro_ctx *ctx);
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e3e6df28f470..bc90a52f4d3d 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -30,6 +30,13 @@
 
 #define DRIVER_NAME "hantro-vpu"
 
+/*
+ * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
+ * the number of bits to skip in the slice segment
+ * header syntax after the 'slice type' token
+ */
+#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP	(V4L2_CID_USER_HANTRO_BASE + 0)
+
 int hantro_debug;
 module_param_named(debug, hantro_debug, int, 0644);
 MODULE_PARM_DESC(debug,
@@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
 	return 0;
 }
 
+static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+	struct hantro_ctx *ctx;
+
+	ctx = container_of(ctrl->handler,
+			   struct hantro_ctx, ctrl_handler);
+
+	vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
+
+	switch (ctrl->id) {
+	case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
+		ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
 	.try_ctrl = hantro_try_ctrl,
 };
@@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
 	.s_ctrl = hantro_jpeg_s_ctrl,
 };
 
+static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
+	.s_ctrl = hantro_hevc_s_ctrl,
+};
+
 static const struct hantro_ctrl controls[] = {
 	{
 		.codec = HANTRO_JPEG_ENCODER,
@@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
 		.cfg = {
 			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
 		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
+			.name = "Hantro HEVC slice header skip bits",
+			.type = V4L2_CTRL_TYPE_INTEGER,
+			.min = 0,
+			.def = 0,
+			.max = 0x7fffffff,
+			.step = 1,
+			.ops = &hantro_hevc_ctrl_ops,
+		},
+	}, {
+		.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
+			 HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
+			 HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_USER_CLASS,
+			.name = "HANTRO controls",
+			.type = V4L2_CTRL_TYPE_CTRL_CLASS,
+			.flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
+		},
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
new file mode 100644
index 000000000000..5d75b36bc40c
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -0,0 +1,587 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include "hantro_hw.h"
+#include "hantro_g2_regs.h"
+
+#define HEVC_DEC_MODE	0xC
+
+#define BUS_WIDTH_32		0
+#define BUS_WIDTH_64		1
+#define BUS_WIDTH_128		2
+#define BUS_WIDTH_256		3
+
+static inline void hantro_write_addr(struct hantro_dev *vpu,
+				     unsigned long offset,
+				     dma_addr_t addr)
+{
+	vdpu_write(vpu, addr & 0xffffffff, offset);
+}
+
+static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
+	unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
+	unsigned int max_log2_ctb_size, ctb_size;
+	bool tiles_enabled, uniform_spacing;
+	u32 no_chroma = 0;
+
+	tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
+	uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
+
+	hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
+
+	max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
+			    sps->log2_diff_max_min_luma_coding_block_size;
+	pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
+			    (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
+	pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
+			     >> max_log2_ctb_size;
+	ctb_size = 1 << max_log2_ctb_size;
+
+	vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
+		  pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
+
+	if (tiles_enabled) {
+		unsigned int i, j, h;
+
+		vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
+
+		hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
+		hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
+
+		/* write width + height for each tile in pic */
+		if (!uniform_spacing) {
+			u32 tmp_w = 0, tmp_h = 0;
+
+			for (i = 0; i < num_tile_rows; i++) {
+				if (i == num_tile_rows - 1)
+					h = pic_height_in_ctbs - tmp_h;
+				else
+					h = pps->row_height_minus1[i] + 1;
+				tmp_h += h;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
+					tmp_w += pps->column_width_minus1[j] + 1;
+					*p++ = pps->column_width_minus1[j] + 1;
+					*p++ = h;
+					if (i == 0 && h == 1 && ctb_size == 16)
+						no_chroma = 1;
+				}
+				/* last column */
+				*p++ = pic_width_in_ctbs - tmp_w;
+				*p++ = h;
+			}
+		} else { /* uniform spacing */
+			u32 tmp, prev_h, prev_w;
+
+			for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
+				tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
+				h = tmp - prev_h;
+				prev_h = tmp;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
+					tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
+					*p++ = tmp - prev_w;
+					*p++ = h;
+					if (j == 0 &&
+					    (pps->column_width_minus1[0] + 1) == 1 &&
+					    ctb_size == 16)
+						no_chroma = 1;
+					prev_w = tmp;
+				}
+			}
+		}
+	} else {
+		hantro_reg_write(vpu, hevc_num_tile_rows, 1);
+		hantro_reg_write(vpu, hevc_num_tile_cols, 1);
+
+		/* There's one tile, with dimensions equal to pic size. */
+		p[0] = pic_width_in_ctbs;
+		p[1] = pic_height_in_ctbs;
+	}
+
+	if (no_chroma)
+		vpu_debug(1, "%s: no chroma!\n", __func__);
+}
+
+static void set_params(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	struct hantro_dev *vpu = ctx->dev;
+	u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
+	u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
+	u32 pic_width_aligned, pic_height_aligned;
+	u32 partial_ctb_x, partial_ctb_y;
+
+	hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
+	hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
+
+	hantro_reg_write(vpu, hevc_output_8_bits, 0);
+
+	hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
+
+	min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
+	max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
+
+	hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
+	hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
+
+	min_cb_size = 1 << min_log2_cb_size;
+	max_ctb_size = 1 << max_log2_ctb_size;
+
+	pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
+	pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
+	pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
+	pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
+
+	partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
+	partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
+
+	hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
+	hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
+
+	hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
+	hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
+
+	hantro_reg_write(vpu, hevc_pic_width_4x4,
+			 (pic_width_in_min_cbs * min_cb_size) / 4);
+	hantro_reg_write(vpu, hevc_pic_height_4x4,
+			 (pic_height_in_min_cbs * min_cb_size) / 4);
+
+	hantro_reg_write(vpu, hevc_max_inter_hierdepth,
+			 sps->max_transform_hierarchy_depth_inter);
+	hantro_reg_write(vpu, hevc_max_intra_hierdepth,
+			 sps->max_transform_hierarchy_depth_intra);
+	hantro_reg_write(vpu, hevc_min_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2);
+	hantro_reg_write(vpu, hevc_max_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2 +
+			 sps->log2_diff_max_min_luma_transform_block_size);
+
+	hantro_reg_write(vpu, hevc_tempor_mvp_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
+			 !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
+	hantro_reg_write(vpu, hevc_strong_smooth_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
+	hantro_reg_write(vpu, hevc_asym_pred_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
+	hantro_reg_write(vpu, hevc_sao_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
+	hantro_reg_write(vpu, hevc_sign_data_hide,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
+	} else {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
+	}
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
+	} else {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
+	hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
+	hantro_reg_write(vpu, hevc_slice_chqp_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
+	hantro_reg_write(vpu, hevc_weight_bipr_idc,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
+	hantro_reg_write(vpu, hevc_transq_bypass,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
+	hantro_reg_write(vpu, hevc_list_mod_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
+	hantro_reg_write(vpu, hevc_entropy_sync_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_idr_pic_e,
+			 !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
+	hantro_reg_write(vpu, hevc_parallel_merge,
+			 pps->log2_parallel_merge_level_minus2 + 2);
+	hantro_reg_write(vpu, hevc_pcm_filt_d,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
+	hantro_reg_write(vpu, hevc_pcm_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
+	if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
+		hantro_reg_write(vpu, hevc_max_pcm_size,
+				 sps->log2_diff_max_min_pcm_luma_coding_block_size +
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_min_pcm_size,
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
+				 sps->pcm_sample_bit_depth_luma_minus1 + 1);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
+				 sps->pcm_sample_bit_depth_chroma_minus1 + 1);
+	} else {
+		hantro_reg_write(vpu, hevc_max_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_min_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_start_code_e, 1);
+	hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
+	hantro_reg_write(vpu, hevc_weight_pred_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_const_intra_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+	hantro_reg_write(vpu, hevc_transform_skip,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
+	hantro_reg_write(vpu, hevc_out_filtering_dis,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
+	hantro_reg_write(vpu, hevc_filt_ctrl_pres,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
+	hantro_reg_write(vpu, hevc_dependent_slice,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
+	hantro_reg_write(vpu, hevc_filter_override,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
+	hantro_reg_write(vpu, hevc_refidx0_active,
+			 pps->num_ref_idx_l0_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_refidx1_active,
+			 pps->num_ref_idx_l1_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_apf_threshold, 8);
+}
+
+static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
+{
+	int i;
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
+			return i;
+	}
+
+	return 0x0;
+}
+
+static void set_ref_pic_list(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	const struct hantro_reg *ref_pic_regs0[] = {
+		hevc_rlist_f0,
+		hevc_rlist_f1,
+		hevc_rlist_f2,
+		hevc_rlist_f3,
+		hevc_rlist_f4,
+		hevc_rlist_f5,
+		hevc_rlist_f6,
+		hevc_rlist_f7,
+		hevc_rlist_f8,
+		hevc_rlist_f9,
+		hevc_rlist_f10,
+		hevc_rlist_f11,
+		hevc_rlist_f12,
+		hevc_rlist_f13,
+		hevc_rlist_f14,
+		hevc_rlist_f15,
+	};
+	const struct hantro_reg *ref_pic_regs1[] = {
+		hevc_rlist_b0,
+		hevc_rlist_b1,
+		hevc_rlist_b2,
+		hevc_rlist_b3,
+		hevc_rlist_b4,
+		hevc_rlist_b5,
+		hevc_rlist_b6,
+		hevc_rlist_b7,
+		hevc_rlist_b8,
+		hevc_rlist_b9,
+		hevc_rlist_b10,
+		hevc_rlist_b11,
+		hevc_rlist_b12,
+		hevc_rlist_b13,
+		hevc_rlist_b14,
+		hevc_rlist_b15,
+	};
+	unsigned int i, j;
+
+	/* List 0 contains: short term before, short term after and long term */
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	/* Fill the list, copying over and over */
+	i = 0;
+	while (j < ARRAY_SIZE(list0))
+		list0[j++] = list0[i++];
+
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	i = 0;
+	while (j < ARRAY_SIZE(list1))
+		list1[j++] = list1[i++];
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
+		hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
+	}
+}
+
+static int set_ref(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
+	struct hantro_dev *vpu = ctx->dev;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
+	u32 max_ref_frames;
+	u16 dpb_longterm_e;
+
+	const struct hantro_reg *cur_poc[] = {
+		hevc_cur_poc_00,
+		hevc_cur_poc_01,
+		hevc_cur_poc_02,
+		hevc_cur_poc_03,
+		hevc_cur_poc_04,
+		hevc_cur_poc_05,
+		hevc_cur_poc_06,
+		hevc_cur_poc_07,
+		hevc_cur_poc_08,
+		hevc_cur_poc_09,
+		hevc_cur_poc_10,
+		hevc_cur_poc_11,
+		hevc_cur_poc_12,
+		hevc_cur_poc_13,
+		hevc_cur_poc_14,
+		hevc_cur_poc_15,
+	};
+	unsigned int i;
+
+	max_ref_frames = decode_params->num_poc_lt_curr +
+		decode_params->num_poc_st_curr_before +
+		decode_params->num_poc_st_curr_after;
+	/*
+	 * Set max_ref_frames to non-zero to avoid HW hang when decoding
+	 * badly marked I-frames.
+	 */
+	max_ref_frames = max_ref_frames ? max_ref_frames : 1;
+	hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
+	hantro_reg_write(vpu, hevc_filter_over_slices,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
+	hantro_reg_write(vpu, hevc_filter_over_tiles,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+
+	/*
+	 * Write POC count diff from current pic. For frame decoding only compute
+	 * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
+	 */
+	for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
+		char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
+
+		hantro_reg_write(vpu, cur_poc[i], poc_diff);
+	}
+
+	if (i < ARRAY_SIZE(cur_poc)) {
+		/*
+		 * After the references, fill one entry pointing to itself,
+		 * i.e. difference is zero.
+		 */
+		hantro_reg_write(vpu, cur_poc[i], 0);
+		i++;
+	}
+
+	/* Fill the rest with the current picture */
+	for (; i < ARRAY_SIZE(cur_poc); i++)
+		hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
+
+	set_ref_pic_list(ctx);
+
+	/* We will only keep the reference pictures that are still used */
+	ctx->hevc_dec.ref_bufs_used = 0;
+
+	/* Set up addresses of DPB buffers */
+	dpb_longterm_e = 0;
+	for (i = 0; i < decode_params->num_active_dpb_entries &&
+	     i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
+		luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
+		if (!luma_addr)
+			return -ENOMEM;
+
+		chroma_addr = luma_addr + cr_offset;
+		mv_addr = luma_addr + mv_offset;
+
+		if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
+			dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
+
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
+	}
+
+	luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
+	if (!luma_addr)
+		return -ENOMEM;
+
+	chroma_addr = luma_addr + cr_offset;
+	mv_addr = luma_addr + mv_offset;
+
+	hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+	hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+	hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
+
+	hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
+
+	hantro_hevc_ref_remove_unused(ctx);
+
+	for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
+	}
+
+	hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
+
+	return 0;
+}
+
+static void set_buffers(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	dma_addr_t src_dma, dst_dma;
+	u32 src_len, src_buf_len;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/* Source (stream) buffer. */
+	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
+	src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
+
+	hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
+	hantro_reg_write(vpu, hevc_stream_len, src_len);
+	hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
+	hantro_reg_write(vpu, hevc_strm_start_offset, 0);
+	hantro_reg_write(vpu, hevc_write_mvs_e, 1);
+
+	/* Destination (decoded frame) buffer. */
+	dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
+
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
+	hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
+	hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
+	hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
+	hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
+}
+
+void hantro_g2_check_idle(struct hantro_dev *vpu)
+{
+	int i;
+
+	for (i = 0; i < 3; i++) {
+		u32 status;
+
+		/* Make sure the VPU is idle */
+		status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
+		if (status & HEVC_REG_INTERRUPT_DEC_E) {
+			pr_warn("%s: still enabled!!! resetting.\n", __func__);
+			status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
+			vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
+		}
+	}
+}
+
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	int ret;
+
+	hantro_g2_check_idle(vpu);
+
+	/* Prepare HEVC decoder context. */
+	ret = hantro_hevc_dec_prepare_run(ctx);
+	if (ret)
+		return ret;
+
+	/* Configure hardware registers. */
+	set_params(ctx);
+
+	/* set reference pictures */
+	ret = set_ref(ctx);
+	if (ret)
+		return ret;
+
+	set_buffers(ctx);
+	prepare_tile_info_buffer(ctx);
+
+	hantro_end_prepare_run(ctx);
+
+	hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
+	hantro_reg_write(vpu, hevc_clk_gate_e, 1);
+
+	/* Don't disable output */
+	hantro_reg_write(vpu, hevc_out_dis, 0);
+
+	/* Don't compress buffers */
+	hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
+
+	/* use NV12 as output format */
+	hantro_reg_write(vpu, hevc_out_rs_e, 1);
+
+	/* Bus width and max burst */
+	hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
+	hantro_reg_write(vpu, hevc_max_burst, 16);
+
+	/* Swap */
+	hantro_reg_write(vpu, hevc_strm_swap, 0xf);
+	hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
+	hantro_reg_write(vpu, hevc_compress_swap, 0xf);
+
+	/* Start decoding! */
+	vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
+
+	return 0;
+}
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
new file mode 100644
index 000000000000..a361c9ba911d
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -0,0 +1,198 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021, Collabora
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
+ */
+
+#ifndef HANTRO_G2_REGS_H_
+#define HANTRO_G2_REGS_H_
+
+#include "hantro.h"
+
+#define G2_SWREG(nr)	((nr) * 4)
+
+#define HEVC_DEC_REG(name, base, shift, mask) \
+	static const struct hantro_reg _hevc_##name[] = { \
+		{ G2_SWREG(base), (shift), (mask) } \
+	}; \
+	static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
+
+#define HEVC_REG_VERSION		G2_SWREG(0)
+
+#define HEVC_REG_INTERRUPT		G2_SWREG(1)
+#define HEVC_REG_INTERRUPT_DEC_RDY_INT	BIT(12)
+#define HEVC_REG_INTERRUPT_DEC_ABORT_E	BIT(5)
+#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS	BIT(4)
+#define HEVC_REG_INTERRUPT_DEC_E	BIT(0)
+
+HEVC_DEC_REG(strm_swap,		2, 28,	0xf)
+HEVC_DEC_REG(dirmv_swap,	2, 20,	0xf)
+
+HEVC_DEC_REG(mode,		  3, 27, 0x1f)
+HEVC_DEC_REG(compress_swap,	  3, 20, 0xf)
+HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
+HEVC_DEC_REG(out_rs_e,		  3, 16, 0x1)
+HEVC_DEC_REG(out_dis,		  3, 15, 0x1)
+HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
+HEVC_DEC_REG(write_mvs_e,	  3, 12, 0x1)
+
+HEVC_DEC_REG(pic_width_in_cbs,	4, 19,	0x1ff)
+HEVC_DEC_REG(pic_height_in_cbs,	4, 6,	0x1ff)
+HEVC_DEC_REG(num_ref_frames,	4, 0,	0x1f)
+
+HEVC_DEC_REG(scaling_list_e,	5, 24,	0x1)
+HEVC_DEC_REG(cb_qp_offset,	5, 19,	0x1f)
+HEVC_DEC_REG(cr_qp_offset,	5, 14,	0x1f)
+HEVC_DEC_REG(sign_data_hide,	5, 12,	0x1)
+HEVC_DEC_REG(tempor_mvp_e,	5, 11,	0x1)
+HEVC_DEC_REG(max_cu_qpd_depth,	5, 5,	0x3f)
+HEVC_DEC_REG(cu_qpd_e,		5, 4,	0x1)
+
+HEVC_DEC_REG(stream_len,	6, 0,	0xffffffff)
+
+HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
+HEVC_DEC_REG(weight_pred_e,	 7, 28, 0x1)
+HEVC_DEC_REG(weight_bipr_idc,	 7, 26, 0x3)
+HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
+HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
+HEVC_DEC_REG(asym_pred_e,	 7, 23, 0x1)
+HEVC_DEC_REG(sao_e,		 7, 22, 0x1)
+HEVC_DEC_REG(pcm_filt_d,	 7, 21, 0x1)
+HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
+HEVC_DEC_REG(dependent_slice,	 7, 19, 0x1)
+HEVC_DEC_REG(filter_override,	 7, 18, 0x1)
+HEVC_DEC_REG(strong_smooth_e,	 7, 17, 0x1)
+HEVC_DEC_REG(filt_offset_beta,	 7, 12, 0x1f)
+HEVC_DEC_REG(filt_offset_tc,	 7, 7,  0x1f)
+HEVC_DEC_REG(slice_hdr_ext_e,	 7, 6,	0x1)
+HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3,	0x7)
+
+HEVC_DEC_REG(const_intra_e,	 8, 31, 0x1)
+HEVC_DEC_REG(filt_ctrl_pres,	 8, 30, 0x1)
+HEVC_DEC_REG(idr_pic_e,		 8, 16, 0x1)
+HEVC_DEC_REG(bit_depth_pcm_y,	 8, 12, 0xf)
+HEVC_DEC_REG(bit_depth_pcm_c,	 8, 8,  0xf)
+HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
+HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
+HEVC_DEC_REG(output_8_bits,	 8, 3,  0x1)
+
+HEVC_DEC_REG(refidx1_active,	9, 19,	0x1f)
+HEVC_DEC_REG(refidx0_active,	9, 14,	0x1f)
+HEVC_DEC_REG(hdr_skip_length,	9, 0,	0x3fff)
+
+HEVC_DEC_REG(start_code_e,	10, 31, 0x1)
+HEVC_DEC_REG(init_qp,		10, 24, 0x3f)
+HEVC_DEC_REG(num_tile_cols,	10, 19, 0x1f)
+HEVC_DEC_REG(num_tile_rows,	10, 14, 0x1f)
+HEVC_DEC_REG(tile_e,		10, 1,	0x1)
+HEVC_DEC_REG(entropy_sync_e,	10, 0,	0x1)
+
+HEVC_DEC_REG(refer_lterm_e,	12, 16, 0xffff)
+HEVC_DEC_REG(min_cb_size,	12, 13, 0x7)
+HEVC_DEC_REG(max_cb_size,	12, 10, 0x7)
+HEVC_DEC_REG(min_pcm_size,	12, 7,  0x7)
+HEVC_DEC_REG(max_pcm_size,	12, 4,  0x7)
+HEVC_DEC_REG(pcm_e,		12, 3,  0x1)
+HEVC_DEC_REG(transform_skip,	12, 2,	0x1)
+HEVC_DEC_REG(transq_bypass,	12, 1,	0x1)
+HEVC_DEC_REG(list_mod_e,	12, 0,	0x1)
+
+HEVC_DEC_REG(min_trb_size,	  13, 13, 0x7)
+HEVC_DEC_REG(max_trb_size,	  13, 10, 0x7)
+HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
+HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
+HEVC_DEC_REG(parallel_merge,	  13, 0,  0xf)
+
+HEVC_DEC_REG(rlist_f0,		14, 0,	0x1f)
+HEVC_DEC_REG(rlist_f1,		14, 10,	0x1f)
+HEVC_DEC_REG(rlist_f2,		14, 20,	0x1f)
+HEVC_DEC_REG(rlist_b0,		14, 5,	0x1f)
+HEVC_DEC_REG(rlist_b1,		14, 15, 0x1f)
+HEVC_DEC_REG(rlist_b2,		14, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f3,		15, 0,	0x1f)
+HEVC_DEC_REG(rlist_f4,		15, 10, 0x1f)
+HEVC_DEC_REG(rlist_f5,		15, 20, 0x1f)
+HEVC_DEC_REG(rlist_b3,		15, 5,	0x1f)
+HEVC_DEC_REG(rlist_b4,		15, 15, 0x1f)
+HEVC_DEC_REG(rlist_b5,		15, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f6,		16, 0,	0x1f)
+HEVC_DEC_REG(rlist_f7,		16, 10, 0x1f)
+HEVC_DEC_REG(rlist_f8,		16, 20, 0x1f)
+HEVC_DEC_REG(rlist_b6,		16, 5,	0x1f)
+HEVC_DEC_REG(rlist_b7,		16, 15, 0x1f)
+HEVC_DEC_REG(rlist_b8,		16, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f9,		17, 0,	0x1f)
+HEVC_DEC_REG(rlist_f10,		17, 10, 0x1f)
+HEVC_DEC_REG(rlist_f11,		17, 20, 0x1f)
+HEVC_DEC_REG(rlist_b9,		17, 5,	0x1f)
+HEVC_DEC_REG(rlist_b10,		17, 15, 0x1f)
+HEVC_DEC_REG(rlist_b11,		17, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f12,		18, 0,	0x1f)
+HEVC_DEC_REG(rlist_f13,		18, 10, 0x1f)
+HEVC_DEC_REG(rlist_f14,		18, 20, 0x1f)
+HEVC_DEC_REG(rlist_b12,		18, 5,	0x1f)
+HEVC_DEC_REG(rlist_b13,		18, 15, 0x1f)
+HEVC_DEC_REG(rlist_b14,		18, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f15,		19, 0,	0x1f)
+HEVC_DEC_REG(rlist_b15,		19, 5,	0x1f)
+
+HEVC_DEC_REG(partial_ctb_x,	20, 31, 0x1)
+HEVC_DEC_REG(partial_ctb_y,	20, 30, 0x1)
+HEVC_DEC_REG(pic_width_4x4,	20, 16, 0xfff)
+HEVC_DEC_REG(pic_height_4x4,	20, 0,  0xfff)
+
+HEVC_DEC_REG(cur_poc_00,	46, 24,	0xff)
+HEVC_DEC_REG(cur_poc_01,	46, 16,	0xff)
+HEVC_DEC_REG(cur_poc_02,	46, 8,	0xff)
+HEVC_DEC_REG(cur_poc_03,	46, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_04,	47, 24,	0xff)
+HEVC_DEC_REG(cur_poc_05,	47, 16,	0xff)
+HEVC_DEC_REG(cur_poc_06,	47, 8,	0xff)
+HEVC_DEC_REG(cur_poc_07,	47, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_08,	48, 24,	0xff)
+HEVC_DEC_REG(cur_poc_09,	48, 16,	0xff)
+HEVC_DEC_REG(cur_poc_10,	48, 8,	0xff)
+HEVC_DEC_REG(cur_poc_11,	48, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_12,	49, 24, 0xff)
+HEVC_DEC_REG(cur_poc_13,	49, 16, 0xff)
+HEVC_DEC_REG(cur_poc_14,	49, 8,	0xff)
+HEVC_DEC_REG(cur_poc_15,	49, 0,	0xff)
+
+HEVC_DEC_REG(apf_threshold,	55, 0,	0xffff)
+
+HEVC_DEC_REG(clk_gate_e,	58, 16,	0x1)
+HEVC_DEC_REG(buswidth,		58, 8,	0x7)
+HEVC_DEC_REG(max_burst,		58, 0,	0xff)
+
+#define HEVC_REG_CONFIG				G2_SWREG(58)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_E		BIT(16)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E	BIT(17)
+
+#define HEVC_ADDR_DST		(G2_SWREG(65))
+#define HEVC_REG_ADDR_REF(i)	(G2_SWREG(67)  + ((i) * 0x8))
+#define HEVC_ADDR_DST_CHR	(G2_SWREG(99))
+#define HEVC_REG_CHR_REF(i)	(G2_SWREG(101) + ((i) * 0x8))
+#define HEVC_ADDR_DST_MV	(G2_SWREG(133))
+#define HEVC_REG_DMV_REF(i)	(G2_SWREG(135) + ((i) * 0x8))
+#define HEVC_ADDR_TILE_SIZE	(G2_SWREG(167))
+#define HEVC_ADDR_STR		(G2_SWREG(169))
+#define HEVC_SCALING_LIST	(G2_SWREG(171))
+#define HEVC_RASTER_SCAN	(G2_SWREG(175))
+#define HEVC_RASTER_SCAN_CHR	(G2_SWREG(177))
+#define HEVC_TILE_FILTER	(G2_SWREG(179))
+#define HEVC_TILE_SAO		(G2_SWREG(181))
+#define HEVC_TILE_BSD		(G2_SWREG(183))
+
+HEVC_DEC_REG(strm_buffer_len,	258, 0,	0xffffffff)
+HEVC_DEC_REG(strm_start_offset,	259, 0,	0xffffffff)
+
+#endif
diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
new file mode 100644
index 000000000000..8e319a837ff3
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_hevc.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include <linux/types.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+
+#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
+/*
+ * BSD control data of current picture at tile border
+ * 128 bits per 4x4 tile = 128/(8*4) bytes per row
+ */
+#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
+/* tile border coefficients of filter */
+#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel row */
+
+#define MAX_TILE_COLS 20
+#define MAX_TILE_ROWS 22
+
+#define UNUSED_REF	-1
+
+#define G2_ALIGN		16
+#define MC_WORD_SIZE		32
+
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
+
+	return sps->pic_width_in_luma_samples *
+		sps->pic_height_in_luma_samples * bytes_per_pixel;
+}
+
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+
+	return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
+}
+
+static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
+	u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
+	size_t mv_size;
+
+	mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
+		  (1 << (2 * (8 - 4))) * 16) + 32;
+
+	vpu_debug(4, "%ux%u (CTBs) %zu MV bytes\n",
+		  pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
+
+	return mv_size;
+}
+
+static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+
+	return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
+}
+
+static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	struct hantro_dev *vpu = ctx->dev;
+	int i;
+
+	/* Free every reference buffer that has been allocated */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs[i].cpu) {
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
+			dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
+					  hevc_dec->ref_bufs[i].cpu,
+					  hevc_dec->ref_bufs[i].dma);
+		}
+	}
+}
+
+static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	for (i = 0;  i < NUM_REF_PICTURES; i++)
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+}
+
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
+				   int poc)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Find the reference buffer among the already known ones */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == poc) {
+			hevc_dec->ref_bufs_used |= 1 << i;
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	/* Allocate a new reference buffer */
+	for (i = 0; i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
+			if (!hevc_dec->ref_bufs[i].cpu) {
+				struct hantro_dev *vpu = ctx->dev;
+
+				hevc_dec->ref_bufs[i].cpu =
+					dma_alloc_coherent(vpu->dev,
+							   hantro_hevc_ref_size(ctx),
+							   &hevc_dec->ref_bufs[i].dma,
+							   GFP_KERNEL);
+				if (!hevc_dec->ref_bufs[i].cpu)
+					return 0;
+
+				hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
+			}
+			hevc_dec->ref_bufs_used |= 1 << i;
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
+			hevc_dec->ref_bufs_poc[i] = poc;
+
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	return 0;
+}
+
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Just tag the buffers as unused, do not free them */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF)
+			continue;
+
+		if (hevc_dec->ref_bufs_used & (1 << i))
+			continue;
+
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+	}
+}
+
+static int tile_buffer_reallocate(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int height64 = (sps->pic_height_in_luma_samples + 63) & ~63;
+	unsigned int size;
+
+	if (num_tile_cols <= 1 ||
+	    num_tile_cols <= hevc_dec->num_tile_cols_allocated)
+		return 0;
+
+	/* Need to reallocate due to tiles passed via PPS */
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+
+	size = VERT_FILTER_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_filter.cpu = dma_alloc_coherent(vpu->dev, size,
+						       &hevc_dec->tile_filter.dma,
+						       GFP_KERNEL);
+	if (!hevc_dec->tile_filter.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_filter.size = size;
+
+	size = VERT_SAO_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_sao.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_sao.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_sao.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_sao.size = size;
+
+	size = BSD_CTRL_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_bsd.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_bsd.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_bsd.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_bsd.size = size;
+
+	hevc_dec->num_tile_cols_allocated = num_tile_cols;
+
+	return 0;
+
+err_free_tile_buffers:
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = 0;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = 0;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = 0;
+
+	return -ENOMEM;
+}
+
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_ctx = &ctx->hevc_dec;
+	struct hantro_hevc_dec_ctrls *ctrls = &hevc_ctx->ctrls;
+	int ret;
+
+	hantro_start_prepare_run(ctx);
+
+	ctrls->decode_params =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS);
+	if (WARN_ON(!ctrls->decode_params))
+		return -EINVAL;
+
+	ctrls->sps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_SPS);
+	if (WARN_ON(!ctrls->sps))
+		return -EINVAL;
+
+	ctrls->pps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_PPS);
+	if (WARN_ON(!ctrls->pps))
+		return -EINVAL;
+
+	ret = tile_buffer_reallocate(ctx);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+
+	if (hevc_dec->tile_sizes.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sizes.size,
+				  hevc_dec->tile_sizes.cpu,
+				  hevc_dec->tile_sizes.dma);
+	hevc_dec->tile_sizes.cpu = 0;
+
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = 0;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = 0;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = 0;
+
+	hantro_hevc_ref_free(ctx);
+}
+
+int hantro_hevc_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	unsigned int size;
+
+	memset(hevc_dec, 0, sizeof(*hevc_dec));
+
+	/*
+	 * Maximum number of tiles times width and height (2 bytes each),
+	 * rounding up to next 16 bytes boundary + one extra 16 byte
+	 * chunk (HW guys wanted to have this).
+	 */
+	size = round_up(MAX_TILE_COLS * MAX_TILE_ROWS * 4 * sizeof(u16) + 16, 16);
+	hevc_dec->tile_sizes.cpu = dma_alloc_coherent(vpu->dev, size,
+						      &hevc_dec->tile_sizes.dma,
+						      GFP_KERNEL);
+	if (!hevc_dec->tile_sizes.cpu)
+		return -ENOMEM;
+
+	hevc_dec->tile_sizes.size = size;
+
+	hantro_hevc_ref_init(ctx);
+
+	return 0;
+}
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 4e2e7a5ed283..dade3b0769c1 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -20,6 +20,8 @@
 #define MB_WIDTH(w)		DIV_ROUND_UP(w, MB_DIM)
 #define MB_HEIGHT(h)		DIV_ROUND_UP(h, MB_DIM)
 
+#define NUM_REF_PICTURES	(V4L2_HEVC_DPB_ENTRIES_NUM_MAX + 1)
+
 struct hantro_dev;
 struct hantro_ctx;
 struct hantro_buf;
@@ -90,6 +92,44 @@ struct hantro_h264_dec_hw_ctx {
 	struct hantro_h264_dec_ctrls ctrls;
 };
 
+/**
+ * struct hantro_hevc_dec_ctrls
+ * @decode_params: Decode params
+ * @sps:	SPS info
+ * @pps:	PPS info
+ * @hevc_hdr_skip_length: the number of bits to skip in the slice
+ *			  segment header syntax after the 'slice type'
+ *			  token
+ */
+struct hantro_hevc_dec_ctrls {
+	const struct v4l2_ctrl_hevc_decode_params *decode_params;
+	const struct v4l2_ctrl_hevc_sps *sps;
+	const struct v4l2_ctrl_hevc_pps *pps;
+	u32 hevc_hdr_skip_length;
+};
+
+/**
+ * struct hantro_hevc_dec_hw_ctx
+ * @tile_sizes:		Tile sizes buffer
+ * @tile_filter:	Tile vertical filter buffer
+ * @tile_sao:		Tile SAO buffer
+ * @tile_bsd:		Tile BSD control buffer
+ * @ref_bufs:	Reference picture buffers
+ * @ref_bufs_poc: POC of the picture stored in each reference buffer
+ * @ctrls:	V4L2 controls attached to a run
+ */
+struct hantro_hevc_dec_hw_ctx {
+	struct hantro_aux_buf tile_sizes;
+	struct hantro_aux_buf tile_filter;
+	struct hantro_aux_buf tile_sao;
+	struct hantro_aux_buf tile_bsd;
+	struct hantro_aux_buf ref_bufs[NUM_REF_PICTURES];
+	int ref_bufs_poc[NUM_REF_PICTURES];
+	u32 ref_bufs_used;
+	struct hantro_hevc_dec_ctrls ctrls;
+	unsigned int num_tile_cols_allocated;
+};
+
 /**
  * struct hantro_mpeg2_dec_hw_ctx
  * @qtable:		Quantization table
@@ -178,6 +218,15 @@ int hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
+int hantro_hevc_dec_init(struct hantro_ctx *ctx);
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx);
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc);
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
+
 static inline size_t
 hantro_h264_mv_size(unsigned int width, unsigned int height)
 {
-- 
2.25.1
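
Reviewer note, not part of the patch: the per-reference buffer layout implied
by hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and
hantro_hevc_mv_size() can be checked in user space with the sketch below. It
only mirrors the arithmetic of the hunks above. The helper names (align_up,
ref_chroma_offset, ref_mv_offset, ref_mv_size, ref_buf_size) and the plain
width/height/bit-depth parameters are hypothetical stand-ins for the SPS
fields, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN	16	/* same value as in hantro_hevc.c */
#define MC_WORD_SIZE	32	/* same value as in hantro_hevc.c */

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Size of the luma plane, which is also the offset of the chroma plane. */
static size_t ref_chroma_offset(uint32_t width, uint32_t height,
				unsigned int bit_depth_luma_minus8)
{
	size_t bytes_per_pixel = bit_depth_luma_minus8 == 0 ? 1 : 2;

	return (size_t)width * height * bytes_per_pixel;
}

/* Offset of the motion vector area: 4:2:0 picture, aligned, plus one MC word. */
static size_t ref_mv_offset(uint32_t width, uint32_t height,
			    unsigned int bit_depth_luma_minus8)
{
	size_t cr_offset = ref_chroma_offset(width, height,
					     bit_depth_luma_minus8);

	return align_up(cr_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Motion vector area: 256 entries of 16 bytes per 256x256 block, plus 32. */
static size_t ref_mv_size(uint32_t width, uint32_t height)
{
	size_t blocks_w = ((size_t)width + 255) >> 8;
	size_t blocks_h = ((size_t)height + 255) >> 8;

	return blocks_w * blocks_h * 256 * 16 + 32;
}

/* Total size of one reference buffer: picture data followed by MVs. */
static size_t ref_buf_size(uint32_t width, uint32_t height,
			   unsigned int bit_depth_luma_minus8)
{
	return ref_mv_offset(width, height, bit_depth_luma_minus8) +
	       ref_mv_size(width, height);
}
```

For a 1920x1080 8-bit stream this gives a chroma offset of 2073600 bytes, a
motion vector offset of 3110432 bytes and 3274304 bytes per reference buffer.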



* [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-03 11:39   ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Implement all the logic needed to get the G2 hardware decoding HEVC
frames.
It supports HEVC streams up to level 5.1.
It does not yet support 10-bit formats or the scaling feature.

Add a HANTRO HEVC dedicated control to skip some bits at the beginning
of the slice header. This is very specific to this hardware, so it can't
go into the uapi structures. Computing the needed value is complex and
requires information from the stream that only userland knows, so let
userland provide the correct value to the driver.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- address Ezequiel's comments
- use the dedicated control as an integer
- change the hantro_g2_hevc_dec_run prototype to return errors

version 2:
- squash multiple commits into this one.
- address Ezequiel's comments about dma_alloc_coherent usage
- address Dan's comments about control copy, reverse the test logic
in tile_buffer_reallocate, rework some goto and return cases.
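
Note for reviewers, not part of the commit: the sizes that
tile_buffer_reallocate() requests for the three tile border buffers can be
reproduced with the sketch below. It assumes the same per-row constants as
hantro_hevc.c. The struct and function names (tile_buf_sizes,
tile_border_sizes) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VERT_FILTER_RAM_SIZE	8	/* bytes per pixel row */
#define BSD_CTRL_RAM_SIZE	4	/* bytes per pixel row */
#define VERT_SAO_RAM_SIZE	48	/* bytes per pixel row, as used here */

struct tile_buf_sizes {
	size_t filter;
	size_t sao;
	size_t bsd;
};

/*
 * One border buffer slice is needed per internal tile column boundary,
 * covering the picture height rounded up to a multiple of 64 rows.
 * With a single tile column there is no boundary and all sizes are 0.
 */
static struct tile_buf_sizes tile_border_sizes(uint32_t pic_height,
					       unsigned int num_tile_cols)
{
	struct tile_buf_sizes s = { 0, 0, 0 };
	size_t height64 = ((size_t)pic_height + 63) & ~(size_t)63;
	size_t borders;

	assert(num_tile_cols >= 1);
	borders = num_tile_cols - 1;

	s.filter = VERT_FILTER_RAM_SIZE * height64 * borders;
	s.sao = VERT_SAO_RAM_SIZE * height64 * borders;
	s.bsd = BSD_CTRL_RAM_SIZE * height64 * borders;

	return s;
}
```

With a 1080-row picture and 4 tile columns, the height is rounded to 1088 and
the three buffers come out at 26112, 156672 and 13056 bytes.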

 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  18 +
 drivers/staging/media/hantro/hantro_drv.c     |  53 ++
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
 drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  49 ++
 7 files changed, 1228 insertions(+)
 create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
 create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
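
Note for reviewers, not part of the commit: the uniform-spacing branch of
prepare_tile_info_buffer() in hantro_g2_hevc_dec.c derives each tile size from
the difference between two consecutive tile edges. A minimal user-space sketch
of that computation for one dimension follows. The function name
uniform_tile_sizes and its parameters are hypothetical, and the driver itself
writes width/height pairs into the tile sizes buffer rather than a flat array.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split pic_size_in_ctbs into num_tiles tiles with uniform spacing and
 * store each tile size (in CTBs) in sizes[]. Tile i ends at the edge
 * (i + 1) * pic_size_in_ctbs / num_tiles, using integer division.
 */
static void uniform_tile_sizes(unsigned int pic_size_in_ctbs,
			       unsigned int num_tiles, uint16_t *sizes)
{
	unsigned int i, edge, prev = 0;

	assert(num_tiles > 0 && sizes);

	for (i = 0; i < num_tiles; i++) {
		edge = (i + 1) * pic_size_in_ctbs / num_tiles;
		sizes[i] = (uint16_t)(edge - prev);
		prev = edge;
	}
}
```

For example, a picture 30 CTBs wide (1920 luma samples with 64x64 CTBs) split
into 4 tile columns yields widths of 7, 8, 7 and 8 CTBs, which sum to 30.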

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 743ce08eb184..0357f1772267 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -9,12 +9,14 @@ hantro-vpu-y += \
 		hantro_h1_jpeg_enc.o \
 		hantro_g1_h264_dec.o \
 		hantro_g1_mpeg2_dec.o \
+		hantro_g2_hevc_dec.o \
 		hantro_g1_vp8_dec.o \
 		rk3399_vpu_hw_jpeg_enc.o \
 		rk3399_vpu_hw_mpeg2_dec.o \
 		rk3399_vpu_hw_vp8_dec.o \
 		hantro_jpeg.o \
 		hantro_h264.o \
+		hantro_hevc.o \
 		hantro_mpeg2.o \
 		hantro_vp8.o
 
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 05876e426419..a9b80b2c9124 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -225,6 +225,7 @@ struct hantro_dev {
  * @jpeg_enc:		JPEG-encoding context.
  * @mpeg2_dec:		MPEG-2-decoding context.
  * @vp8_dec:		VP8-decoding context.
+ * @hevc_dec:		HEVC-decoding context.
  */
 struct hantro_ctx {
 	struct hantro_dev *dev;
@@ -251,6 +252,7 @@ struct hantro_ctx {
 		struct hantro_jpeg_enc_hw_ctx jpeg_enc;
 		struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
 		struct hantro_vp8_dec_hw_ctx vp8_dec;
+		struct hantro_hevc_dec_hw_ctx hevc_dec;
 	};
 };
 
@@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
 	return vb2_dma_contig_plane_dma_addr(vb, 0);
 }
 
+static inline size_t
+hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].size;
+	return vb2_plane_size(vb, 0);
+}
+
+static inline void *
+hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
+{
+	if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
+		return ctx->postproc.dec_q[vb->index].cpu;
+	return vb2_plane_vaddr(vb, 0);
+}
+
 void hantro_postproc_disable(struct hantro_ctx *ctx);
 void hantro_postproc_enable(struct hantro_ctx *ctx);
 void hantro_postproc_free(struct hantro_ctx *ctx);
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index e3e6df28f470..bc90a52f4d3d 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -30,6 +30,13 @@
 
 #define DRIVER_NAME "hantro-vpu"
 
+/*
+ * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
+ * the number of bits to skip in the slice segment
+ * header syntax after the 'slice type' token
+ */
+#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP	(V4L2_CID_USER_HANTRO_BASE + 0)
+
 int hantro_debug;
 module_param_named(debug, hantro_debug, int, 0644);
 MODULE_PARM_DESC(debug,
@@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
 	return 0;
 }
 
+static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
+{
+	struct hantro_ctx *ctx;
+
+	ctx = container_of(ctrl->handler,
+			   struct hantro_ctx, ctrl_handler);
+
+	vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
+
+	switch (ctrl->id) {
+	case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
+		ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
 	.try_ctrl = hantro_try_ctrl,
 };
@@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
 	.s_ctrl = hantro_jpeg_s_ctrl,
 };
 
+static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
+	.s_ctrl = hantro_hevc_s_ctrl,
+};
+
 static const struct hantro_ctrl controls[] = {
 	{
 		.codec = HANTRO_JPEG_ENCODER,
@@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
 		.cfg = {
 			.id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
 		},
+	}, {
+		.codec = HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
+			.name = "Hantro HEVC slice header skip bits",
+			.type = V4L2_CTRL_TYPE_INTEGER,
+			.min = 0,
+			.def = 0,
+			.max = 0x7fffffff,
+			.step = 1,
+			.ops = &hantro_hevc_ctrl_ops,
+		},
+	}, {
+		.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
+			 HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
+			 HANTRO_HEVC_DECODER,
+		.cfg = {
+			.id = V4L2_CID_USER_CLASS,
+			.name = "HANTRO controls",
+			.type = V4L2_CTRL_TYPE_CTRL_CLASS,
+			.flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
+		},
 	},
 };
 
diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
new file mode 100644
index 000000000000..5d75b36bc40c
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -0,0 +1,587 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include "hantro_hw.h"
+#include "hantro_g2_regs.h"
+
+#define HEVC_DEC_MODE	0xC
+
+#define BUS_WIDTH_32		0
+#define BUS_WIDTH_64		1
+#define BUS_WIDTH_128		2
+#define BUS_WIDTH_256		3
+
+static inline void hantro_write_addr(struct hantro_dev *vpu,
+				     unsigned long offset,
+				     dma_addr_t addr)
+{
+	vdpu_write(vpu, addr & 0xffffffff, offset);
+}
+
+static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	u16 *p = ctx->hevc_dec.tile_sizes.cpu;
+	unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
+	unsigned int max_log2_ctb_size, ctb_size;
+	bool tiles_enabled, uniform_spacing;
+	u32 no_chroma = 0;
+
+	tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
+	uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
+
+	hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
+
+	max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
+			    sps->log2_diff_max_min_luma_coding_block_size;
+	pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
+			    (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
+	pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
+			     >> max_log2_ctb_size;
+	ctb_size = 1 << max_log2_ctb_size;
+
+	vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
+		  pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
+
+	if (tiles_enabled) {
+		unsigned int i, j, h;
+
+		vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
+
+		hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
+		hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
+
+		/* write width + height for each tile in pic */
+		if (!uniform_spacing) {
+			u32 tmp_w = 0, tmp_h = 0;
+
+			for (i = 0; i < num_tile_rows; i++) {
+				if (i == num_tile_rows - 1)
+					h = pic_height_in_ctbs - tmp_h;
+				else
+					h = pps->row_height_minus1[i] + 1;
+				tmp_h += h;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
+					tmp_w += pps->column_width_minus1[j] + 1;
+					*p++ = pps->column_width_minus1[j] + 1;
+					*p++ = h;
+					if (i == 0 && h == 1 && ctb_size == 16)
+						no_chroma = 1;
+				}
+				/* last column */
+				*p++ = pic_width_in_ctbs - tmp_w;
+				*p++ = h;
+			}
+		} else { /* uniform spacing */
+			u32 tmp, prev_h, prev_w;
+
+			for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
+				tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
+				h = tmp - prev_h;
+				prev_h = tmp;
+				if (i == 0 && h == 1 && ctb_size == 16)
+					no_chroma = 1;
+				for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
+					tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
+					*p++ = tmp - prev_w;
+					*p++ = h;
+					if (j == 0 &&
+					    (pps->column_width_minus1[0] + 1) == 1 &&
+					    ctb_size == 16)
+						no_chroma = 1;
+					prev_w = tmp;
+				}
+			}
+		}
+	} else {
+		hantro_reg_write(vpu, hevc_num_tile_rows, 1);
+		hantro_reg_write(vpu, hevc_num_tile_cols, 1);
+
+		/* There's one tile, with dimensions equal to pic size. */
+		p[0] = pic_width_in_ctbs;
+		p[1] = pic_height_in_ctbs;
+	}
+
+	if (no_chroma)
+		vpu_debug(1, "%s: no chroma!\n", __func__);
+}
+
+static void set_params(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	struct hantro_dev *vpu = ctx->dev;
+	u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
+	u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
+	u32 pic_width_aligned, pic_height_aligned;
+	u32 partial_ctb_x, partial_ctb_y;
+
+	hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
+	hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
+
+	hantro_reg_write(vpu, hevc_output_8_bits, 0);
+
+	hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
+
+	min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
+	max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
+
+	hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
+	hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
+
+	min_cb_size = 1 << min_log2_cb_size;
+	max_ctb_size = 1 << max_log2_ctb_size;
+
+	pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
+	pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
+	pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
+	pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
+
+	partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
+	partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
+
+	hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
+	hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
+
+	hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
+	hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
+
+	hantro_reg_write(vpu, hevc_pic_width_4x4,
+			 (pic_width_in_min_cbs * min_cb_size) / 4);
+	hantro_reg_write(vpu, hevc_pic_height_4x4,
+			 (pic_height_in_min_cbs * min_cb_size) / 4);
+
+	hantro_reg_write(vpu, hevc_max_inter_hierdepth,
+			 sps->max_transform_hierarchy_depth_inter);
+	hantro_reg_write(vpu, hevc_max_intra_hierdepth,
+			 sps->max_transform_hierarchy_depth_intra);
+	hantro_reg_write(vpu, hevc_min_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2);
+	hantro_reg_write(vpu, hevc_max_trb_size,
+			 sps->log2_min_luma_transform_block_size_minus2 + 2 +
+			 sps->log2_diff_max_min_luma_transform_block_size);
+
+	hantro_reg_write(vpu, hevc_tempor_mvp_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
+			 !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
+	hantro_reg_write(vpu, hevc_strong_smooth_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
+	hantro_reg_write(vpu, hevc_asym_pred_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
+	hantro_reg_write(vpu, hevc_sao_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
+	hantro_reg_write(vpu, hevc_sign_data_hide,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
+	} else {
+		hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
+		hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
+	}
+
+	if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
+	} else {
+		hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
+		hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
+	hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
+	hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
+	hantro_reg_write(vpu, hevc_slice_chqp_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
+	hantro_reg_write(vpu, hevc_weight_bipr_idc,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
+	hantro_reg_write(vpu, hevc_transq_bypass,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
+	hantro_reg_write(vpu, hevc_list_mod_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
+	hantro_reg_write(vpu, hevc_entropy_sync_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_idr_pic_e,
+			 !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
+	hantro_reg_write(vpu, hevc_parallel_merge,
+			 pps->log2_parallel_merge_level_minus2 + 2);
+	hantro_reg_write(vpu, hevc_pcm_filt_d,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
+	hantro_reg_write(vpu, hevc_pcm_e,
+			 !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
+	if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
+		hantro_reg_write(vpu, hevc_max_pcm_size,
+				 sps->log2_diff_max_min_pcm_luma_coding_block_size +
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_min_pcm_size,
+				 sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
+				 sps->pcm_sample_bit_depth_luma_minus1 + 1);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
+				 sps->pcm_sample_bit_depth_chroma_minus1 + 1);
+	} else {
+		hantro_reg_write(vpu, hevc_max_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_min_pcm_size, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
+		hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
+	}
+
+	hantro_reg_write(vpu, hevc_start_code_e, 1);
+	hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
+	hantro_reg_write(vpu, hevc_weight_pred_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
+	hantro_reg_write(vpu, hevc_cabac_init_present,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
+	hantro_reg_write(vpu, hevc_const_intra_e,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
+	hantro_reg_write(vpu, hevc_transform_skip,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
+	hantro_reg_write(vpu, hevc_out_filtering_dis,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
+	hantro_reg_write(vpu, hevc_filt_ctrl_pres,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
+	hantro_reg_write(vpu, hevc_dependent_slice,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
+	hantro_reg_write(vpu, hevc_filter_override,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
+	hantro_reg_write(vpu, hevc_refidx0_active,
+			 pps->num_ref_idx_l0_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_refidx1_active,
+			 pps->num_ref_idx_l1_default_active_minus1 + 1);
+	hantro_reg_write(vpu, hevc_apf_threshold, 8);
+}
+
+static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
+{
+	int i;
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
+			return i;
+	}
+
+	return 0x0;
+}
+
+static void set_ref_pic_list(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
+	const struct hantro_reg *ref_pic_regs0[] = {
+		hevc_rlist_f0,
+		hevc_rlist_f1,
+		hevc_rlist_f2,
+		hevc_rlist_f3,
+		hevc_rlist_f4,
+		hevc_rlist_f5,
+		hevc_rlist_f6,
+		hevc_rlist_f7,
+		hevc_rlist_f8,
+		hevc_rlist_f9,
+		hevc_rlist_f10,
+		hevc_rlist_f11,
+		hevc_rlist_f12,
+		hevc_rlist_f13,
+		hevc_rlist_f14,
+		hevc_rlist_f15,
+	};
+	const struct hantro_reg *ref_pic_regs1[] = {
+		hevc_rlist_b0,
+		hevc_rlist_b1,
+		hevc_rlist_b2,
+		hevc_rlist_b3,
+		hevc_rlist_b4,
+		hevc_rlist_b5,
+		hevc_rlist_b6,
+		hevc_rlist_b7,
+		hevc_rlist_b8,
+		hevc_rlist_b9,
+		hevc_rlist_b10,
+		hevc_rlist_b11,
+		hevc_rlist_b12,
+		hevc_rlist_b13,
+		hevc_rlist_b14,
+		hevc_rlist_b15,
+	};
+	unsigned int i, j;
+
+	/* List 0 contains: short term before, short term after and long term */
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
+		list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	/* Fill the list, copying over and over */
+	i = 0;
+	while (j < ARRAY_SIZE(list0))
+		list0[j++] = list0[i++];
+
+	j = 0;
+	for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
+	for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
+	for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
+		list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
+
+	i = 0;
+	while (j < ARRAY_SIZE(list1))
+		list1[j++] = list1[i++];
+
+	for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
+		hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
+	}
+}
+
+static int set_ref(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+	const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+	dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
+	struct hantro_dev *vpu = ctx->dev;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
+	u32 max_ref_frames;
+	u16 dpb_longterm_e;
+
+	const struct hantro_reg *cur_poc[] = {
+		hevc_cur_poc_00,
+		hevc_cur_poc_01,
+		hevc_cur_poc_02,
+		hevc_cur_poc_03,
+		hevc_cur_poc_04,
+		hevc_cur_poc_05,
+		hevc_cur_poc_06,
+		hevc_cur_poc_07,
+		hevc_cur_poc_08,
+		hevc_cur_poc_09,
+		hevc_cur_poc_10,
+		hevc_cur_poc_11,
+		hevc_cur_poc_12,
+		hevc_cur_poc_13,
+		hevc_cur_poc_14,
+		hevc_cur_poc_15,
+	};
+	unsigned int i;
+
+	max_ref_frames = decode_params->num_poc_lt_curr +
+		decode_params->num_poc_st_curr_before +
+		decode_params->num_poc_st_curr_after;
+	/*
+	 * Set max_ref_frames to non-zero to avoid HW hang when decoding
+	 * badly marked I-frames.
+	 */
+	max_ref_frames = max_ref_frames ? max_ref_frames : 1;
+	hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
+	hantro_reg_write(vpu, hevc_filter_over_slices,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
+	hantro_reg_write(vpu, hevc_filter_over_tiles,
+			 !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+
+	/*
+	 * Write POC count diff from current pic. For frame decoding only compute
+	 * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
+	 */
+	for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
+		char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
+
+		hantro_reg_write(vpu, cur_poc[i], poc_diff);
+	}
+
+	if (i < ARRAY_SIZE(cur_poc)) {
+		/*
+		 * After the references, fill one entry pointing to itself,
+		 * i.e. difference is zero.
+		 */
+		hantro_reg_write(vpu, cur_poc[i], 0);
+		i++;
+	}
+
+	/* Fill the rest with the current picture */
+	for (; i < ARRAY_SIZE(cur_poc); i++)
+		hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
+
+	set_ref_pic_list(ctx);
+
+	/* We will only keep the reference pictures that are still used */
+	ctx->hevc_dec.ref_bufs_used = 0;
+
+	/* Set up addresses of DPB buffers */
+	dpb_longterm_e = 0;
+	for (i = 0; i < decode_params->num_active_dpb_entries &&
+	     i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
+		luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
+		if (!luma_addr)
+			return -ENOMEM;
+
+		chroma_addr = luma_addr + cr_offset;
+		mv_addr = luma_addr + mv_offset;
+
+		if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
+			dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
+
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
+	}
+
+	luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
+	if (!luma_addr)
+		return -ENOMEM;
+
+	chroma_addr = luma_addr + cr_offset;
+	mv_addr = luma_addr + mv_offset;
+
+	hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
+	hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
+	hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
+
+	hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
+	hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
+
+	hantro_hevc_ref_remove_unused(ctx);
+
+	for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+		hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
+		hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
+	}
+
+	hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
+
+	return 0;
+}
+
+static void set_buffers(struct hantro_ctx *ctx)
+{
+	struct vb2_v4l2_buffer *src_buf, *dst_buf;
+	struct hantro_dev *vpu = ctx->dev;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+	dma_addr_t src_dma, dst_dma;
+	u32 src_len, src_buf_len;
+
+	src_buf = hantro_get_src_buf(ctx);
+	dst_buf = hantro_get_dst_buf(ctx);
+
+	/* Source (stream) buffer. */
+	src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+	src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
+	src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
+
+	hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
+	hantro_reg_write(vpu, hevc_stream_len, src_len);
+	hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
+	hantro_reg_write(vpu, hevc_strm_start_offset, 0);
+	hantro_reg_write(vpu, hevc_write_mvs_e, 1);
+
+	/* Destination (decoded frame) buffer. */
+	dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
+
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
+	hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
+	hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
+	hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
+	hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
+	hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
+}
+
+void hantro_g2_check_idle(struct hantro_dev *vpu)
+{
+	int i;
+
+	for (i = 0; i < 3; i++) {
+		u32 status;
+
+		/* Make sure the VPU is idle */
+		status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
+		if (status & HEVC_REG_INTERRUPT_DEC_E) {
+			pr_warn("%s: still enabled!!! resetting.\n", __func__);
+			status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
+			vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
+		}
+	}
+}
+
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	int ret;
+
+	hantro_g2_check_idle(vpu);
+
+	/* Prepare HEVC decoder context. */
+	ret = hantro_hevc_dec_prepare_run(ctx);
+	if (ret)
+		return ret;
+
+	/* Configure hardware registers. */
+	set_params(ctx);
+
+	/* set reference pictures */
+	ret = set_ref(ctx);
+	if (ret)
+		return ret;
+
+	set_buffers(ctx);
+	prepare_tile_info_buffer(ctx);
+
+	hantro_end_prepare_run(ctx);
+
+	hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
+	hantro_reg_write(vpu, hevc_clk_gate_e, 1);
+
+	/* Don't disable output */
+	hantro_reg_write(vpu, hevc_out_dis, 0);
+
+	/* Don't compress buffers */
+	hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
+
+	/* use NV12 as output format */
+	hantro_reg_write(vpu, hevc_out_rs_e, 1);
+
+	/* Bus width and max burst */
+	hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
+	hantro_reg_write(vpu, hevc_max_burst, 16);
+
+	/* Swap */
+	hantro_reg_write(vpu, hevc_strm_swap, 0xf);
+	hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
+	hantro_reg_write(vpu, hevc_compress_swap, 0xf);
+
+	/* Start decoding! */
+	vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
+
+	return 0;
+}
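
For readers following set_ref_pic_list() above, the ordering and the
wrap-around padding can be modelled in plain userspace C. This is only a
sketch under stated assumptions: build_ref_list(), append_refs() and
NUM_ENTRIES are hypothetical names that do not exist in the driver, and the
inputs are already DPB indices rather than POC values resolved through
find_ref_pic_index().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ENTRIES 16 /* stands in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/* Append up to n entries from src, never writing past NUM_ENTRIES. */
static size_t append_refs(uint32_t *list, size_t pos,
			  const uint32_t *src, size_t n)
{
	for (size_t i = 0; i < n && pos < NUM_ENTRIES; i++)
		list[pos++] = src[i];

	return pos;
}

/*
 * Hypothetical helper: concatenate three groups of DPB indices in the
 * given order, then pad the list by repeating it from the start.
 * Returns the number of entries that came from the input groups.
 */
size_t build_ref_list(uint32_t list[NUM_ENTRIES],
		      const uint32_t *first, size_t n_first,
		      const uint32_t *second, size_t n_second,
		      const uint32_t *lt, size_t n_lt)
{
	size_t used, i = 0, j = 0;

	assert(list != NULL);

	for (size_t k = 0; k < NUM_ENTRIES; k++)
		list[k] = 0;

	j = append_refs(list, j, first, n_first);
	j = append_refs(list, j, second, n_second);
	j = append_refs(list, j, lt, n_lt);
	used = j;

	/* Same padding as the patch: copy the list over and over. */
	while (j < NUM_ENTRIES)
		list[j++] = list[i++];

	return used;
}
```

List 0 would be built by passing the "before" group first and the "after"
group second; list 1 swaps the two, with the long-term group last in both
cases. An empty input leaves every entry at zero.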
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
new file mode 100644
index 000000000000..a361c9ba911d
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -0,0 +1,198 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021, Collabora
+ *
+ * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
+ */
+
+#ifndef HANTRO_G2_REGS_H_
+#define HANTRO_G2_REGS_H_
+
+#include "hantro.h"
+
+#define G2_SWREG(nr)	((nr) * 4)
+
+#define HEVC_DEC_REG(name, base, shift, mask) \
+	static const struct hantro_reg _hevc_##name[] = { \
+		{ G2_SWREG(base), (shift), (mask) } \
+	}; \
+	static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
+
+#define HEVC_REG_VERSION		G2_SWREG(0)
+
+#define HEVC_REG_INTERRUPT		G2_SWREG(1)
+#define HEVC_REG_INTERRUPT_DEC_RDY_INT	BIT(12)
+#define HEVC_REG_INTERRUPT_DEC_ABORT_E	BIT(5)
+#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS	BIT(4)
+#define HEVC_REG_INTERRUPT_DEC_E	BIT(0)
+
+HEVC_DEC_REG(strm_swap,		2, 28,	0xf)
+HEVC_DEC_REG(dirmv_swap,	2, 20,	0xf)
+
+HEVC_DEC_REG(mode,		  3, 27, 0x1f)
+HEVC_DEC_REG(compress_swap,	  3, 20, 0xf)
+HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
+HEVC_DEC_REG(out_rs_e,		  3, 16, 0x1)
+HEVC_DEC_REG(out_dis,		  3, 15, 0x1)
+HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
+HEVC_DEC_REG(write_mvs_e,	  3, 12, 0x1)
+
+HEVC_DEC_REG(pic_width_in_cbs,	4, 19,	0x1ff)
+HEVC_DEC_REG(pic_height_in_cbs,	4, 6,	0x1ff)
+HEVC_DEC_REG(num_ref_frames,	4, 0,	0x1f)
+
+HEVC_DEC_REG(scaling_list_e,	5, 24,	0x1)
+HEVC_DEC_REG(cb_qp_offset,	5, 19,	0x1f)
+HEVC_DEC_REG(cr_qp_offset,	5, 14,	0x1f)
+HEVC_DEC_REG(sign_data_hide,	5, 12,	0x1)
+HEVC_DEC_REG(tempor_mvp_e,	5, 11,	0x1)
+HEVC_DEC_REG(max_cu_qpd_depth,	5, 5,	0x3f)
+HEVC_DEC_REG(cu_qpd_e,		5, 4,	0x1)
+
+HEVC_DEC_REG(stream_len,	6, 0,	0xffffffff)
+
+HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
+HEVC_DEC_REG(weight_pred_e,	 7, 28, 0x1)
+HEVC_DEC_REG(weight_bipr_idc,	 7, 26, 0x3)
+HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
+HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
+HEVC_DEC_REG(asym_pred_e,	 7, 23, 0x1)
+HEVC_DEC_REG(sao_e,		 7, 22, 0x1)
+HEVC_DEC_REG(pcm_filt_d,	 7, 21, 0x1)
+HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
+HEVC_DEC_REG(dependent_slice,	 7, 19, 0x1)
+HEVC_DEC_REG(filter_override,	 7, 18, 0x1)
+HEVC_DEC_REG(strong_smooth_e,	 7, 17, 0x1)
+HEVC_DEC_REG(filt_offset_beta,	 7, 12, 0x1f)
+HEVC_DEC_REG(filt_offset_tc,	 7, 7,  0x1f)
+HEVC_DEC_REG(slice_hdr_ext_e,	 7, 6,	0x1)
+HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3,	0x7)
+
+HEVC_DEC_REG(const_intra_e,	 8, 31, 0x1)
+HEVC_DEC_REG(filt_ctrl_pres,	 8, 30, 0x1)
+HEVC_DEC_REG(idr_pic_e,		 8, 16, 0x1)
+HEVC_DEC_REG(bit_depth_pcm_y,	 8, 12, 0xf)
+HEVC_DEC_REG(bit_depth_pcm_c,	 8, 8,  0xf)
+HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
+HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
+HEVC_DEC_REG(output_8_bits,	 8, 3,  0x1)
+
+HEVC_DEC_REG(refidx1_active,	9, 19,	0x1f)
+HEVC_DEC_REG(refidx0_active,	9, 14,	0x1f)
+HEVC_DEC_REG(hdr_skip_length,	9, 0,	0x3fff)
+
+HEVC_DEC_REG(start_code_e,	10, 31, 0x1)
+HEVC_DEC_REG(init_qp,		10, 24, 0x3f)
+HEVC_DEC_REG(num_tile_cols,	10, 19, 0x1f)
+HEVC_DEC_REG(num_tile_rows,	10, 14, 0x1f)
+HEVC_DEC_REG(tile_e,		10, 1,	0x1)
+HEVC_DEC_REG(entropy_sync_e,	10, 0,	0x1)
+
+HEVC_DEC_REG(refer_lterm_e,	12, 16, 0xffff)
+HEVC_DEC_REG(min_cb_size,	12, 13, 0x7)
+HEVC_DEC_REG(max_cb_size,	12, 10, 0x7)
+HEVC_DEC_REG(min_pcm_size,	12, 7,  0x7)
+HEVC_DEC_REG(max_pcm_size,	12, 4,  0x7)
+HEVC_DEC_REG(pcm_e,		12, 3,  0x1)
+HEVC_DEC_REG(transform_skip,	12, 2,	0x1)
+HEVC_DEC_REG(transq_bypass,	12, 1,	0x1)
+HEVC_DEC_REG(list_mod_e,	12, 0,	0x1)
+
+HEVC_DEC_REG(min_trb_size,	  13, 13, 0x7)
+HEVC_DEC_REG(max_trb_size,	  13, 10, 0x7)
+HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
+HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
+HEVC_DEC_REG(parallel_merge,	  13, 0,  0xf)
+
+HEVC_DEC_REG(rlist_f0,		14, 0,	0x1f)
+HEVC_DEC_REG(rlist_f1,		14, 10,	0x1f)
+HEVC_DEC_REG(rlist_f2,		14, 20,	0x1f)
+HEVC_DEC_REG(rlist_b0,		14, 5,	0x1f)
+HEVC_DEC_REG(rlist_b1,		14, 15, 0x1f)
+HEVC_DEC_REG(rlist_b2,		14, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f3,		15, 0,	0x1f)
+HEVC_DEC_REG(rlist_f4,		15, 10, 0x1f)
+HEVC_DEC_REG(rlist_f5,		15, 20, 0x1f)
+HEVC_DEC_REG(rlist_b3,		15, 5,	0x1f)
+HEVC_DEC_REG(rlist_b4,		15, 15, 0x1f)
+HEVC_DEC_REG(rlist_b5,		15, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f6,		16, 0,	0x1f)
+HEVC_DEC_REG(rlist_f7,		16, 10, 0x1f)
+HEVC_DEC_REG(rlist_f8,		16, 20, 0x1f)
+HEVC_DEC_REG(rlist_b6,		16, 5,	0x1f)
+HEVC_DEC_REG(rlist_b7,		16, 15, 0x1f)
+HEVC_DEC_REG(rlist_b8,		16, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f9,		17, 0,	0x1f)
+HEVC_DEC_REG(rlist_f10,		17, 10, 0x1f)
+HEVC_DEC_REG(rlist_f11,		17, 20, 0x1f)
+HEVC_DEC_REG(rlist_b9,		17, 5,	0x1f)
+HEVC_DEC_REG(rlist_b10,		17, 15, 0x1f)
+HEVC_DEC_REG(rlist_b11,		17, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f12,		18, 0,	0x1f)
+HEVC_DEC_REG(rlist_f13,		18, 10, 0x1f)
+HEVC_DEC_REG(rlist_f14,		18, 20, 0x1f)
+HEVC_DEC_REG(rlist_b12,		18, 5,	0x1f)
+HEVC_DEC_REG(rlist_b13,		18, 15, 0x1f)
+HEVC_DEC_REG(rlist_b14,		18, 25, 0x1f)
+
+HEVC_DEC_REG(rlist_f15,		19, 0,	0x1f)
+HEVC_DEC_REG(rlist_b15,		19, 5,	0x1f)
+
+HEVC_DEC_REG(partial_ctb_x,	20, 31, 0x1)
+HEVC_DEC_REG(partial_ctb_y,	20, 30, 0x1)
+HEVC_DEC_REG(pic_width_4x4,	20, 16, 0xfff)
+HEVC_DEC_REG(pic_height_4x4,	20, 0,  0xfff)
+
+HEVC_DEC_REG(cur_poc_00,	46, 24,	0xff)
+HEVC_DEC_REG(cur_poc_01,	46, 16,	0xff)
+HEVC_DEC_REG(cur_poc_02,	46, 8,	0xff)
+HEVC_DEC_REG(cur_poc_03,	46, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_04,	47, 24,	0xff)
+HEVC_DEC_REG(cur_poc_05,	47, 16,	0xff)
+HEVC_DEC_REG(cur_poc_06,	47, 8,	0xff)
+HEVC_DEC_REG(cur_poc_07,	47, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_08,	48, 24,	0xff)
+HEVC_DEC_REG(cur_poc_09,	48, 16,	0xff)
+HEVC_DEC_REG(cur_poc_10,	48, 8,	0xff)
+HEVC_DEC_REG(cur_poc_11,	48, 0,	0xff)
+
+HEVC_DEC_REG(cur_poc_12,	49, 24, 0xff)
+HEVC_DEC_REG(cur_poc_13,	49, 16, 0xff)
+HEVC_DEC_REG(cur_poc_14,	49, 8,	0xff)
+HEVC_DEC_REG(cur_poc_15,	49, 0,	0xff)
+
+HEVC_DEC_REG(apf_threshold,	55, 0,	0xffff)
+
+HEVC_DEC_REG(clk_gate_e,	58, 16,	0x1)
+HEVC_DEC_REG(buswidth,		58, 8,	0x7)
+HEVC_DEC_REG(max_burst,		58, 0,	0xff)
+
+#define HEVC_REG_CONFIG				G2_SWREG(58)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_E		BIT(16)
+#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E	BIT(17)
+
+#define HEVC_ADDR_DST		(G2_SWREG(65))
+#define HEVC_REG_ADDR_REF(i)	(G2_SWREG(67)  + ((i) * 0x8))
+#define HEVC_ADDR_DST_CHR	(G2_SWREG(99))
+#define HEVC_REG_CHR_REF(i)	(G2_SWREG(101) + ((i) * 0x8))
+#define HEVC_ADDR_DST_MV	(G2_SWREG(133))
+#define HEVC_REG_DMV_REF(i)	(G2_SWREG(135) + ((i) * 0x8))
+#define HEVC_ADDR_TILE_SIZE	(G2_SWREG(167))
+#define HEVC_ADDR_STR		(G2_SWREG(169))
+#define HEVC_SCALING_LIST	(G2_SWREG(171))
+#define HEVC_RASTER_SCAN	(G2_SWREG(175))
+#define HEVC_RASTER_SCAN_CHR	(G2_SWREG(177))
+#define HEVC_TILE_FILTER	(G2_SWREG(179))
+#define HEVC_TILE_SAO		(G2_SWREG(181))
+#define HEVC_TILE_BSD		(G2_SWREG(183))
+
+HEVC_DEC_REG(strm_buffer_len,	258, 0,	0xffffffff)
+HEVC_DEC_REG(strm_start_offset,	259, 0,	0xffffffff)
+
+#endif
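
Each HEVC_DEC_REG() entry above describes a bit field by register number,
shift and mask. The sketch below shows how such a descriptor is commonly
applied with a read-modify-write. The driver's real hantro_reg_write() is
not part of this patch, so struct reg_field, field_write(), field_read()
and the in-memory register array are assumptions made for illustration
only.

```c
#include <assert.h>
#include <stdint.h>

#define SWREG(nr) ((nr) * 4) /* byte offset of 32-bit register nr, like G2_SWREG() */

struct reg_field {
	uint32_t base;  /* byte offset of the register */
	uint32_t shift; /* position of the field's lowest bit */
	uint32_t mask;  /* field mask before shifting */
};

/* Hypothetical read-modify-write on an in-memory register file. */
void field_write(uint32_t *regs, const struct reg_field *f, uint32_t val)
{
	uint32_t v;

	assert(f->shift < 32);

	v = regs[f->base / 4];
	v &= ~(f->mask << f->shift);       /* clear the field */
	v |= (val & f->mask) << f->shift;  /* insert the truncated value */
	regs[f->base / 4] = v;
}

/* Extract a field from the same in-memory register file. */
uint32_t field_read(const uint32_t *regs, const struct reg_field *f)
{
	return (regs[f->base / 4] >> f->shift) & f->mask;
}
```

With the values from the table, rlist_f1 (register 14, shift 10) and
rlist_b0 (register 14, shift 5) share one word, which is why a plain
register write would not be enough: each field has to be merged without
disturbing its neighbours.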
diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
new file mode 100644
index 000000000000..8e319a837ff3
--- /dev/null
+++ b/drivers/staging/media/hantro/hantro_hevc.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hantro VPU HEVC codec driver
+ *
+ * Copyright (C) 2020 Safran Passenger Innovations LLC
+ */
+
+#include <linux/types.h>
+#include <media/v4l2-mem2mem.h>
+
+#include "hantro.h"
+#include "hantro_hw.h"
+
+#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
+/*
+ * BSD control data of current picture at tile border
+ * 128 bits per 4x4 tile = 128/(8*4) bytes per row
+ */
+#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
+/* tile border coefficients of filter */
+#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
+
+#define MAX_TILE_COLS 20
+#define MAX_TILE_ROWS 22
+
+#define UNUSED_REF	-1
+
+#define G2_ALIGN		16
+#define MC_WORD_SIZE		32
+
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
+
+	return sps->pic_width_in_luma_samples *
+		sps->pic_height_in_luma_samples * bytes_per_pixel;
+}
+
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	size_t cr_offset = hantro_hevc_chroma_offset(sps);
+
+	return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
+}
+
+static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
+{
+	u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
+	u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
+	size_t mv_size;
+
+	mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
+		  (1 << (2 * (8 - 4))) * 16) + 32;
+
+	vpu_debug(4, "%ux%u (CTBs) %zu MV bytes\n",
+		  pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
+
+	return mv_size;
+}
+
+static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
+{
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+
+	return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
+}
+
+static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	struct hantro_dev *vpu = ctx->dev;
+	int i;
+
+	/* Free every reference buffer that has been allocated */
+	for (i = 0; i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs[i].cpu) {
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hevc_dec->ref_bufs[i].size);
+			dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
+					  hevc_dec->ref_bufs[i].cpu,
+					  hevc_dec->ref_bufs[i].dma);
+		}
+	}
+}
+
+static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	for (i = 0;  i < NUM_REF_PICTURES; i++)
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+}
+
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
+				   int poc)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Find the reference buffer among the already known ones */
+	for (i = 0; i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == poc) {
+			hevc_dec->ref_bufs_used |= 1 << i;
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	/* Allocate a new reference buffer */
+	for (i = 0; i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
+			if (!hevc_dec->ref_bufs[i].cpu) {
+				struct hantro_dev *vpu = ctx->dev;
+
+				hevc_dec->ref_bufs[i].cpu =
+					dma_alloc_coherent(vpu->dev,
+							   hantro_hevc_ref_size(ctx),
+							   &hevc_dec->ref_bufs[i].dma,
+							   GFP_KERNEL);
+				if (!hevc_dec->ref_bufs[i].cpu)
+					return 0;
+
+				hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
+			}
+			hevc_dec->ref_bufs_used |= 1 << i;
+			memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
+			hevc_dec->ref_bufs_poc[i] = poc;
+
+			return hevc_dec->ref_bufs[i].dma;
+		}
+	}
+
+	return 0;
+}
+
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	int i;
+
+	/* Just tag buffer as unused, do not free them */
+	for (i = 0;  i < NUM_REF_PICTURES; i++) {
+		if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF)
+			continue;
+
+		if (hevc_dec->ref_bufs_used & (1 << i))
+			continue;
+
+		hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+	}
+}
+
+static int tile_buffer_reallocate(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+	const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+	const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+	unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
+	unsigned int height64 = (sps->pic_height_in_luma_samples + 63) & ~63;
+	unsigned int size;
+
+	if (num_tile_cols <= 1 ||
+	    num_tile_cols <= hevc_dec->num_tile_cols_allocated)
+		return 0;
+
+	/* Need to reallocate due to tiles passed via PPS */
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+
+	size = VERT_FILTER_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_filter.cpu = dma_alloc_coherent(vpu->dev, size,
+						       &hevc_dec->tile_filter.dma,
+						       GFP_KERNEL);
+	if (!hevc_dec->tile_filter.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_filter.size = size;
+
+	size = VERT_SAO_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_sao.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_sao.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_sao.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_sao.size = size;
+
+	size = BSD_CTRL_RAM_SIZE * height64 * (num_tile_cols - 1);
+	hevc_dec->tile_bsd.cpu = dma_alloc_coherent(vpu->dev, size,
+						    &hevc_dec->tile_bsd.dma,
+						    GFP_KERNEL);
+	if (!hevc_dec->tile_bsd.cpu)
+		goto err_free_tile_buffers;
+	hevc_dec->tile_bsd.size = size;
+
+	hevc_dec->num_tile_cols_allocated = num_tile_cols;
+
+	return 0;
+
+err_free_tile_buffers:
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = NULL;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = NULL;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = NULL;
+
+	return -ENOMEM;
+}
+
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
+{
+	struct hantro_hevc_dec_hw_ctx *hevc_ctx = &ctx->hevc_dec;
+	struct hantro_hevc_dec_ctrls *ctrls = &hevc_ctx->ctrls;
+	int ret;
+
+	hantro_start_prepare_run(ctx);
+
+	ctrls->decode_params =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS);
+	if (WARN_ON(!ctrls->decode_params))
+		return -EINVAL;
+
+	ctrls->sps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_SPS);
+	if (WARN_ON(!ctrls->sps))
+		return -EINVAL;
+
+	ctrls->pps =
+		hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_PPS);
+	if (WARN_ON(!ctrls->pps))
+		return -EINVAL;
+
+	ret = tile_buffer_reallocate(ctx);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+
+	if (hevc_dec->tile_sizes.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sizes.size,
+				  hevc_dec->tile_sizes.cpu,
+				  hevc_dec->tile_sizes.dma);
+	hevc_dec->tile_sizes.cpu = NULL;
+
+	if (hevc_dec->tile_filter.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_filter.size,
+				  hevc_dec->tile_filter.cpu,
+				  hevc_dec->tile_filter.dma);
+	hevc_dec->tile_filter.cpu = NULL;
+
+	if (hevc_dec->tile_sao.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_sao.size,
+				  hevc_dec->tile_sao.cpu,
+				  hevc_dec->tile_sao.dma);
+	hevc_dec->tile_sao.cpu = NULL;
+
+	if (hevc_dec->tile_bsd.cpu)
+		dma_free_coherent(vpu->dev, hevc_dec->tile_bsd.size,
+				  hevc_dec->tile_bsd.cpu,
+				  hevc_dec->tile_bsd.dma);
+	hevc_dec->tile_bsd.cpu = NULL;
+
+	hantro_hevc_ref_free(ctx);
+}
+
+int hantro_hevc_dec_init(struct hantro_ctx *ctx)
+{
+	struct hantro_dev *vpu = ctx->dev;
+	struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+	unsigned int size;
+
+	memset(hevc_dec, 0, sizeof(*hevc_dec));
+
+	/*
+	 * Maximum number of tiles times width and height (2 bytes each),
+	 * rounding up to next 16 bytes boundary + one extra 16 byte
+	 * chunk (HW guys wanted to have this).
+	 */
+	size = round_up(MAX_TILE_COLS * MAX_TILE_ROWS * 4 * sizeof(u16) + 16, 16);
+	hevc_dec->tile_sizes.cpu = dma_alloc_coherent(vpu->dev, size,
+						      &hevc_dec->tile_sizes.dma,
+						      GFP_KERNEL);
+	if (!hevc_dec->tile_sizes.cpu)
+		return -ENOMEM;
+
+	hevc_dec->tile_sizes.size = size;
+
+	hantro_hevc_ref_init(ctx);
+
+	return 0;
+}
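
The size helpers in hantro_hevc.c above imply a per-reference buffer laid
out as luma, then chroma, then a small gap, then motion vectors. The
arithmetic can be checked with a userspace sketch. The function names and
the ALIGN_UP() macro below are hypothetical, the constants 16 and 32 mirror
G2_ALIGN and MC_WORD_SIZE, and reading the 3/2 factor as 4:2:0 chroma is an
assumption rather than something the patch states.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Offset of the chroma plane: one full luma plane. */
size_t chroma_offset(size_t width, size_t height, unsigned int bit_depth)
{
	size_t bytes_per_pixel;

	assert(bit_depth >= 8);
	bytes_per_pixel = (bit_depth == 8) ? 1 : 2;

	return width * height * bytes_per_pixel;
}

/* Offset of the motion vectors: luma + chroma, aligned, plus one 32-byte word. */
size_t mv_offset(size_t width, size_t height, unsigned int bit_depth)
{
	size_t cr = chroma_offset(width, height, bit_depth);

	return ALIGN_UP(cr * 3 / 2, 16) + 32;
}

/* Motion vector area: 256x256 blocks, 256 entries of 16 bytes each, plus 32. */
size_t mv_size(size_t width, size_t height)
{
	size_t w = (width + 255) >> 8;
	size_t h = (height + 255) >> 8;

	return w * h * ((size_t)1 << (2 * (8 - 4))) * 16 + 32;
}

/* Total size of one reference buffer. */
size_t ref_size(size_t width, size_t height, unsigned int bit_depth)
{
	return mv_offset(width, height, bit_depth) + mv_size(width, height);
}
```

For an 8-bit 1920x1080 stream this gives a chroma offset of 2073600 bytes,
a motion vector offset of 3110432 bytes and 163872 bytes of motion vector
storage, so each reference buffer takes 3274304 bytes.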
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 4e2e7a5ed283..dade3b0769c1 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -20,6 +20,8 @@
 #define MB_WIDTH(w)		DIV_ROUND_UP(w, MB_DIM)
 #define MB_HEIGHT(h)		DIV_ROUND_UP(h, MB_DIM)
 
+#define NUM_REF_PICTURES	(V4L2_HEVC_DPB_ENTRIES_NUM_MAX + 1)
+
 struct hantro_dev;
 struct hantro_ctx;
 struct hantro_buf;
@@ -90,6 +92,44 @@ struct hantro_h264_dec_hw_ctx {
 	struct hantro_h264_dec_ctrls ctrls;
 };
 
+/**
+ * struct hantro_hevc_dec_ctrls
+ * @decode_params: Decode params
+ * @sps:	SPS info
+ * @pps:	PPS info
+ * @hevc_hdr_skip_length: the number of bits to skip in the
+ *			  slice segment header syntax after the
+ *			  'slice type' token
+ */
+struct hantro_hevc_dec_ctrls {
+	const struct v4l2_ctrl_hevc_decode_params *decode_params;
+	const struct v4l2_ctrl_hevc_sps *sps;
+	const struct v4l2_ctrl_hevc_pps *pps;
+	u32 hevc_hdr_skip_length;
+};
+
+/**
+ * struct hantro_hevc_dec_hw_ctx
+ * @tile_sizes:		Tile sizes buffer
+ * @tile_filter:	Tile vertical filter buffer
+ * @tile_sao:		Tile SAO buffer
+ * @tile_bsd:		Tile BSD control buffer
+ * @ref_bufs:	Reference buffers; @ref_bufs_poc holds their POCs and
+ *		@ref_bufs_used is a bitmask of the ones used by the current run
+ * @ctrls:	V4L2 controls attached to a run
+ */
+struct hantro_hevc_dec_hw_ctx {
+	struct hantro_aux_buf tile_sizes;
+	struct hantro_aux_buf tile_filter;
+	struct hantro_aux_buf tile_sao;
+	struct hantro_aux_buf tile_bsd;
+	struct hantro_aux_buf ref_bufs[NUM_REF_PICTURES];
+	int ref_bufs_poc[NUM_REF_PICTURES];
+	u32 ref_bufs_used;
+	struct hantro_hevc_dec_ctrls ctrls;
+	unsigned int num_tile_cols_allocated;
+};
+
 /**
  * struct hantro_mpeg2_dec_hw_ctx
  * @qtable:		Quantization table
@@ -178,6 +218,15 @@ int hantro_g1_h264_dec_run(struct hantro_ctx *ctx);
 int hantro_h264_dec_init(struct hantro_ctx *ctx);
 void hantro_h264_dec_exit(struct hantro_ctx *ctx);
 
+int hantro_hevc_dec_init(struct hantro_ctx *ctx);
+void hantro_hevc_dec_exit(struct hantro_ctx *ctx);
+int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
+int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc);
+void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
+size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
+size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
+
 static inline size_t
 hantro_h264_mv_size(unsigned int width, unsigned int height)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 08/11] media: hantro: handle V4L2_PIX_FMT_HEVC_SLICE control
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Make sure that V4L2_PIX_FMT_HEVC_SLICE is correctly handled by the V4L2
part of the driver.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 drivers/staging/media/hantro/hantro_v4l2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 1bc118e375a1..e16d5fd0b9f7 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -390,6 +390,7 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc)
 	case V4L2_PIX_FMT_MPEG2_SLICE:
 	case V4L2_PIX_FMT_VP8_FRAME:
 	case V4L2_PIX_FMT_H264_SLICE:
+	case V4L2_PIX_FMT_HEVC_SLICE:
 		ctx->fh.m2m_ctx->out_q_ctx.q.requires_requests = true;
 		break;
 	default:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 09/11] media: hantro: IMX8M: add variant for G2/HEVC codec
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Add a variant for IMX8M to enable the G2/HEVC codec.
Define the capabilities for the hardware up to 3840x2160.
Retrieve the hardware version at init to distinguish G1 from G2.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 2:
- remove useless clocks
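
The revision read in imx8mq_vpu_hw_init() keeps only the upper 16 bits of
HEVC_REG_VERSION. A minimal userspace sketch of that decoding follows, in
case it helps review. The EXAMPLE_* names and the two revision values are
assumptions made for illustration, not definitions taken from this series.

```c
#include <stdint.h>

/*
 * Illustrative revision IDs. These values are assumptions made for this
 * sketch; check the hardware documentation for the real ones.
 */
#define EXAMPLE_G1_REV 0x6731u
#define EXAMPLE_G2_REV 0x6732u

enum example_core {
	EXAMPLE_CORE_UNKNOWN,
	EXAMPLE_CORE_G1,
	EXAMPLE_CORE_G2,
};

/* The core revision sits in the upper 16 bits of the version register. */
static inline uint16_t example_core_rev(uint32_t version_reg)
{
	return (uint16_t)((version_reg >> 16) & 0xffffu);
}

/* Map a raw version register value to one of the cores known to the sketch. */
static inline enum example_core example_identify_core(uint32_t version_reg)
{
	switch (example_core_rev(version_reg)) {
	case EXAMPLE_G1_REV:
		return EXAMPLE_CORE_G1;
	case EXAMPLE_G2_REV:
		return EXAMPLE_CORE_G2;
	default:
		return EXAMPLE_CORE_UNKNOWN;
	}
}
```

With these assumed values, a register value of 0x67320000 would be reported
as the G2 core, and the lower 16 bits never influence the result. Any
unrecognised revision maps to EXAMPLE_CORE_UNKNOWN.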

 drivers/staging/media/hantro/hantro_drv.c   |  1 +
 drivers/staging/media/hantro/hantro_hw.h    |  1 +
 drivers/staging/media/hantro/imx8m_vpu_hw.c | 95 ++++++++++++++++++++-
 3 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index bc90a52f4d3d..976be7b6ecfb 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -591,6 +591,7 @@ static const struct of_device_id of_hantro_match[] = {
 #endif
 #ifdef CONFIG_VIDEO_HANTRO_IMX8M
 	{ .compatible = "nxp,imx8mq-vpu", .data = &imx8mq_vpu_variant, },
+	{ .compatible = "nxp,imx8mq-vpu-g2", .data = &imx8mq_vpu_g2_variant },
 #endif
 	{ /* sentinel */ }
 };
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index dade3b0769c1..f61f58da05fe 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -193,6 +193,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
 extern const struct hantro_variant rk3328_vpu_variant;
 extern const struct hantro_variant rk3288_vpu_variant;
 extern const struct hantro_variant imx8mq_vpu_variant;
+extern const struct hantro_variant imx8mq_vpu_g2_variant;
 
 extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
 
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index d5b4312b9391..46b33531be85 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -12,6 +12,7 @@
 #include "hantro.h"
 #include "hantro_jpeg.h"
 #include "hantro_g1_regs.h"
+#include "hantro_g2_regs.h"
 
 static int imx8mq_runtime_resume(struct hantro_dev *vpu)
 {
@@ -90,6 +91,26 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
 	},
 };
 
+static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
+	{
+		.fourcc = V4L2_PIX_FMT_NV12,
+		.codec_mode = HANTRO_MODE_NONE,
+	},
+	{
+		.fourcc = V4L2_PIX_FMT_HEVC_SLICE,
+		.codec_mode = HANTRO_MODE_HEVC_DEC,
+		.max_depth = 2,
+		.frmsize = {
+			.min_width = 48,
+			.max_width = 3840,
+			.step_width = MB_DIM,
+			.min_height = 48,
+			.max_height = 2160,
+			.step_height = MB_DIM,
+		},
+	},
+};
+
 static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
 {
 	struct hantro_dev *vpu = dev_id;
@@ -108,9 +129,42 @@ static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t imx8m_vpu_g2_irq(int irq, void *dev_id)
+{
+	struct hantro_dev *vpu = dev_id;
+	enum vb2_buffer_state state;
+	u32 status;
+
+	status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
+	state = (status & HEVC_REG_INTERRUPT_DEC_RDY_INT) ?
+		 VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
+
+	vdpu_write(vpu, 0, HEVC_REG_INTERRUPT);
+	vdpu_write(vpu, HEVC_REG_CONFIG_DEC_CLK_GATE_E, HEVC_REG_CONFIG);
+
+	hantro_irq_done(vpu, state);
+
+	return IRQ_HANDLED;
+}
+
 static int imx8mq_vpu_hw_init(struct hantro_dev *vpu)
 {
-	vpu->dec_base = vpu->reg_bases[0];
+	int ret;
+
+	/* Check variant version */
+	ret = clk_bulk_prepare_enable(vpu->variant->num_clocks, vpu->clocks);
+	if (ret) {
+		dev_err(vpu->dev, "Failed to enable clocks\n");
+		return ret;
+	}
+
+	/* Make sure the device has been reset before reading its ID */
+	ret = device_reset(vpu->dev);
+	if (ret)
+		dev_err(vpu->dev, "Failed to reset Hantro VPU\n");
+
+	vpu->core_hw_dec_rev = (vdpu_read(vpu, HEVC_REG_VERSION) >> 16) & 0xffff;
+	clk_bulk_disable_unprepare(vpu->variant->num_clocks, vpu->clocks);
 
 	return 0;
 }
@@ -149,17 +203,32 @@ static const struct hantro_codec_ops imx8mq_vpu_codec_ops[] = {
 	},
 };
 
+static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = {
+	[HANTRO_MODE_HEVC_DEC] = {
+		.run = hantro_g2_hevc_dec_run,
+		.reset = imx8mq_vpu_reset,
+		.init = hantro_hevc_dec_init,
+		.exit = hantro_hevc_dec_exit,
+	},
+};
+
 /*
  * VPU variants.
  */
 
 static const struct hantro_irq imx8mq_irqs[] = {
 	{ "g1", imx8m_vpu_g1_irq },
-	{ "g2", NULL /* TODO: imx8m_vpu_g2_irq */ },
 };
 
-static const char * const imx8mq_clk_names[] = { "g1", "g2", "bus" };
-static const char * const imx8mq_reg_names[] = { "g1", "g2", "ctrl" };
+static const struct hantro_irq imx8mq_g2_irqs[] = {
+	{ "g2", imx8m_vpu_g2_irq },
+};
+
+static const char * const imx8mq_clk_names[] = { "g1", "bus"};
+static const char * const imx8mq_reg_names[] = { "g1"};
+
+static const char * const imx8mq_g2_clk_names[] = { "g2", "bus"};
+static const char * const imx8mq_g2_reg_names[] = { "g2"};
 
 const struct hantro_variant imx8mq_vpu_variant = {
 	.dec_fmts = imx8m_vpu_dec_fmts,
@@ -179,3 +248,21 @@ const struct hantro_variant imx8mq_vpu_variant = {
 	.reg_names = imx8mq_reg_names,
 	.num_regs = ARRAY_SIZE(imx8mq_reg_names)
 };
+
+const struct hantro_variant imx8mq_vpu_g2_variant = {
+	.dec_offset = 0x0,
+	.dec_fmts = imx8m_vpu_g2_dec_fmts,
+	.num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
+	.postproc_fmts = imx8m_vpu_postproc_fmts,
+	.num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts),
+	.codec = HANTRO_HEVC_DECODER,
+	.codec_ops = imx8mq_vpu_g2_codec_ops,
+	.init = imx8mq_vpu_hw_init,
+	.runtime_resume = imx8mq_runtime_resume,
+	.irqs = imx8mq_g2_irqs,
+	.num_irqs = ARRAY_SIZE(imx8mq_g2_irqs),
+	.clk_names = imx8mq_g2_clk_names,
+	.num_clocks = ARRAY_SIZE(imx8mq_g2_clk_names),
+	.reg_names = imx8mq_g2_reg_names,
+	.num_regs = ARRAY_SIZE(imx8mq_g2_reg_names),
+};
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 10/11] dt-bindings: media: nxp,imx8mq-vpu: Update bindings
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

The current bindings seem to assume that the two VPU hardware
blocks (G1 and G2) share a single set of registers.
Implementing the VPU reset driver and the G2 decoder driver
showed that the VPUs are independent and do not need to know
about the registers of the other blocks.
Remove from the bindings the requirement to describe the
registers of every block, but keep the reg-names property
because removing it from the driver may affect other variants.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- rebase the change on top of VPU reset patches:
  https://www.spinics.net/lists/arm-kernel/msg878440.html

version 2:
- be more verbose about why I change the bindings
Keep in mind that this series comes after: https://www.spinics.net/lists/arm-kernel/msg875766.html
Without that series being reviewed and acked, this one won't work.
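
The split described in the commit message can be modelled in a few lines of
userspace C: each variant names only its own register region, and a node is
usable by a variant when it provides every region the variant names. This is
a simplified sketch, not the driver's code; the example_* structures and
helpers are hypothetical stand-ins introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the driver's variant description. */
struct example_variant {
	const char *const *reg_names;
	size_t num_regs;
};

/* Hypothetical stand-in for the "reg-names" entries of one device node. */
struct example_node {
	const char *const *reg_names;
	size_t num_regs;
};

/* Return the index of @name in the node's reg-names, or -1 if absent. */
static int example_find_reg(const struct example_node *node, const char *name)
{
	for (size_t i = 0; i < node->num_regs; i++)
		if (strcmp(node->reg_names[i], name) == 0)
			return (int)i;
	return -1;
}

/* A node satisfies a variant when every region the variant names exists. */
static int example_node_matches(const struct example_variant *variant,
				const struct example_node *node)
{
	for (size_t i = 0; i < variant->num_regs; i++)
		if (example_find_reg(node, variant->reg_names[i]) < 0)
			return 0;
	return 1;
}

/* One region per block, mirroring the single-entry reg-names in the binding. */
static const char *const example_g1_regs[] = { "g1" };
static const char *const example_g2_regs[] = { "g2" };

static const struct example_variant example_g1_variant = { example_g1_regs, 1 };
static const struct example_variant example_g2_variant = { example_g2_regs, 1 };
static const struct example_node example_g1_node = { example_g1_regs, 1 };
static const struct example_node example_g2_node = { example_g2_regs, 1 };
```

In this model the G2 variant does not match the G1 node, and neither node
needs to know about the other block's registers, which is the property the
updated bindings rely on.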

 .../bindings/media/nxp,imx8mq-vpu.yaml        | 46 ++++++++++++-------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/Documentation/devicetree/bindings/media/nxp,imx8mq-vpu.yaml b/Documentation/devicetree/bindings/media/nxp,imx8mq-vpu.yaml
index fd53a4e43572..468435c70eef 100644
--- a/Documentation/devicetree/bindings/media/nxp,imx8mq-vpu.yaml
+++ b/Documentation/devicetree/bindings/media/nxp,imx8mq-vpu.yaml
@@ -15,23 +15,25 @@ description:
 
 properties:
   compatible:
-    const: nxp,imx8mq-vpu
+    enum:
+      - nxp,imx8mq-vpu
+      - nxp,imx8mq-vpu-g2
 
   reg:
-    maxItems: 2
+    maxItems: 1
 
   reg-names:
-    items:
-      - const: g1
-      - const: g2
+    enum:
+      - g1
+      - g2
 
   interrupts:
-    maxItems: 2
+    maxItems: 1
 
   interrupt-names:
-    items:
-      - const: g1
-      - const: g2
+    enum:
+      - g1
+      - g2
 
   clocks:
     maxItems: 3
@@ -66,14 +68,12 @@ examples:
         #include <dt-bindings/interrupt-controller/arm-gic.h>
         #include <dt-bindings/reset/imx8mq-vpu-reset.h>
 
-        vpu: video-codec@38300000 {
+        vpu_g1: video-codec@38300000 {
                 compatible = "nxp,imx8mq-vpu";
-                reg = <0x38300000 0x10000>,
-                      <0x38310000 0x10000>;
-                reg-names = "g1", "g2";
-                interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
-                             <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
-                interrupt-names = "g1", "g2";
+                reg = <0x38300000 0x10000>;
+                reg-names = "g1";
+                interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
+                interrupt-names = "g1";
                 clocks = <&clk IMX8MQ_CLK_VPU_G1_ROOT>,
                          <&clk IMX8MQ_CLK_VPU_G2_ROOT>,
                          <&clk IMX8MQ_CLK_VPU_DEC_ROOT>;
@@ -81,3 +81,17 @@ examples:
                 power-domains = <&pgc_vpu>;
                 resets = <&vpu_reset IMX8MQ_RESET_VPU_RESET_G1>;
         };
+
+        vpu_g2: video-codec@38310000 {
+                compatible = "nxp,imx8mq-vpu-g2";
+                reg = <0x38310000 0x10000>;
+                reg-names = "g2";
+                interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
+                interrupt-names = "g2";
+                clocks = <&clk IMX8MQ_CLK_VPU_G1_ROOT>,
+                         <&clk IMX8MQ_CLK_VPU_G2_ROOT>,
+                         <&clk IMX8MQ_CLK_VPU_DEC_ROOT>;
+                clock-names = "g1", "g2", "bus";
+                power-domains = <&pgc_vpu>;
+                resets = <&vpu_reset IMX8MQ_RESET_VPU_RESET_G2>;
+        };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 11/11] arm64: dts: imx8mq: Add node to G2 hardware
  2021-03-03 11:39 ` Benjamin Gaignard
  (?)
@ 2021-03-03 11:39   ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-03 11:39 UTC (permalink / raw)
  To: ezequiel, p.zabel, mchehab, robh+dt, shawnguo, s.hauer, kernel,
	festevam, linux-imx, gregkh, mripard, paul.kocialkowski, wens,
	jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel, Benjamin Gaignard

Split the VPU node in two, one for G1 and one for G2, since they are
different hardware blocks.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
version 4:
- rebase the change on top of VPU reset patches:
  https://www.spinics.net/lists/arm-kernel/msg878440.html

version 2:
- remove useless clocks in VPUs nodes

 arch/arm64/boot/dts/freescale/imx8mq.dtsi | 41 +++++++++++++++++------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
index d9d9efc8592d..8358e214d696 100644
--- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
@@ -1287,17 +1287,15 @@ vpu_reset: vpu-reset@38320000 {
 			#reset-cells = <1>;
 		};
 
-		vpu: video-codec@38300000 {
+		vpu_g1: video-codec@38300000 {
 			compatible = "nxp,imx8mq-vpu";
-			reg = <0x38300000 0x10000>,
-			      <0x38310000 0x10000>;
-			reg-names = "g1", "g2";
-			interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
-				     <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "g1", "g2";
+			reg = <0x38300000 0x10000>;
+			reg-names = "g1";
+			interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "g1";
 			clocks = <&clk IMX8MQ_CLK_VPU_G1_ROOT>,
-				 <&clk IMX8MQ_CLK_VPU_G2_ROOT>;
-			clock-names = "g1", "g2";
+				 <&clk IMX8MQ_CLK_VPU_DEC_ROOT>;
+			clock-names = "g1", "bus";
 			assigned-clocks = <&clk IMX8MQ_CLK_VPU_G1>,
 					  <&clk IMX8MQ_CLK_VPU_G2>,
 					  <&clk IMX8MQ_CLK_VPU_BUS>,
@@ -1306,12 +1304,35 @@ vpu: video-codec@38300000 {
 						 <&clk IMX8MQ_VPU_PLL_OUT>,
 						 <&clk IMX8MQ_SYS1_PLL_800M>,
 						 <&clk IMX8MQ_VPU_PLL>;
-			assigned-clock-rates = <600000000>, <600000000>,
+			assigned-clock-rates = <600000000>, <300000000>,
 					       <800000000>, <0>;
 			resets = <&vpu_reset IMX8MQ_RESET_VPU_RESET_G1>;
 			power-domains = <&pgc_vpu>;
 		};
 
+		vpu_g2: video-codec@38310000 {
+			compatible = "nxp,imx8mq-vpu-g2";
+			reg = <0x38310000 0x10000>;
+			reg-names = "g2";
+			interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "g2";
+			clocks = <&clk IMX8MQ_CLK_VPU_G2_ROOT>,
+				 <&clk IMX8MQ_CLK_VPU_DEC_ROOT>;
+			clock-names = "g2",  "bus";
+			assigned-clocks = <&clk IMX8MQ_CLK_VPU_G1>,
+					  <&clk IMX8MQ_CLK_VPU_G2>,
+					  <&clk IMX8MQ_CLK_VPU_BUS>,
+					  <&clk IMX8MQ_VPU_PLL_BYPASS>;
+			assigned-clock-parents = <&clk IMX8MQ_VPU_PLL_OUT>,
+						 <&clk IMX8MQ_VPU_PLL_OUT>,
+						 <&clk IMX8MQ_SYS1_PLL_800M>,
+						 <&clk IMX8MQ_VPU_PLL>;
+			assigned-clock-rates = <600000000>, <300000000>,
+					       <800000000>, <0>;
+			resets = <&vpu_reset IMX8MQ_RESET_VPU_RESET_G2>;
+			power-domains = <&pgc_vpu>;
+		};
+
 		pcie0: pcie@33800000 {
 			compatible = "fsl,imx8mq-pcie";
 			reg = <0x33800000 0x400000>,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors
  2021-03-03 11:39   ` Benjamin Gaignard
  (?)
@ 2021-03-03 21:56     ` Ezequiel Garcia
  -1 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 21:56 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Change hantro_codec_ops run prototype from 'void' to 'int'.
> This allows the job to be cancelled if an error occurs while configuring
> the hardware.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  drivers/staging/media/hantro/hantro_drv.c     |  4 +++-
>  .../staging/media/hantro/hantro_g1_h264_dec.c |  6 ++++--
>  .../media/hantro/hantro_g1_mpeg2_dec.c        |  4 +++-
>  .../staging/media/hantro/hantro_g1_vp8_dec.c  |  6 ++++--
>  .../staging/media/hantro/hantro_h1_jpeg_enc.c |  4 +++-
>  drivers/staging/media/hantro/hantro_hw.h      | 19 ++++++++++---------
>  .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  4 +++-
>  .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  4 +++-
>  .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |  6 ++++--
>  9 files changed, 37 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index e5f200e64993..ac1429f00b33 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -161,7 +161,9 @@ static void device_run(void *priv)
>  
>         v4l2_m2m_buf_copy_metadata(src, dst, true);
>  
> -       ctx->codec_ops->run(ctx);
> +       if (ctx->codec_ops->run(ctx))
> +               goto err_cancel_job;
> +
>         return;
>  
>  err_cancel_job:
> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> index 845bef73d218..fcd4db13c9fe 100644
> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
> @@ -273,13 +273,13 @@ static void set_buffers(struct hantro_ctx *ctx)
>         vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
>  }
>  
> -void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
> +int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>  {
>         struct hantro_dev *vpu = ctx->dev;
>  
>         /* Prepare the H264 decoder context. */
>         if (hantro_h264_dec_prepare_run(ctx))
> -               return;
> +               return -EINVAL;

This should be returning the value from hantro_h264_dec_prepare_run.
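
A minimal sketch of that suggestion, assuming hantro_h264_dec_prepare_run()
returns 0 on success and a negative errno on failure. The struct and the stub
below are illustrative stand-ins so the example builds on its own; they are
not the driver's real definitions:

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-in for the driver's context; not the real struct. */
struct hantro_ctx {
	int prepare_result;	/* hypothetical: value the prepare step returns */
};

/* Stub: assumed to return 0 on success or a negative errno on failure. */
static int hantro_h264_dec_prepare_run(struct hantro_ctx *ctx)
{
	return ctx->prepare_result;
}

/* Propagate the callee's error code instead of flattening it to -EINVAL. */
int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
{
	int ret;

	ret = hantro_h264_dec_prepare_run(ctx);
	if (ret)
		return ret;

	/* ... program the hardware registers here ... */
	return 0;
}
```

With this shape the caller in device_run() still only needs to test for a
non-zero result, but the original error code is not lost along the way.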

Thanks!
Ezequiel


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions
  2021-03-03 11:39   ` Benjamin Gaignard
  (?)
@ 2021-03-03 22:05     ` Ezequiel Garcia
  -1 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 22:05 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Decoder hardware blocks can exist in multiple versions: add
> a field to distinguish them at runtime.
> The G2 hardware block doesn't have a postprocessor, so the
> hantro_needs_postproc function should always return false for this hardware.
> The hantro_needs_postproc function is becoming too complex to
> stay inline in the .h file, so move it to the .c file.
> 

Note that I already questioned this patch before:

https://lkml.org/lkml/2021/2/17/722

I think it's better to rely on of_device_id.data for this
type of thing.

In particular, I was expecting that just using
hantro_variant.postproc_regs would be enough.

Can you try if that works and avoid reading swreg(0)
and probing the hardware core?
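
The suggestion above could be sketched roughly as follows. This assumes a
variant without a post-processor (such as G2) simply leaves postproc_regs
NULL in its hantro_variant; the struct layouts and format constants are
simplified placeholders so the example builds on its own, not the driver's
real definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified placeholders; the real driver structs carry many more fields. */
struct hantro_postproc_regs { int unused; };

struct hantro_variant {
	/* Assumed to be NULL for cores without a post-processor (e.g. G2). */
	const struct hantro_postproc_regs *postproc_regs;
};

struct hantro_dev { const struct hantro_variant *variant; };
struct hantro_ctx { struct hantro_dev *dev; bool is_encoder; };
struct hantro_fmt { unsigned int fourcc; };

/* Placeholder values, not the real V4L2 fourcc codes. */
enum { FMT_NV12 = 1, FMT_YUYV = 2 };

/* Decide from static per-variant data; no revision register is read. */
static bool hantro_needs_postproc(const struct hantro_ctx *ctx,
				  const struct hantro_fmt *fmt)
{
	if (ctx->is_encoder)
		return false;

	if (!ctx->dev->variant->postproc_regs)
		return false;

	return fmt->fourcc != FMT_NV12;
}
```

Since the variant is already selected through the of_device_id match data,
this keeps the decision purely table-driven.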

Thanks!
Ezequiel

> Keep the default behavior to be G1 hardware.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  drivers/staging/media/hantro/hantro.h          | 13 +++++++------
>  drivers/staging/media/hantro/hantro_drv.c      |  2 ++
>  drivers/staging/media/hantro/hantro_postproc.c | 17 +++++++++++++++++
>  3 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index a76a0d79db9f..05876e426419 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -37,6 +37,9 @@ struct hantro_codec_ops;
>  #define HANTRO_HEVC_DECODER    BIT(19)
>  #define HANTRO_DECODERS                0xffff0000
>  
> +#define HANTRO_G1_REV          0x6731
> +#define HANTRO_G2_REV          0x6732
> +
>  /**
>   * struct hantro_irq - irq handler and name
>   *
> @@ -171,6 +174,7 @@ hantro_vdev_to_func(struct video_device *vdev)
>   * @enc_base:          Mapped address of VPU encoder register for convenience.
>   * @dec_base:          Mapped address of VPU decoder register for convenience.
>   * @ctrl_base:         Mapped address of VPU control block.
> + * @core_hw_dec_rev:   Runtime detected HW decoder core revision
>   * @vpu_mutex:         Mutex to synchronize V4L2 calls.
>   * @irqlock:           Spinlock to synchronize access to data structures
>   *                     shared with interrupt handlers.
> @@ -190,6 +194,7 @@ struct hantro_dev {
>         void __iomem *enc_base;
>         void __iomem *dec_base;
>         void __iomem *ctrl_base;
> +       u32 core_hw_dec_rev;
>  
>         struct mutex vpu_mutex; /* video_device lock */
>         spinlock_t irqlock;
> @@ -412,12 +417,8 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
>         return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>  }
>  
> -static inline bool
> -hantro_needs_postproc(const struct hantro_ctx *ctx,
> -                     const struct hantro_fmt *fmt)
> -{
> -       return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
> -}
> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
> +                          const struct hantro_fmt *fmt);
>  
>  static inline dma_addr_t
>  hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index f0b68e16fcc0..e3e6df28f470 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -836,6 +836,8 @@ static int hantro_probe(struct platform_device *pdev)
>         }
>         vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
>         vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
> +       /* by default decoder is G1 */
> +       vpu->core_hw_dec_rev = HANTRO_G1_REV;
>  
>         ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
>         if (ret) {
> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> index 6d2a8f2a8f0b..050880f720d6 100644
> --- a/drivers/staging/media/hantro/hantro_postproc.c
> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> @@ -50,6 +50,23 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
>         .display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
>  };
>  
> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
> +                          const struct hantro_fmt *fmt)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +
> +       if (ctx->is_encoder)
> +               return false;
> +
> +       if (vpu->core_hw_dec_rev == HANTRO_G1_REV)
> +               return fmt->fourcc != V4L2_PIX_FMT_NV12;
> +
> +       if (vpu->core_hw_dec_rev == HANTRO_G2_REV)
> +               return false;
> +
> +       return false;
> +}
> +
>  void hantro_postproc_enable(struct hantro_ctx *ctx)
>  {
>         struct hantro_dev *vpu = ctx->dev;



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions
@ 2021-03-03 22:05     ` Ezequiel Garcia
  0 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 22:05 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Decoder hardware blocks can exist in multiple versions: add
> a field to distinguish them at runtime.
> The G2 hardware block doesn't have a postprocessor, so the
> hantro_needs_postproc function should always return false for this hardware.
> The hantro_needs_postproc function is becoming too complex to
> stay inline in the .h file, so move it to the .c file.
> 

Note that I already questioned this patch before:

https://lkml.org/lkml/2021/2/17/722

I think it's better to rely on of_device_id.data for this
type of thing.

In particular, I was expecting that just using
hantro_variant.postproc_regs would be enough.

Can you try if that works and avoid reading swreg(0)
and probing the hardware core?

Thanks!
Ezequiel

> Keep the default behavior to be G1 hardware.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  drivers/staging/media/hantro/hantro.h          | 13 +++++++------
>  drivers/staging/media/hantro/hantro_drv.c      |  2 ++
>  drivers/staging/media/hantro/hantro_postproc.c | 17 +++++++++++++++++
>  3 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index a76a0d79db9f..05876e426419 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -37,6 +37,9 @@ struct hantro_codec_ops;
>  #define HANTRO_HEVC_DECODER    BIT(19)
>  #define HANTRO_DECODERS                0xffff0000
>  
> +#define HANTRO_G1_REV          0x6731
> +#define HANTRO_G2_REV          0x6732
> +
>  /**
>   * struct hantro_irq - irq handler and name
>   *
> @@ -171,6 +174,7 @@ hantro_vdev_to_func(struct video_device *vdev)
>   * @enc_base:          Mapped address of VPU encoder register for convenience.
>   * @dec_base:          Mapped address of VPU decoder register for convenience.
>   * @ctrl_base:         Mapped address of VPU control block.
> + * @core_hw_dec_rev:   Runtime detected HW decoder core revision.
>   * @vpu_mutex:         Mutex to synchronize V4L2 calls.
>   * @irqlock:           Spinlock to synchronize access to data structures
>   *                     shared with interrupt handlers.
> @@ -190,6 +194,7 @@ struct hantro_dev {
>         void __iomem *enc_base;
>         void __iomem *dec_base;
>         void __iomem *ctrl_base;
> +       u32 core_hw_dec_rev;
>  
>         struct mutex vpu_mutex; /* video_device lock */
>         spinlock_t irqlock;
> @@ -412,12 +417,8 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
>         return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>  }
>  
> -static inline bool
> -hantro_needs_postproc(const struct hantro_ctx *ctx,
> -                     const struct hantro_fmt *fmt)
> -{
> -       return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
> -}
> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
> +                          const struct hantro_fmt *fmt);
>  
>  static inline dma_addr_t
>  hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index f0b68e16fcc0..e3e6df28f470 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -836,6 +836,8 @@ static int hantro_probe(struct platform_device *pdev)
>         }
>         vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
>         vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
> +       /* by default decoder is G1 */
> +       vpu->core_hw_dec_rev = HANTRO_G1_REV;
>  
>         ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
>         if (ret) {
> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
> index 6d2a8f2a8f0b..050880f720d6 100644
> --- a/drivers/staging/media/hantro/hantro_postproc.c
> +++ b/drivers/staging/media/hantro/hantro_postproc.c
> @@ -50,6 +50,23 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
>         .display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
>  };
>  
> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
> +                          const struct hantro_fmt *fmt)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +
> +       if (ctx->is_encoder)
> +               return false;
> +
> +       if (vpu->core_hw_dec_rev == HANTRO_G1_REV)
> +               return fmt->fourcc != V4L2_PIX_FMT_NV12;
> +
> +       if (vpu->core_hw_dec_rev == HANTRO_G2_REV)
> +               return false;
> +
> +       return false;
> +}
> +
>  void hantro_postproc_enable(struct hantro_ctx *ctx)
>  {
>         struct hantro_dev *vpu = ctx->dev;



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/11] media: hantro: IMX8M: add variant for G2/HEVC codec
  2021-03-03 11:39   ` Benjamin Gaignard
  (?)
@ 2021-03-03 22:08     ` Ezequiel Garcia
  -1 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-03 22:08 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Add variant to IMX8M to enable G2/HEVC codec.
> Define the capabilities for the hardware up to 3840x2160.
> Retrieve the hardware version at init to distinguish G1 from G2.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 2:
> - remove useless clocks
> 
>  drivers/staging/media/hantro/hantro_drv.c   |  1 +
>  drivers/staging/media/hantro/hantro_hw.h    |  1 +
>  drivers/staging/media/hantro/imx8m_vpu_hw.c | 95 ++++++++++++++++++++-
>  3 files changed, 93 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index bc90a52f4d3d..976be7b6ecfb 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -591,6 +591,7 @@ static const struct of_device_id of_hantro_match[] = {
>  #endif
>  #ifdef CONFIG_VIDEO_HANTRO_IMX8M
>         { .compatible = "nxp,imx8mq-vpu", .data = &imx8mq_vpu_variant, },
> +       { .compatible = "nxp,imx8mq-vpu-g2", .data = &imx8mq_vpu_g2_variant },
>  #endif
>         { /* sentinel */ }
>  };
> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
> index dade3b0769c1..f61f58da05fe 100644
> --- a/drivers/staging/media/hantro/hantro_hw.h
> +++ b/drivers/staging/media/hantro/hantro_hw.h
> @@ -193,6 +193,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
>  extern const struct hantro_variant rk3328_vpu_variant;
>  extern const struct hantro_variant rk3288_vpu_variant;
>  extern const struct hantro_variant imx8mq_vpu_variant;
> +extern const struct hantro_variant imx8mq_vpu_g2_variant;
>  
>  extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
>  
> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> index d5b4312b9391..46b33531be85 100644
> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
> @@ -12,6 +12,7 @@
>  #include "hantro.h"
>  #include "hantro_jpeg.h"
>  #include "hantro_g1_regs.h"
> +#include "hantro_g2_regs.h"
>  
>  static int imx8mq_runtime_resume(struct hantro_dev *vpu)
>  {
> @@ -90,6 +91,26 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
>         },
>  };
>  
> +static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
> +       {
> +               .fourcc = V4L2_PIX_FMT_NV12,
> +               .codec_mode = HANTRO_MODE_NONE,
> +       },
> +       {
> +               .fourcc = V4L2_PIX_FMT_HEVC_SLICE,
> +               .codec_mode = HANTRO_MODE_HEVC_DEC,
> +               .max_depth = 2,
> +               .frmsize = {
> +                       .min_width = 48,
> +                       .max_width = 3840,
> +                       .step_width = MB_DIM,
> +                       .min_height = 48,
> +                       .max_height = 2160,
> +                       .step_height = MB_DIM,
> +               },
> +       },
> +};
> +
>  static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>  {
>         struct hantro_dev *vpu = dev_id;
> @@ -108,9 +129,42 @@ static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>         return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t imx8m_vpu_g2_irq(int irq, void *dev_id)
> +{
> +       struct hantro_dev *vpu = dev_id;
> +       enum vb2_buffer_state state;
> +       u32 status;
> +
> +       status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> +       state = (status & HEVC_REG_INTERRUPT_DEC_RDY_INT) ?
> +                VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
> +
> +       vdpu_write(vpu, 0, HEVC_REG_INTERRUPT);
> +       vdpu_write(vpu, HEVC_REG_CONFIG_DEC_CLK_GATE_E, HEVC_REG_CONFIG);

Is this clock gate enable needed on each interrupt?

> +
> +       hantro_irq_done(vpu, state);
> +
> +       return IRQ_HANDLED;
> +}
> +
>  static int imx8mq_vpu_hw_init(struct hantro_dev *vpu)
>  {
> -       vpu->dec_base = vpu->reg_bases[0];
> +       int ret;
> +
> +       /* Check variant version */
> +       ret = clk_bulk_prepare_enable(vpu->variant->num_clocks, vpu->clocks);
> +       if (ret) {
> +               dev_err(vpu->dev, "Failed to enable clocks\n");
> +               return ret;
> +       }
> +
> +       /* Make sure the device has been reset before reading its id */
> +       ret = device_reset(vpu->dev);
> +       if (ret)
> +               dev_err(vpu->dev, "Failed to reset Hantro VPU\n");
> +
> +       vpu->core_hw_dec_rev = (vdpu_read(vpu, HEVC_REG_VERSION) >> 16) & 0xffff;
> +       clk_bulk_disable_unprepare(vpu->variant->num_clocks, vpu->clocks);
>  
>         return 0;
>  }
> @@ -149,17 +203,32 @@ static const struct hantro_codec_ops imx8mq_vpu_codec_ops[] = {
>         },
>  };
>  
> +static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = {
> +       [HANTRO_MODE_HEVC_DEC] = {
> +               .run = hantro_g2_hevc_dec_run,
> +               .reset = imx8mq_vpu_reset,
> +               .init = hantro_hevc_dec_init,
> +               .exit = hantro_hevc_dec_exit,
> +       },
> +};
> +
>  /*
>   * VPU variants.
>   */
>  
>  static const struct hantro_irq imx8mq_irqs[] = {
>         { "g1", imx8m_vpu_g1_irq },
> -       { "g2", NULL /* TODO: imx8m_vpu_g2_irq */ },
>  };
>  
> -static const char * const imx8mq_clk_names[] = { "g1", "g2", "bus" };
> -static const char * const imx8mq_reg_names[] = { "g1", "g2", "ctrl" };
> +static const struct hantro_irq imx8mq_g2_irqs[] = {
> +       { "g2", imx8m_vpu_g2_irq },
> +};
> +
> +static const char * const imx8mq_clk_names[] = { "g1", "bus"};
> +static const char * const imx8mq_reg_names[] = { "g1"};
> +
> +static const char * const imx8mq_g2_clk_names[] = { "g2", "bus"};
> +static const char * const imx8mq_g2_reg_names[] = { "g2"};
>  
>  const struct hantro_variant imx8mq_vpu_variant = {
>         .dec_fmts = imx8m_vpu_dec_fmts,
> @@ -179,3 +248,21 @@ const struct hantro_variant imx8mq_vpu_variant = {
>         .reg_names = imx8mq_reg_names,
>         .num_regs = ARRAY_SIZE(imx8mq_reg_names)
>  };
> +
> +const struct hantro_variant imx8mq_vpu_g2_variant = {
> +       .dec_offset = 0x0,
> +       .dec_fmts = imx8m_vpu_g2_dec_fmts,
> +       .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
> +       .postproc_fmts = imx8m_vpu_postproc_fmts,
> +       .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts),

Is this postproc_fmts correct?

Thanks!
Ezequiel

> +       .codec = HANTRO_HEVC_DECODER,
> +       .codec_ops = imx8mq_vpu_g2_codec_ops,
> +       .init = imx8mq_vpu_hw_init,
> +       .runtime_resume = imx8mq_runtime_resume,
> +       .irqs = imx8mq_g2_irqs,
> +       .num_irqs = ARRAY_SIZE(imx8mq_g2_irqs),
> +       .clk_names = imx8mq_g2_clk_names,
> +       .num_clocks = ARRAY_SIZE(imx8mq_g2_clk_names),
> +       .reg_names = imx8mq_g2_reg_names,
> +       .num_regs = ARRAY_SIZE(imx8mq_g2_reg_names),
> +};



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors
  2021-03-03 21:56     ` Ezequiel Garcia
  (?)
@ 2021-03-05  9:24       ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-05  9:24 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 03/03/2021 at 22:56, Ezequiel Garcia wrote:
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Change hantro_codec_ops run prototype from 'void' to 'int'.
>> This allows the job to be canceled if an error occurs while
>> configuring the hardware.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>>   drivers/staging/media/hantro/hantro_drv.c     |  4 +++-
>>   .../staging/media/hantro/hantro_g1_h264_dec.c |  6 ++++--
>>   .../media/hantro/hantro_g1_mpeg2_dec.c        |  4 +++-
>>   .../staging/media/hantro/hantro_g1_vp8_dec.c  |  6 ++++--
>>   .../staging/media/hantro/hantro_h1_jpeg_enc.c |  4 +++-
>>   drivers/staging/media/hantro/hantro_hw.h      | 19 ++++++++++---------
>>   .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  4 +++-
>>   .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  4 +++-
>>   .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |  6 ++++--
>>   9 files changed, 37 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index e5f200e64993..ac1429f00b33 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -161,7 +161,9 @@ static void device_run(void *priv)
>>   
>>          v4l2_m2m_buf_copy_metadata(src, dst, true);
>>   
>> -       ctx->codec_ops->run(ctx);
>> +       if (ctx->codec_ops->run(ctx))
>> +               goto err_cancel_job;
>> +
>>          return;
>>   
>>   err_cancel_job:
>> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> index 845bef73d218..fcd4db13c9fe 100644
>> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> @@ -273,13 +273,13 @@ static void set_buffers(struct hantro_ctx *ctx)
>>          vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
>>   }
>>   
>> -void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>> +int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>>   {
>>          struct hantro_dev *vpu = ctx->dev;
>>   
>>          /* Prepare the H264 decoder context. */
>>          if (hantro_h264_dec_prepare_run(ctx))
>> -               return;
>> +               return -EINVAL;
> This should be returning the value from hantro_h264_dec_prepare_run.

That will be fixed in the next version, thanks
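
As a rough sketch of what that fix could look like (a userspace stand-in only: the struct, the stub and the sketch_* names below are hypothetical and do not reflect the real driver code), the run function would store the result of the prepare step and hand it back unchanged instead of collapsing every failure into -EINVAL:

```c
#include <errno.h>

/* Hypothetical stand-in: lets the stub below report a chosen result. */
struct hantro_ctx { int prepare_result; };

/* Stub standing in for hantro_h264_dec_prepare_run(): 0 or a negative errno. */
static int sketch_h264_dec_prepare_run(struct hantro_ctx *ctx)
{
	return ctx->prepare_result;
}

static int sketch_g1_h264_dec_run(struct hantro_ctx *ctx)
{
	int ret;

	/* Prepare the H264 decoder context and propagate any error as-is. */
	ret = sketch_h264_dec_prepare_run(ctx);
	if (ret)
		return ret;

	/* ... the hardware would be programmed here ... */
	return 0;
}
```

The caller in device_run() only tests for a non-zero value, so propagating the original code does not change its behaviour, but it keeps the real error available to any future caller that wants to report it.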

Benjamin

>
> Thanks!
> Ezequiel
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors
@ 2021-03-05  9:24       ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-05  9:24 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


Le 03/03/2021 à 22:56, Ezequiel Garcia a écrit :
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Change hantro_codec_ops run prototype from 'void' to 'int'.
>> This allow to cancel the job if an error occur while configuring
>> the hardware.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>>   drivers/staging/media/hantro/hantro_drv.c     |  4 +++-
>>   .../staging/media/hantro/hantro_g1_h264_dec.c |  6 ++++--
>>   .../media/hantro/hantro_g1_mpeg2_dec.c        |  4 +++-
>>   .../staging/media/hantro/hantro_g1_vp8_dec.c  |  6 ++++--
>>   .../staging/media/hantro/hantro_h1_jpeg_enc.c |  4 +++-
>>   drivers/staging/media/hantro/hantro_hw.h      | 19 ++++++++++---------
>>   .../media/hantro/rk3399_vpu_hw_jpeg_enc.c     |  4 +++-
>>   .../media/hantro/rk3399_vpu_hw_mpeg2_dec.c    |  4 +++-
>>   .../media/hantro/rk3399_vpu_hw_vp8_dec.c      |  6 ++++--
>>   9 files changed, 37 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index e5f200e64993..ac1429f00b33 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -161,7 +161,9 @@ static void device_run(void *priv)
>>   
>>          v4l2_m2m_buf_copy_metadata(src, dst, true);
>>   
>> -       ctx->codec_ops->run(ctx);
>> +       if (ctx->codec_ops->run(ctx))
>> +               goto err_cancel_job;
>> +
>>          return;
>>   
>>   err_cancel_job:
>> diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> index 845bef73d218..fcd4db13c9fe 100644
>> --- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> +++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
>> @@ -273,13 +273,13 @@ static void set_buffers(struct hantro_ctx *ctx)
>>          vdpu_write_relaxed(vpu, ctx->h264_dec.priv.dma, G1_REG_ADDR_QTABLE);
>>   }
>>   
>> -void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>> +int hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
>>   {
>>          struct hantro_dev *vpu = ctx->dev;
>>   
>>          /* Prepare the H264 decoder context. */
>>          if (hantro_h264_dec_prepare_run(ctx))
>> -               return;
>> +               return -EINVAL;
> This should be returning the value from hantro_h264_dec_prepare_run.

That will be fixed in the next version, thanks
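
As a rough sketch of that fix (not the actual next version of the patch), propagating the error could look like the code below. The `*_sketch` names and the struct are hypothetical stand-ins for `struct hantro_ctx` and `hantro_h264_dec_prepare_run()`, and the sketch assumes the prepare step returns 0 on success and a negative errno otherwise:

```c
#include <errno.h>

/* Hypothetical stand-in for struct hantro_ctx; only what the sketch needs. */
struct run_ctx_sketch {
	int prepare_status;	/* value the prepare step will report */
	int hw_started;		/* set once the hardware would be started */
};

/*
 * Stand-in for hantro_h264_dec_prepare_run().
 * Assumption: 0 on success, negative errno on failure.
 */
int prepare_run_sketch(struct run_ctx_sketch *ctx)
{
	return ctx->prepare_status;
}

/* Forward the callee's error code instead of replacing it with -EINVAL. */
int dec_run_sketch(struct run_ctx_sketch *ctx)
{
	int ret;

	ret = prepare_run_sketch(ctx);
	if (ret)
		return ret;

	/* Only reached when preparation succeeded. */
	ctx->hw_started = 1;
	return 0;
}
```

device_run() only tests for a non-zero return, so the job is cancelled either way; forwarding the value simply keeps the original error code intact.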

Benjamin

>
> Thanks!
> Ezequiel
>

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions
  2021-03-03 22:05     ` Ezequiel Garcia
  (?)
@ 2021-03-05  9:27       ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-05  9:27 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 03/03/2021 at 23:05, Ezequiel Garcia wrote:
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Decoder hardware blocks can exist in multiple versions: add
>> a field to distinguish them at runtime.
>> The G2 hardware block doesn't have a postprocessor, so the
>> hantro_needs_postproc function should always return false for
>> this hardware. hantro_needs_postproc is becoming too complex to
>> stay inline in the .h file, so move it to the .c file.
>>
> Note that I already questioned this patch before:
>
> https://lkml.org/lkml/2021/2/17/722
>
> I think it's better to rely on of_device_id.data for this
> type of thing.
>
> In particular, I was expecting that just using
> hantro_variant.postproc_regs would be enough.
>
> Can you try if that works and avoid reading swreg(0)
> and probing the hardware core?

I have found a way to remove this: if the variant doesn't define
postprocessor formats, the needs_postproc function will always return
false, and that was the only useful usage of this version field.
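
A minimal sketch of that idea, under stated assumptions: the `pp_*_sketch` types below are hypothetical, trimmed-down stand-ins for `hantro_variant` and `hantro_ctx`, and `PP_FOURCC` mimics how V4L2 builds a fourcc code. It is not the code that will be posted, only an illustration of keying the decision on the variant data rather than on a runtime revision field:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Builds a fourcc code the same way v4l2_fourcc() does (little-endian). */
#define PP_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define PP_FMT_NV12 PP_FOURCC('N', 'V', '1', '2')

/* Hypothetical stand-in for the postproc part of struct hantro_variant. */
struct pp_variant_sketch {
	size_t num_postproc_fmts;	/* 0 when the variant has no postprocessor */
};

/* Hypothetical stand-in for struct hantro_ctx. */
struct pp_ctx_sketch {
	bool is_encoder;
	const struct pp_variant_sketch *variant;
};

bool needs_postproc_sketch(const struct pp_ctx_sketch *ctx, uint32_t fourcc)
{
	/* Encoders never use the postprocessor. */
	if (ctx->is_encoder)
		return false;

	/* A variant without postproc formats (e.g. G2) never needs it. */
	if (!ctx->variant->num_postproc_fmts)
		return false;

	/* Otherwise keep the existing rule: anything but NV12 is post-processed. */
	return fourcc != PP_FMT_NV12;
}
```

With this shape, a G2 variant only has to leave its postproc format list empty and no hardware register needs to be read at probe time.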

Benjamin

>
> Thanks!
> Ezequiel
>
>> Keep the default behavior as G1 hardware.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>>   drivers/staging/media/hantro/hantro.h          | 13 +++++++------
>>   drivers/staging/media/hantro/hantro_drv.c      |  2 ++
>>   drivers/staging/media/hantro/hantro_postproc.c | 17 +++++++++++++++++
>>   3 files changed, 26 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
>> index a76a0d79db9f..05876e426419 100644
>> --- a/drivers/staging/media/hantro/hantro.h
>> +++ b/drivers/staging/media/hantro/hantro.h
>> @@ -37,6 +37,9 @@ struct hantro_codec_ops;
>>   #define HANTRO_HEVC_DECODER    BIT(19)
>>   #define HANTRO_DECODERS                0xffff0000
>>   
>> +#define HANTRO_G1_REV          0x6731
>> +#define HANTRO_G2_REV          0x6732
>> +
>>   /**
>>    * struct hantro_irq - irq handler and name
>>    *
>> @@ -171,6 +174,7 @@ hantro_vdev_to_func(struct video_device *vdev)
>>    * @enc_base:          Mapped address of VPU encoder register for convenience.
>>    * @dec_base:          Mapped address of VPU decoder register for convenience.
>>    * @ctrl_base:         Mapped address of VPU control block.
>> + * @core_hw_dec_rev:   Runtime detected HW decoder core revision
>>    * @vpu_mutex:         Mutex to synchronize V4L2 calls.
>>    * @irqlock:           Spinlock to synchronize access to data structures
>>    *                     shared with interrupt handlers.
>> @@ -190,6 +194,7 @@ struct hantro_dev {
>>          void __iomem *enc_base;
>>          void __iomem *dec_base;
>>          void __iomem *ctrl_base;
>> +       u32 core_hw_dec_rev;
>>   
>>          struct mutex vpu_mutex; /* video_device lock */
>>          spinlock_t irqlock;
>> @@ -412,12 +417,8 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
>>          return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
>>   }
>>   
>> -static inline bool
>> -hantro_needs_postproc(const struct hantro_ctx *ctx,
>> -                     const struct hantro_fmt *fmt)
>> -{
>> -       return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
>> -}
>> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
>> +                          const struct hantro_fmt *fmt);
>>   
>>   static inline dma_addr_t
>>   hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index f0b68e16fcc0..e3e6df28f470 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -836,6 +836,8 @@ static int hantro_probe(struct platform_device *pdev)
>>          }
>>          vpu->enc_base = vpu->reg_bases[0] + vpu->variant->enc_offset;
>>          vpu->dec_base = vpu->reg_bases[0] + vpu->variant->dec_offset;
>> +       /* by default decoder is G1 */
>> +       vpu->core_hw_dec_rev = HANTRO_G1_REV;
>>   
>>          ret = dma_set_coherent_mask(vpu->dev, DMA_BIT_MASK(32));
>>          if (ret) {
>> diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
>> index 6d2a8f2a8f0b..050880f720d6 100644
>> --- a/drivers/staging/media/hantro/hantro_postproc.c
>> +++ b/drivers/staging/media/hantro/hantro_postproc.c
>> @@ -50,6 +50,23 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = {
>>          .display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
>>   };
>>   
>> +bool hantro_needs_postproc(const struct hantro_ctx *ctx,
>> +                          const struct hantro_fmt *fmt)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +
>> +       if (ctx->is_encoder)
>> +               return false;
>> +
>> +       if (vpu->core_hw_dec_rev == HANTRO_G1_REV)
>> +               return fmt->fourcc != V4L2_PIX_FMT_NV12;
>> +
>> +       if (vpu->core_hw_dec_rev == HANTRO_G2_REV)
>> +               return false;
>> +
>> +       return false;
>> +}
>> +
>>   void hantro_postproc_enable(struct hantro_ctx *ctx)
>>   {
>>          struct hantro_dev *vpu = ctx->dev;
>
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/11] media: hantro: IMX8M: add variant for G2/HEVC codec
  2021-03-03 22:08     ` Ezequiel Garcia
  (?)
@ 2021-03-05  9:32       ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-05  9:32 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 03/03/2021 at 23:08, Ezequiel Garcia wrote:
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Add variant to IMX8M to enable G2/HEVC codec.
>> Define the capabilities for the hardware up to 3840x2160.
>> Retrieve the hardware version at init to distinguish G1 from G2.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 2:
>> - remove useless clocks
>>
>>   drivers/staging/media/hantro/hantro_drv.c   |  1 +
>>   drivers/staging/media/hantro/hantro_hw.h    |  1 +
>>   drivers/staging/media/hantro/imx8m_vpu_hw.c | 95 ++++++++++++++++++++-
>>   3 files changed, 93 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index bc90a52f4d3d..976be7b6ecfb 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -591,6 +591,7 @@ static const struct of_device_id of_hantro_match[] = {
>>   #endif
>>   #ifdef CONFIG_VIDEO_HANTRO_IMX8M
>>          { .compatible = "nxp,imx8mq-vpu", .data = &imx8mq_vpu_variant, },
>> +       { .compatible = "nxp,imx8mq-vpu-g2", .data = &imx8mq_vpu_g2_variant },
>>   #endif
>>          { /* sentinel */ }
>>   };
>> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>> index dade3b0769c1..f61f58da05fe 100644
>> --- a/drivers/staging/media/hantro/hantro_hw.h
>> +++ b/drivers/staging/media/hantro/hantro_hw.h
>> @@ -193,6 +193,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
>>   extern const struct hantro_variant rk3328_vpu_variant;
>>   extern const struct hantro_variant rk3288_vpu_variant;
>>   extern const struct hantro_variant imx8mq_vpu_variant;
>> +extern const struct hantro_variant imx8mq_vpu_g2_variant;
>>   
>>   extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
>>   
>> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> index d5b4312b9391..46b33531be85 100644
>> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> @@ -12,6 +12,7 @@
>>   #include "hantro.h"
>>   #include "hantro_jpeg.h"
>>   #include "hantro_g1_regs.h"
>> +#include "hantro_g2_regs.h"
>>   
>>   static int imx8mq_runtime_resume(struct hantro_dev *vpu)
>>   {
>> @@ -90,6 +91,26 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
>>          },
>>   };
>>   
>> +static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
>> +       {
>> +               .fourcc = V4L2_PIX_FMT_NV12,
>> +               .codec_mode = HANTRO_MODE_NONE,
>> +       },
>> +       {
>> +               .fourcc = V4L2_PIX_FMT_HEVC_SLICE,
>> +               .codec_mode = HANTRO_MODE_HEVC_DEC,
>> +               .max_depth = 2,
>> +               .frmsize = {
>> +                       .min_width = 48,
>> +                       .max_width = 3840,
>> +                       .step_width = MB_DIM,
>> +                       .min_height = 48,
>> +                       .max_height = 2160,
>> +                       .step_height = MB_DIM,
>> +               },
>> +       },
>> +};
>> +
>>   static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>>   {
>>          struct hantro_dev *vpu = dev_id;
>> @@ -108,9 +129,42 @@ static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>>          return IRQ_HANDLED;
>>   }
>>   
>> +static irqreturn_t imx8m_vpu_g2_irq(int irq, void *dev_id)
>> +{
>> +       struct hantro_dev *vpu = dev_id;
>> +       enum vb2_buffer_state state;
>> +       u32 status;
>> +
>> +       status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +       state = (status & HEVC_REG_INTERRUPT_DEC_RDY_INT) ?
>> +                VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
>> +
>> +       vdpu_write(vpu, 0, HEVC_REG_INTERRUPT);
>> +       vdpu_write(vpu, HEVC_REG_CONFIG_DEC_CLK_GATE_E, HEVC_REG_CONFIG);
> Is this clock gate enable needed on each interrupt?

Yes, because if a reset has occurred after init, this is the only
platform-specific piece of code that is called.
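
To illustrate the reasoning, here is a small user-space simulation, offered only as a sketch. The register indices and bit positions are hypothetical (they are not the real HEVC_REG_* values), and the register file is a plain array rather than MMIO. It shows that a reset clears the clock gate enable and that rewriting it in the interrupt handler restores it:

```c
#include <stdint.h>

/* Hypothetical register indices and bits, not the real G2 layout. */
#define SK_REG_INTERRUPT	0
#define SK_REG_CONFIG		1
#define SK_INT_DEC_RDY		(1u << 12)
#define SK_CONFIG_CLK_GATE_E	(1u << 16)

/* Simulated register file standing in for the memory-mapped G2 block. */
struct g2_vpu_sketch {
	uint32_t regs[2];
};

/* A reset returns every register to 0, so the clock gate enable is lost. */
void g2_reset_sketch(struct g2_vpu_sketch *vpu)
{
	vpu->regs[SK_REG_INTERRUPT] = 0;
	vpu->regs[SK_REG_CONFIG] = 0;
}

/*
 * Mirrors the order of operations in the quoted handler: read the status,
 * acknowledge the interrupt, then write the clock gate enable again.
 * Returns 1 when the decode completed, 0 when it should be flagged as an error.
 */
int g2_irq_sketch(struct g2_vpu_sketch *vpu)
{
	uint32_t status = vpu->regs[SK_REG_INTERRUPT];

	vpu->regs[SK_REG_INTERRUPT] = 0;
	vpu->regs[SK_REG_CONFIG] = SK_CONFIG_CLK_GATE_E;

	return (status & SK_INT_DEC_RDY) != 0;
}
```

Whether the interrupt handler is the best place for this write is a separate question; the sketch only shows why the bit has to be restored somewhere after a reset.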

>
>> +
>> +       hantro_irq_done(vpu, state);
>> +
>> +       return IRQ_HANDLED;
>> +}
>> +
>>   static int imx8mq_vpu_hw_init(struct hantro_dev *vpu)
>>   {
>> -       vpu->dec_base = vpu->reg_bases[0];
>> +       int ret;
>> +
>> +       /* Check variant version */
>> +       ret = clk_bulk_prepare_enable(vpu->variant->num_clocks, vpu->clocks);
>> +       if (ret) {
>> +               dev_err(vpu->dev, "Failed to enable clocks\n");
>> +               return ret;
>> +       }
>> +
>> +       /* Make sure the device has been reset before reading its ID */
>> +       ret = device_reset(vpu->dev);
>> +       if (ret)
>> +               dev_err(vpu->dev, "Failed to reset Hantro VPU\n");
>> +
>> +       vpu->core_hw_dec_rev = (vdpu_read(vpu, HEVC_REG_VERSION) >> 16) & 0xffff;
>> +       clk_bulk_disable_unprepare(vpu->variant->num_clocks, vpu->clocks);
>>   
>>          return 0;
>>   }
>> @@ -149,17 +203,32 @@ static const struct hantro_codec_ops imx8mq_vpu_codec_ops[] = {
>>          },
>>   };
>>   
>> +static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = {
>> +       [HANTRO_MODE_HEVC_DEC] = {
>> +               .run = hantro_g2_hevc_dec_run,
>> +               .reset = imx8mq_vpu_reset,
>> +               .init = hantro_hevc_dec_init,
>> +               .exit = hantro_hevc_dec_exit,
>> +       },
>> +};
>> +
>>   /*
>>    * VPU variants.
>>    */
>>   
>>   static const struct hantro_irq imx8mq_irqs[] = {
>>          { "g1", imx8m_vpu_g1_irq },
>> -       { "g2", NULL /* TODO: imx8m_vpu_g2_irq */ },
>>   };
>>   
>> -static const char * const imx8mq_clk_names[] = { "g1", "g2", "bus" };
>> -static const char * const imx8mq_reg_names[] = { "g1", "g2", "ctrl" };
>> +static const struct hantro_irq imx8mq_g2_irqs[] = {
>> +       { "g2", imx8m_vpu_g2_irq },
>> +};
>> +
>> +static const char * const imx8mq_clk_names[] = { "g1", "bus"};
>> +static const char * const imx8mq_reg_names[] = { "g1"};
>> +
>> +static const char * const imx8mq_g2_clk_names[] = { "g2", "bus"};
>> +static const char * const imx8mq_g2_reg_names[] = { "g2"};
>>   
>>   const struct hantro_variant imx8mq_vpu_variant = {
>>          .dec_fmts = imx8m_vpu_dec_fmts,
>> @@ -179,3 +248,21 @@ const struct hantro_variant imx8mq_vpu_variant = {
>>          .reg_names = imx8mq_reg_names,
>>          .num_regs = ARRAY_SIZE(imx8mq_reg_names)
>>   };
>> +
>> +const struct hantro_variant imx8mq_vpu_g2_variant = {
>> +       .dec_offset = 0x0,
>> +       .dec_fmts = imx8m_vpu_g2_dec_fmts,
>> +       .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
>> +       .postproc_fmts = imx8m_vpu_postproc_fmts,
>> +       .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts),
> Is this postproc_fmts correct?

No, I will remove it since G2 doesn't have postproc.

Benjamin

>
> Thanks!
> Ezequiel
>
>> +       .codec = HANTRO_HEVC_DECODER,
>> +       .codec_ops = imx8mq_vpu_g2_codec_ops,
>> +       .init = imx8mq_vpu_hw_init,
>> +       .runtime_resume = imx8mq_runtime_resume,
>> +       .irqs = imx8mq_g2_irqs,
>> +       .num_irqs = ARRAY_SIZE(imx8mq_g2_irqs),
>> +       .clk_names = imx8mq_g2_clk_names,
>> +       .num_clocks = ARRAY_SIZE(imx8mq_g2_clk_names),
>> +       .reg_names = imx8mq_g2_reg_names,
>> +       .num_regs = ARRAY_SIZE(imx8mq_g2_reg_names),
>> +};
>
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/11] media: hantro: IMX8M: add variant for G2/HEVC codec
@ 2021-03-05  9:32       ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-05  9:32 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 03/03/2021 at 23:08, Ezequiel Garcia wrote:
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Add variant to IMX8M to enable G2/HEVC codec.
>> Define the capabilities for the hardware up to 3840x2160.
>> Retrieve the hardware version at init to distinguish G1 from G2.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 2:
>> - remove useless clocks
>>
>>   drivers/staging/media/hantro/hantro_drv.c   |  1 +
>>   drivers/staging/media/hantro/hantro_hw.h    |  1 +
>>   drivers/staging/media/hantro/imx8m_vpu_hw.c | 95 ++++++++++++++++++++-
>>   3 files changed, 93 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index bc90a52f4d3d..976be7b6ecfb 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -591,6 +591,7 @@ static const struct of_device_id of_hantro_match[] = {
>>   #endif
>>   #ifdef CONFIG_VIDEO_HANTRO_IMX8M
>>          { .compatible = "nxp,imx8mq-vpu", .data = &imx8mq_vpu_variant, },
>> +       { .compatible = "nxp,imx8mq-vpu-g2", .data = &imx8mq_vpu_g2_variant },
>>   #endif
>>          { /* sentinel */ }
>>   };
>> diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
>> index dade3b0769c1..f61f58da05fe 100644
>> --- a/drivers/staging/media/hantro/hantro_hw.h
>> +++ b/drivers/staging/media/hantro/hantro_hw.h
>> @@ -193,6 +193,7 @@ extern const struct hantro_variant rk3399_vpu_variant;
>>   extern const struct hantro_variant rk3328_vpu_variant;
>>   extern const struct hantro_variant rk3288_vpu_variant;
>>   extern const struct hantro_variant imx8mq_vpu_variant;
>> +extern const struct hantro_variant imx8mq_vpu_g2_variant;
>>   
>>   extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
>>   
>> diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> index d5b4312b9391..46b33531be85 100644
>> --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
>> @@ -12,6 +12,7 @@
>>   #include "hantro.h"
>>   #include "hantro_jpeg.h"
>>   #include "hantro_g1_regs.h"
>> +#include "hantro_g2_regs.h"
>>   
>>   static int imx8mq_runtime_resume(struct hantro_dev *vpu)
>>   {
>> @@ -90,6 +91,26 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
>>          },
>>   };
>>   
>> +static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
>> +       {
>> +               .fourcc = V4L2_PIX_FMT_NV12,
>> +               .codec_mode = HANTRO_MODE_NONE,
>> +       },
>> +       {
>> +               .fourcc = V4L2_PIX_FMT_HEVC_SLICE,
>> +               .codec_mode = HANTRO_MODE_HEVC_DEC,
>> +               .max_depth = 2,
>> +               .frmsize = {
>> +                       .min_width = 48,
>> +                       .max_width = 3840,
>> +                       .step_width = MB_DIM,
>> +                       .min_height = 48,
>> +                       .max_height = 2160,
>> +                       .step_height = MB_DIM,
>> +               },
>> +       },
>> +};
>> +
>>   static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>>   {
>>          struct hantro_dev *vpu = dev_id;
>> @@ -108,9 +129,42 @@ static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id)
>>          return IRQ_HANDLED;
>>   }
>>   
>> +static irqreturn_t imx8m_vpu_g2_irq(int irq, void *dev_id)
>> +{
>> +       struct hantro_dev *vpu = dev_id;
>> +       enum vb2_buffer_state state;
>> +       u32 status;
>> +
>> +       status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +       state = (status & HEVC_REG_INTERRUPT_DEC_RDY_INT) ?
>> +                VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
>> +
>> +       vdpu_write(vpu, 0, HEVC_REG_INTERRUPT);
>> +       vdpu_write(vpu, HEVC_REG_CONFIG_DEC_CLK_GATE_E, HEVC_REG_CONFIG);
> Is this clock gate enable needed on each interrupt?

Yes, because if a reset has occurred after init, this is the only
platform-specific piece of code that is called.

>
>> +
>> +       hantro_irq_done(vpu, state);
>> +
>> +       return IRQ_HANDLED;
>> +}
>> +
>>   static int imx8mq_vpu_hw_init(struct hantro_dev *vpu)
>>   {
>> -       vpu->dec_base = vpu->reg_bases[0];
>> +       int ret;
>> +
>> +       /* Check variant version */
>> +       ret = clk_bulk_prepare_enable(vpu->variant->num_clocks, vpu->clocks);
>> +       if (ret) {
>> +               dev_err(vpu->dev, "Failed to enable clocks\n");
>> +               return ret;
>> +       }
>> +
>> +       /* Make that the device has been reset before read it id */
>> +       ret = device_reset(vpu->dev);
>> +       if (ret)
>> +               dev_err(vpu->dev, "Failed to reset Hantro VPU\n");
>> +
>> +       vpu->core_hw_dec_rev = (vdpu_read(vpu, HEVC_REG_VERSION) >> 16) & 0xffff;
>> +       clk_bulk_disable_unprepare(vpu->variant->num_clocks, vpu->clocks);
>>   
>>          return 0;
>>   }
>> @@ -149,17 +203,32 @@ static const struct hantro_codec_ops imx8mq_vpu_codec_ops[] = {
>>          },
>>   };
>>   
>> +static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = {
>> +       [HANTRO_MODE_HEVC_DEC] = {
>> +               .run = hantro_g2_hevc_dec_run,
>> +               .reset = imx8mq_vpu_reset,
>> +               .init = hantro_hevc_dec_init,
>> +               .exit = hantro_hevc_dec_exit,
>> +       },
>> +};
>> +
>>   /*
>>    * VPU variants.
>>    */
>>   
>>   static const struct hantro_irq imx8mq_irqs[] = {
>>          { "g1", imx8m_vpu_g1_irq },
>> -       { "g2", NULL /* TODO: imx8m_vpu_g2_irq */ },
>>   };
>>   
>> -static const char * const imx8mq_clk_names[] = { "g1", "g2", "bus" };
>> -static const char * const imx8mq_reg_names[] = { "g1", "g2", "ctrl" };
>> +static const struct hantro_irq imx8mq_g2_irqs[] = {
>> +       { "g2", imx8m_vpu_g2_irq },
>> +};
>> +
>> +static const char * const imx8mq_clk_names[] = { "g1", "bus"};
>> +static const char * const imx8mq_reg_names[] = { "g1"};
>> +
>> +static const char * const imx8mq_g2_clk_names[] = { "g2", "bus"};
>> +static const char * const imx8mq_g2_reg_names[] = { "g2"};
>>   
>>   const struct hantro_variant imx8mq_vpu_variant = {
>>          .dec_fmts = imx8m_vpu_dec_fmts,
>> @@ -179,3 +248,21 @@ const struct hantro_variant imx8mq_vpu_variant = {
>>          .reg_names = imx8mq_reg_names,
>>          .num_regs = ARRAY_SIZE(imx8mq_reg_names)
>>   };
>> +
>> +const struct hantro_variant imx8mq_vpu_g2_variant = {
>> +       .dec_offset = 0x0,
>> +       .dec_fmts = imx8m_vpu_g2_dec_fmts,
>> +       .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts),
>> +       .postproc_fmts = imx8m_vpu_postproc_fmts,
>> +       .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts),
> Is this postproc_fmts correct?

No, I will remove it since G2 doesn't have postproc.

Benjamin

>
> Thanks!
> Ezequiel
>
>> +       .codec = HANTRO_HEVC_DECODER,
>> +       .codec_ops = imx8mq_vpu_g2_codec_ops,
>> +       .init = imx8mq_vpu_hw_init,
>> +       .runtime_resume = imx8mq_runtime_resume,
>> +       .irqs = imx8mq_g2_irqs,
>> +       .num_irqs = ARRAY_SIZE(imx8mq_g2_irqs),
>> +       .clk_names = imx8mq_g2_clk_names,
>> +       .num_clocks = ARRAY_SIZE(imx8mq_g2_clk_names),
>> +       .reg_names = imx8mq_g2_reg_names,
>> +       .num_regs = ARRAY_SIZE(imx8mq_g2_reg_names),
>> +};
>
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 10/11] dt-bindings: media: nxp,imx8mq-vpu: Update bindings
  2021-03-03 11:39   ` Benjamin Gaignard
  (?)
@ 2021-03-08 20:08     ` Rob Herring
  -1 siblings, 0 replies; 66+ messages in thread
From: Rob Herring @ 2021-03-08 20:08 UTC (permalink / raw)
  To: Benjamin Gaignard
  Cc: p.zabel, peng.fan, linux-arm-kernel, linux-imx, linux-rockchip,
	shawnguo, mchehab, linux-kernel, robh+dt, dan.carpenter, kernel,
	gregkh, kernel, devicetree, wens, festevam, s.hauer, ezequiel,
	hverkuil-cisco, mripard, paul.kocialkowski, linux-media,
	jernej.skrabec

On Wed, 03 Mar 2021 12:39:51 +0100, Benjamin Gaignard wrote:
> The current bindings seem to assume that the
> two VPU hardware blocks (G1 and G2) share a single set of
> registers.
> Implementing the VPU reset driver and the G2 decoder driver
> showed that all the VPUs are independent and don't need to
> know about the registers of the other blocks.
> Remove from the bindings the need to set the registers of all blocks,
> but keep the reg-names property because removing it from the driver
> may affect other variants.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 4:
> - rebase the change on top of VPU reset patches:
>   https://www.spinics.net/lists/arm-kernel/msg878440.html
> 
> version 2:
> - be more verbose about why I change the bindings
> Keep in mind that series comes after: https://www.spinics.net/lists/arm-kernel/msg875766.html
> without that review and ack it won't work
> 
>  .../bindings/media/nxp,imx8mq-vpu.yaml        | 46 ++++++++++++-------
>  1 file changed, 30 insertions(+), 16 deletions(-)
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
  2021-03-03 11:39   ` Benjamin Gaignard
  (?)
@ 2021-03-16 18:46     ` Ezequiel Garcia
  -1 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-16 18:46 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

Hi Benjamin,

The series is looking really good. Some comments below.

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Implement all the logic to get the G2 hardware decoding HEVC frames.
> It supports HEVC streams up to level 5.1.
> It doesn't yet support 10-bit formats or the scaling feature.
> 
> Add a HANTRO HEVC dedicated control to skip some bits at the beginning
> of the slice header. That is very specific to this hardware so it can't
> go into the uapi structures. Computing the needed value is complex and requires
> information from the stream that only userland knows, so let it
> provide the correct value to the driver.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 4:
> - fix Ezequiel comments
> - use dedicated control as an integer
> - change hantro_g2_hevc_dec_run prototype to return errors
> 
> version 2:
> - squash multiple commits in this one.
> - fix the comments done by Ezequiel about dma_alloc_coherent usage
> - fix Dan's comments about control copy, reverse the test logic
> in tile_buffer_reallocate, rework some goto and return cases.
> 
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  18 +
>  drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>  .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>  drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>  7 files changed, 1228 insertions(+)
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>  create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> 
> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> index 743ce08eb184..0357f1772267 100644
> --- a/drivers/staging/media/hantro/Makefile
> +++ b/drivers/staging/media/hantro/Makefile
> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>                 hantro_h1_jpeg_enc.o \
>                 hantro_g1_h264_dec.o \
>                 hantro_g1_mpeg2_dec.o \
> +               hantro_g2_hevc_dec.o \
>                 hantro_g1_vp8_dec.o \
>                 rk3399_vpu_hw_jpeg_enc.o \
>                 rk3399_vpu_hw_mpeg2_dec.o \
>                 rk3399_vpu_hw_vp8_dec.o \
>                 hantro_jpeg.o \
>                 hantro_h264.o \
> +               hantro_hevc.o \
>                 hantro_mpeg2.o \
>                 hantro_vp8.o
>  
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index 05876e426419..a9b80b2c9124 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -225,6 +225,7 @@ struct hantro_dev {
>   * @jpeg_enc:          JPEG-encoding context.
>   * @mpeg2_dec:         MPEG-2-decoding context.
>   * @vp8_dec:           VP8-decoding context.
> + * @hevc_dec:          HEVC-decoding context.
>   */
>  struct hantro_ctx {
>         struct hantro_dev *dev;
> @@ -251,6 +252,7 @@ struct hantro_ctx {
>                 struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>                 struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>                 struct hantro_vp8_dec_hw_ctx vp8_dec;
> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>         };
>  };
>  
> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>         return vb2_dma_contig_plane_dma_addr(vb, 0);
>  }
>  
> +static inline size_t
> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].size;
> +       return vb2_plane_size(vb, 0);
> +}
> +
> +static inline void *
> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].cpu;
> +       return vb2_plane_vaddr(vb, 0);
> +}
> +

Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

>  void hantro_postproc_disable(struct hantro_ctx *ctx);
>  void hantro_postproc_enable(struct hantro_ctx *ctx);
>  void hantro_postproc_free(struct hantro_ctx *ctx);
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index e3e6df28f470..bc90a52f4d3d 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -30,6 +30,13 @@
>  
>  #define DRIVER_NAME "hantro-vpu"
>  
> +/*
> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> + * the number of data (in bits) to skip in the
> + * slice segment header syntax after 'slice type' token
> + */

I think we need to document this better, so applications can
correctly use the control. From i.MX reference code, it seems
this needs to be used as follows:

If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
to before syntax element "slice_temporal_mvp_enabled_flag".

If IDR, the skipped bits are just "pic_output_flag"
(separate_colour_plane_flag is not supported).

And it seems this value only needs to be computed while parsing the
first slice, given that this syntax remains invariant across all slices.
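
To make the rule above concrete, here is a rough userspace-side sketch.
It assumes the application's slice header parser records bit offsets
while it walks the header. The struct, its fields, and the function name
are hypothetical, not taken from the i.MX reference code, the driver, or
the uAPI, and the sketch does not decide how the hardware expects
emulation prevention bytes to be counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical record filled in by a userspace slice segment header
 * parser; none of these names exist in the driver or the uAPI.
 */
struct slice_hdr_bits {
	bool is_idr;                       /* NAL unit type is IDR_W_RADL or IDR_N_LP */
	uint32_t pic_output_flag_len;      /* 1 if pic_output_flag is coded, else 0 */
	uint32_t pic_output_flag_start;    /* bit offset just after slice_type */
	uint32_t slice_temporal_mvp_start; /* bit offset where slice_temporal_mvp_enabled_flag would be read */
};

/* Candidate value for V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP, in bits. */
static uint32_t hevc_slice_header_skip_bits(const struct slice_hdr_bits *h)
{
	/* IDR: only pic_output_flag (if coded) sits in the skipped region. */
	if (h->is_idr)
		return h->pic_output_flag_len;

	/*
	 * Non-IDR: everything from pic_output_flag up to, but excluding,
	 * slice_temporal_mvp_enabled_flag.
	 */
	assert(h->slice_temporal_mvp_start >= h->pic_output_flag_start);
	return h->slice_temporal_mvp_start - h->pic_output_flag_start;
}
```

The application would presumably compute this once, from the first slice
of the frame, and set the control with that value before queueing the
request.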

> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> +
>  int hantro_debug;
>  module_param_named(debug, hantro_debug, int, 0644);
>  MODULE_PARM_DESC(debug,
> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>         return 0;
>  }
>  
> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> +{
> +       struct hantro_ctx *ctx;
> +
> +       ctx = container_of(ctrl->handler,
> +                          struct hantro_ctx, ctrl_handler);
> +
> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> +
> +       switch (ctrl->id) {
> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
>  static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>         .try_ctrl = hantro_try_ctrl,
>  };
> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>         .s_ctrl = hantro_jpeg_s_ctrl,
>  };
>  
> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> +       .s_ctrl = hantro_hevc_s_ctrl,
> +};
> +
>  static const struct hantro_ctrl controls[] = {
>         {
>                 .codec = HANTRO_JPEG_ENCODER,
> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>                 .cfg = {
>                         .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>                 },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> +                       .name = "Hantro HEVC slice header skip bytes",
> +                       .type = V4L2_CTRL_TYPE_INTEGER,
> +                       .min = 0,
> +                       .def = 0,
> +                       .max = 0x7fffffff,
> +                       .step = 1,
> +                       .ops = &hantro_hevc_ctrl_ops,
> +               },
> +       }, {
> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> +                        HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_USER_CLASS,

This shouldn't be here. Is V4L2_CID_USER_CLASS required by
v4l2-compliance or by the spec?

> +                       .name = "HANTRO controls",
> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> +               },
>         },
>  };
>  
> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> new file mode 100644
> index 000000000000..5d75b36bc40c
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> @@ -0,0 +1,587 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include "hantro_hw.h"
> +#include "hantro_g2_regs.h"
> +
> +#define HEVC_DEC_MODE  0xC
> +
> +#define BUS_WIDTH_32           0
> +#define BUS_WIDTH_64           1
> +#define BUS_WIDTH_128          2
> +#define BUS_WIDTH_256          3
> +
> +static inline void hantro_write_addr(struct hantro_dev *vpu,
> +                                    unsigned long offset,
> +                                    dma_addr_t addr)
> +{
> +       vdpu_write(vpu, addr & 0xffffffff, offset);
> +}
> +
> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> +       unsigned int max_log2_ctb_size, ctb_size;
> +       bool tiles_enabled, uniform_spacing;
> +       u32 no_chroma = 0;
> +
> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> +
> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> +
> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> +                           sps->log2_diff_max_min_luma_coding_block_size;
> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> +                            >> max_log2_ctb_size;
> +       ctb_size = 1 << max_log2_ctb_size;
> +
> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> +
> +       if (tiles_enabled) {
> +               unsigned int i, j, h;
> +
> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> +
> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> +
> +               /* write width + height for each tile in pic */
> +               if (!uniform_spacing) {
> +                       u32 tmp_w = 0, tmp_h = 0;
> +
> +                       for (i = 0; i < num_tile_rows; i++) {
> +                               if (i == num_tile_rows - 1)
> +                                       h = pic_height_in_ctbs - tmp_h;
> +                               else
> +                                       h = pps->row_height_minus1[i] + 1;
> +                               tmp_h += h;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> +                                       tmp_w += pps->column_width_minus1[j] + 1;
> +                                       *p++ = pps->column_width_minus1[j + 1];
> +                                       *p++ = h;
> +                                       if (i == 0 && h == 1 && ctb_size == 16)
> +                                               no_chroma = 1;
> +                               }
> +                               /* last column */
> +                               *p++ = pic_width_in_ctbs - tmp_w;
> +                               *p++ = h;
> +                       }
> +               } else { /* uniform spacing */
> +                       u32 tmp, prev_h, prev_w;
> +
> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> +                               h = tmp - prev_h;
> +                               prev_h = tmp;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> +                                       *p++ = tmp - prev_w;
> +                                       *p++ = h;
> +                                       if (j == 0 &&
> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> +                                           ctb_size == 16)
> +                                               no_chroma = 1;
> +                                       prev_w = tmp;
> +                               }
> +                       }
> +               }
> +       } else {
> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> +
> +               /* There's one tile, with dimensions equal to pic size. */
> +               p[0] = pic_width_in_ctbs;
> +               p[1] = pic_height_in_ctbs;
> +       }
> +
> +       if (no_chroma)
> +               vpu_debug(1, "%s: no chroma!\n", __func__);
> +}
> +
> +static void set_params(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       struct hantro_dev *vpu = ctx->dev;
> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> +       u32 pic_width_aligned, pic_height_aligned;
> +       u32 partial_ctb_x, partial_ctb_y;
> +
> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> +
> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> +
> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> +
> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> +
> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> +
> +       min_cb_size = 1 << min_log2_cb_size;
> +       max_ctb_size = 1 << max_log2_ctb_size;
> +
> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> +
> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> +
> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> +
> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> +                        sps->max_transform_hierarchy_depth_inter);
> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> +                        sps->max_transform_hierarchy_depth_intra);
> +       hantro_reg_write(vpu, hevc_min_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_max_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> +                        sps->log2_diff_max_min_luma_transform_block_size);
> +
> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> +       hantro_reg_write(vpu, hevc_asym_pred_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> +       hantro_reg_write(vpu, hevc_sao_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> +       hantro_reg_write(vpu, hevc_sign_data_hide,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> +       }
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> +       hantro_reg_write(vpu, hevc_transq_bypass,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> +       hantro_reg_write(vpu, hevc_list_mod_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_idr_pic_e,
> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> +       hantro_reg_write(vpu, hevc_parallel_merge,
> +                        pps->log2_parallel_merge_level_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> +       hantro_reg_write(vpu, hevc_pcm_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> +               hantro_reg_write(vpu, hevc_max_pcm_size,
> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_min_pcm_size,
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> +       } else {
> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> +       hantro_reg_write(vpu, hevc_weight_pred_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_const_intra_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> +       hantro_reg_write(vpu, hevc_transform_skip,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> +       hantro_reg_write(vpu, hevc_dependent_slice,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> +       hantro_reg_write(vpu, hevc_filter_override,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> +       hantro_reg_write(vpu, hevc_refidx0_active,
> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_refidx1_active,
> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> +}
> +
> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> +{
> +       int i;
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> +                       return i;
> +       }
> +
> +       return 0x0;
> +}
> +
> +static void set_ref_pic_list(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       const struct hantro_reg *ref_pic_regs0[] = {
> +               hevc_rlist_f0,
> +               hevc_rlist_f1,
> +               hevc_rlist_f2,
> +               hevc_rlist_f3,
> +               hevc_rlist_f4,
> +               hevc_rlist_f5,
> +               hevc_rlist_f6,
> +               hevc_rlist_f7,
> +               hevc_rlist_f8,
> +               hevc_rlist_f9,
> +               hevc_rlist_f10,
> +               hevc_rlist_f11,
> +               hevc_rlist_f12,
> +               hevc_rlist_f13,
> +               hevc_rlist_f14,
> +               hevc_rlist_f15,
> +       };
> +       const struct hantro_reg *ref_pic_regs1[] = {
> +               hevc_rlist_b0,
> +               hevc_rlist_b1,
> +               hevc_rlist_b2,
> +               hevc_rlist_b3,
> +               hevc_rlist_b4,
> +               hevc_rlist_b5,
> +               hevc_rlist_b6,
> +               hevc_rlist_b7,
> +               hevc_rlist_b8,
> +               hevc_rlist_b9,
> +               hevc_rlist_b10,
> +               hevc_rlist_b11,
> +               hevc_rlist_b12,
> +               hevc_rlist_b13,
> +               hevc_rlist_b14,
> +               hevc_rlist_b15,
> +       };
> +       unsigned int i, j;
> +
> +       /* List 0 contains: short term before, short term after and long term */
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       /* Fill the list, copying over and over */
> +       i = 0;
> +       while (j < ARRAY_SIZE(list0))
> +               list0[j++] = list0[i++];
> +
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       i = 0;
> +       while (j < ARRAY_SIZE(list1))
> +               list1[j++] = list1[i++];
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> +       }
> +}
> +
> +static int set_ref(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> +       struct hantro_dev *vpu = ctx->dev;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> +       u32 max_ref_frames;
> +       u16 dpb_longterm_e;
> +
> +       const struct hantro_reg *cur_poc[] = {
> +               hevc_cur_poc_00,
> +               hevc_cur_poc_01,
> +               hevc_cur_poc_02,
> +               hevc_cur_poc_03,
> +               hevc_cur_poc_04,
> +               hevc_cur_poc_05,
> +               hevc_cur_poc_06,
> +               hevc_cur_poc_07,
> +               hevc_cur_poc_08,
> +               hevc_cur_poc_09,
> +               hevc_cur_poc_10,
> +               hevc_cur_poc_11,
> +               hevc_cur_poc_12,
> +               hevc_cur_poc_13,
> +               hevc_cur_poc_14,
> +               hevc_cur_poc_15,
> +       };
> +       unsigned int i;
> +
> +       max_ref_frames = decode_params->num_poc_lt_curr +
> +               decode_params->num_poc_st_curr_before +
> +               decode_params->num_poc_st_curr_after;
> +       /*
> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> +        * badly marked I-frames.
> +        */
> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> +       hantro_reg_write(vpu, hevc_filter_over_slices,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> +
> +       /*
> +        * Write POC count diff from current pic. For frame decoding only compute
> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> +        */
> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> +
> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> +       }
> +
> +       if (i < ARRAY_SIZE(cur_poc)) {
> +               /*
> +                * After the references, fill one entry pointing to itself,
> +                * i.e. difference is zero.
> +                */
> +               hantro_reg_write(vpu, cur_poc[i], 0);
> +               i++;
> +       }
> +
> +       /* Fill the rest with the current picture */
> +       for (; i < ARRAY_SIZE(cur_poc); i++)
> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> +
> +       set_ref_pic_list(ctx);
> +
> +       /* We will only keep the references picture that are still used */
> +       ctx->hevc_dec.ref_bufs_used = 0;
> +
> +       /* Set up addresses of DPB buffers */
> +       dpb_longterm_e = 0;
> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> +               if (!luma_addr)
> +                       return -ENOMEM;
> +
> +               chroma_addr = luma_addr + cr_offset;
> +               mv_addr = luma_addr + mv_offset;
> +
> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> +
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> +       }
> +
> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> +       if (!luma_addr)
> +               return -ENOMEM;
> +
> +       chroma_addr = luma_addr + cr_offset;
> +       mv_addr = luma_addr + mv_offset;
> +
> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> +
> +       hantro_hevc_ref_remove_unused(ctx);
> +
> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> +
> +       return 0;
> +}
> +
> +static void set_buffers(struct hantro_ctx *ctx)
> +{
> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       dma_addr_t src_dma, dst_dma;
> +       u32 src_len, src_buf_len;
> +
> +       src_buf = hantro_get_src_buf(ctx);
> +       dst_buf = hantro_get_dst_buf(ctx);
> +
> +       /* Source (stream) buffer. */
> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> +
> +       /* Destination (decoded frame) buffer. */
> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> +
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> +}
> +
> +void hantro_g2_check_idle(struct hantro_dev *vpu)
> +{
> +       int i;
> +
> +       for (i = 0; i < 3; i++) {
> +               u32 status;
> +
> +               /* Make sure the VPU is idle */
> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);

How about we clean up this pr_warn? Use either v4l2_warn or dev_warn, and
change the message to "device still running, aborting" (I personally dislike
the abort metaphor, but I guess it's OK here).
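
To illustrate the shape I have in mind, here is a rough userspace sketch, not
kernel code. struct fake_vpu, fake_dev_warn() and check_idle() are made-up
stand-ins (for the hantro device, dev_warn() and hantro_g2_check_idle()); only
the bit positions are taken from the register header in this patch. The retry
loop is left out to keep it short.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions copied from HEVC_REG_INTERRUPT_* in this patch. */
#define INT_DEC_E        (1u << 0)
#define INT_DEC_IRQ_DIS  (1u << 4)
#define INT_DEC_ABORT_E  (1u << 5)

/* Hypothetical stand-in for the device: a name and one fake register. */
struct fake_vpu {
	const char *name;
	uint32_t irq_reg;
	unsigned int warnings;
};

/* Stand-in for dev_warn(): the message is prefixed with the device name. */
static void fake_dev_warn(struct fake_vpu *vpu, const char *msg)
{
	vpu->warnings++;
	fprintf(stderr, "%s: %s\n", vpu->name, msg);
}

/* Single-pass version of the idle check, with the reworded warning. */
static void check_idle(struct fake_vpu *vpu)
{
	uint32_t status = vpu->irq_reg;

	if (status & INT_DEC_E) {
		fake_dev_warn(vpu, "device still running, aborting");
		status |= INT_DEC_ABORT_E | INT_DEC_IRQ_DIS;
		vpu->irq_reg = status;
	}
}
```

The point is only that the message carries the device name and says what the
driver does next; in the driver itself that would presumably be dev_warn() on
the device pointer held by the vpu structure.
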

> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> +               }
> +       }
> +}
> +
> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       int ret;
> +
> +       hantro_g2_check_idle(vpu);
> +
> +       /* Prepare HEVC decoder context. */
> +       ret = hantro_hevc_dec_prepare_run(ctx);
> +       if (ret)
> +               return ret;
> +
> +       /* Configure hardware registers. */
> +       set_params(ctx);
> +
> +       /* set reference pictures */
> +       ret = set_ref(ctx);
> +       if (ret)
> +               return ret;
> +
> +       set_buffers(ctx);
> +       prepare_tile_info_buffer(ctx);
> +
> +       hantro_end_prepare_run(ctx);
> +
> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> +
> +       /* Don't disable output */
> +       hantro_reg_write(vpu, hevc_out_dis, 0);
> +
> +       /* Don't compress buffers */
> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> +
> +       /* use NV12 as output format */
> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> +
> +       /* Bus width and max burst */
> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> +       hantro_reg_write(vpu, hevc_max_burst, 16);
> +
> +       /* Swap */
> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> +
> +       /* Start decoding! */
> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> +
> +       return 0;
> +}
> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> new file mode 100644
> index 000000000000..a361c9ba911d
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> @@ -0,0 +1,198 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2021, Collabora
> + *
> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> + */
> +
> +#ifndef HANTRO_G2_REGS_H_
> +#define HANTRO_G2_REGS_H_
> +
> +#include "hantro.h"
> +
> +#define G2_SWREG(nr)   ((nr) * 4)
> +
> +#define HEVC_DEC_REG(name, base, shift, mask) \
> +       static const struct hantro_reg _hevc_##name[] = { \
> +               { G2_SWREG(base), (shift), (mask) } \
> +       }; \
> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
> +
> +#define HEVC_REG_VERSION               G2_SWREG(0)
> +
> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> +
> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> +
> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> +
> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> +
> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> +
> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> +
> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> +
> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> +
> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> +
> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> +
> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> +
> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> +
> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
> +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
> +
> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> +
> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> +
> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> +
> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
> +
> +#define HEVC_REG_CONFIG                                G2_SWREG(58)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> +
> +#define HEVC_ADDR_DST          (G2_SWREG(65))
> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> +#define HEVC_ADDR_STR          (G2_SWREG(169))
> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> +#define HEVC_TILE_SAO          (G2_SWREG(181))
> +#define HEVC_TILE_BSD          (G2_SWREG(183))
> +
> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
> +
> +#endif
> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> new file mode 100644
> index 000000000000..8e319a837ff3
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_hevc.c
> @@ -0,0 +1,321 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include <linux/types.h>
> +#include <media/v4l2-mem2mem.h>
> +
> +#include "hantro.h"
> +#include "hantro_hw.h"
> +
> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> +/*
> + * BSD control data of current picture at tile border
> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> + */
> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> +/* tile border coefficients of filter */
> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> +
> +#define MAX_TILE_COLS 20
> +#define MAX_TILE_ROWS 22
> +
> +#define UNUSED_REF     -1
> +
> +#define G2_ALIGN               16
> +#define MC_WORD_SIZE           32
> +
> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> +
> +       return sps->pic_width_in_luma_samples *
> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> +}
> +
> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +
> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> +}
> +
> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> +       size_t mv_size;
> +
> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> +                 (1 << (2 * (8 - 4))) * 16) + 32;
> +
> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> +
> +       return mv_size;
> +}
> +
> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +
> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> +}
> +
> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       struct hantro_dev *vpu = ctx->dev;
> +       int i;
> +
> +       /* Just tag buffer as unused, do not free them */

This comment seems wrong.

> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs[i].cpu) {
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));

Is this memset clearing the buffer required? If we're getting artifacts
from previous decodes, then that would be more of a bug somewhere.

> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> +                                         hevc_dec->ref_bufs[i].cpu,
> +                                         hevc_dec->ref_bufs[i].dma);
> +               }
> +       }
> +}
> +
> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> +}
> +
> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> +                                  int poc)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       /* Find the reference buffer in already know ones */
> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       return hevc_dec->ref_bufs[i].dma;
> +               }
> +       }
> +
> +       /* Allocate a new reference buffer */
> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> +                       if (!hevc_dec->ref_bufs[i].cpu) {
> +                               struct hantro_dev *vpu = ctx->dev;
> +
> +                               hevc_dec->ref_bufs[i].cpu =
> +                                       dma_alloc_coherent(vpu->dev,
> +                                                          hantro_hevc_ref_size(ctx),
> +                                                          &hevc_dec->ref_bufs[i].dma,
> +                                                          GFP_KERNEL);

Is there any reason why we need to allocate reference buffers and MV contiguously?

> +                               if (!hevc_dec->ref_bufs[i].cpu)
> +                                       return 0;
> +
> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> +                       }
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));


I believe the coherent allocation is there so that each reference can be cleared, but is this
really needed? I recall that maybe only the MV buffer needs clearing; can you try that?

Also, if that's the case, then allocating the MV buffer separately would let us
avoid allocating the reference buffers coherently (note that we use NO_MAPPING
in the vb2_queue, so the vb2_buffers shouldn't be coherent).
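
For scale, here is a rough userspace sketch of the sizing arithmetic from the
quoted hantro_hevc.c. The helper names (align_up, ref_chroma_offset,
ref_mv_offset, ref_mv_size) are made up for illustration and are not driver
API; the constants and shifts are copied from the patch as posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN	16
#define MC_WORD_SIZE	32

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Mirrors hantro_hevc_chroma_offset(): size of the luma plane in bytes. */
static size_t ref_chroma_offset(uint32_t width, uint32_t height,
				unsigned int bit_depth_luma_minus8)
{
	size_t bytes_per_pixel = bit_depth_luma_minus8 == 0 ? 1 : 2;

	return (size_t)width * height * bytes_per_pixel;
}

/* Mirrors hantro_hevc_motion_vectors_offset(): where the MV data starts. */
static size_t ref_mv_offset(uint32_t width, uint32_t height,
			    unsigned int bit_depth_luma_minus8)
{
	size_t cr_offset = ref_chroma_offset(width, height,
					     bit_depth_luma_minus8);

	return align_up(cr_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Mirrors hantro_hevc_mv_size(): bytes needed for the MV data alone. */
static size_t ref_mv_size(uint32_t width, uint32_t height)
{
	uint32_t w_blocks = (width + (1u << 8) - 1) >> 8;
	uint32_t h_blocks = (height + (1u << 8) - 1) >> 8;

	return (size_t)w_blocks * h_blocks * (1u << (2 * (8 - 4))) * 16 + 32;
}
```

For an 8-bit 1920x1080 stream this gives an MV offset of 3110432 bytes and an
MV size of 163872 bytes, i.e. 3274304 bytes per reference today. If only the
MV part needs clearing, the per-reference memset would shrink to the 163872
bytes of a separately allocated MV buffer.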

Thanks,
Ezequiel



* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 18:46     ` Ezequiel Garcia
  0 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-16 18:46 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

Hi Benjamin,

The series is looking really good. Some comments below.

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Implement all the logic to get G2 hardware decoding HEVC frames.
> It supports HEVC streams up to level 5.1.
> It doesn't yet support 10-bit formats or the scaling feature.
> 
> Add a HANTRO HEVC dedicated control to skip some bits at the beginning
> of the slice header. That is very specific to this hardware so it can't
> go into the uapi structures. Computing the needed value is complex and requires
> information from the stream that only userland knows, so let it
> provide the correct value to the driver.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 4:
> - fix Ezequiel comments
> - use dedicated control as an integer
> - change hantro_g2_hevc_dec_run prototype to return errors
> 
> version 2:
> - squash multiple commits in this one.
> - fix the comments done by Ezequiel about dma_alloc_coherent usage
> - fix Dan's comments about control copy, reverse the test logic
> in tile_buffer_reallocate, rework some goto and return cases.
> 
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  18 +
>  drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>  .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>  drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>  7 files changed, 1228 insertions(+)
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>  create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> 
> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> index 743ce08eb184..0357f1772267 100644
> --- a/drivers/staging/media/hantro/Makefile
> +++ b/drivers/staging/media/hantro/Makefile
> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>                 hantro_h1_jpeg_enc.o \
>                 hantro_g1_h264_dec.o \
>                 hantro_g1_mpeg2_dec.o \
> +               hantro_g2_hevc_dec.o \
>                 hantro_g1_vp8_dec.o \
>                 rk3399_vpu_hw_jpeg_enc.o \
>                 rk3399_vpu_hw_mpeg2_dec.o \
>                 rk3399_vpu_hw_vp8_dec.o \
>                 hantro_jpeg.o \
>                 hantro_h264.o \
> +               hantro_hevc.o \
>                 hantro_mpeg2.o \
>                 hantro_vp8.o
>  
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index 05876e426419..a9b80b2c9124 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -225,6 +225,7 @@ struct hantro_dev {
>   * @jpeg_enc:          JPEG-encoding context.
>   * @mpeg2_dec:         MPEG-2-decoding context.
>   * @vp8_dec:           VP8-decoding context.
> + * @hevc_dec:          HEVC-decoding context.
>   */
>  struct hantro_ctx {
>         struct hantro_dev *dev;
> @@ -251,6 +252,7 @@ struct hantro_ctx {
>                 struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>                 struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>                 struct hantro_vp8_dec_hw_ctx vp8_dec;
> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>         };
>  };
>  
> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>         return vb2_dma_contig_plane_dma_addr(vb, 0);
>  }
>  
> +static inline size_t
> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].size;
> +       return vb2_plane_size(vb, 0);
> +}
> +
> +static inline void *
> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].cpu;
> +       return vb2_plane_vaddr(vb, 0);
> +}
> +

Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

>  void hantro_postproc_disable(struct hantro_ctx *ctx);
>  void hantro_postproc_enable(struct hantro_ctx *ctx);
>  void hantro_postproc_free(struct hantro_ctx *ctx);
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index e3e6df28f470..bc90a52f4d3d 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -30,6 +30,13 @@
>  
>  #define DRIVER_NAME "hantro-vpu"
>  
> +/*
> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> + * the number of data (in bits) to skip in the
> + * slice segment header syntax after 'slice type' token
> + */

I think we need to document this better, so applications can
correctly use the control. From i.MX reference code, it seems
this needs to be used as follows:

If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
to before syntax element "slice_temporal_mvp_enabled_flag".

If IDR, the skipped bits are just "pic_output_flag"
(separate_colour_plane_flag is not supported).

And it seems this only needs to be computed once, by parsing the first slice,
given that this syntax remains invariant across all the slices.
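
To make the rule above concrete, here is a minimal sketch of how an
application might derive the control value. It reflects my reading of the
reference code, not a confirmed specification. The struct and function names
are hypothetical, and the bit offsets are assumed to come from the
application's own slice header parser; whether emulation prevention bytes
need special handling is not addressed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary kept by a userspace slice segment header parser. */
struct slice_hdr_bits {
	bool is_idr;			/* IDR_W_RADL or IDR_N_LP NAL unit */
	bool output_flag_present;	/* PPS output_flag_present_flag */
	/* Bit offset at which pic_output_flag is (or would be) coded. */
	uint32_t pic_output_flag_pos;
	/* Bit offset at which slice_temporal_mvp_enabled_flag would start. */
	uint32_t temporal_mvp_flag_pos;
};

/*
 * Number of bits to skip after slice_type, per the rule described above:
 * - IDR: only pic_output_flag, which is coded only when the PPS says so.
 * - non-IDR: everything from pic_output_flag up to, but not including,
 *   slice_temporal_mvp_enabled_flag.
 */
static uint32_t hevc_slice_header_skip_bits(const struct slice_hdr_bits *h)
{
	if (h->is_idr)
		return h->output_flag_present ? 1 : 0;

	assert(h->temporal_mvp_flag_pos >= h->pic_output_flag_pos);
	return h->temporal_mvp_flag_pos - h->pic_output_flag_pos;
}
```

The result would then be set through V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP
before queuing the request for that picture.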

> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> +
>  int hantro_debug;
>  module_param_named(debug, hantro_debug, int, 0644);
>  MODULE_PARM_DESC(debug,
> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>         return 0;
>  }
>  
> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> +{
> +       struct hantro_ctx *ctx;
> +
> +       ctx = container_of(ctrl->handler,
> +                          struct hantro_ctx, ctrl_handler);
> +
> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> +
> +       switch (ctrl->id) {
> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
>  static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>         .try_ctrl = hantro_try_ctrl,
>  };
> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>         .s_ctrl = hantro_jpeg_s_ctrl,
>  };
>  
> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> +       .s_ctrl = hantro_hevc_s_ctrl,
> +};
> +
>  static const struct hantro_ctrl controls[] = {
>         {
>                 .codec = HANTRO_JPEG_ENCODER,
> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>                 .cfg = {
>                         .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>                 },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> +                       .name = "Hantro HEVC slice header skip bytes",
> +                       .type = V4L2_CTRL_TYPE_INTEGER,
> +                       .min = 0,
> +                       .def = 0,
> +                       .max = 0x7fffffff,
> +                       .step = 1,
> +                       .ops = &hantro_hevc_ctrl_ops,
> +               },
> +       }, {
> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> +                        HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_USER_CLASS,

This shouldn't be here. Is this V4L2_CID_USER_CLASS entry required by
v4l2-compliance or by the spec?

> +                       .name = "HANTRO controls",
> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> +               },
>         },
>  };
>  
> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> new file mode 100644
> index 000000000000..5d75b36bc40c
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> @@ -0,0 +1,587 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include "hantro_hw.h"
> +#include "hantro_g2_regs.h"
> +
> +#define HEVC_DEC_MODE  0xC
> +
> +#define BUS_WIDTH_32           0
> +#define BUS_WIDTH_64           1
> +#define BUS_WIDTH_128          2
> +#define BUS_WIDTH_256          3
> +
> +static inline void hantro_write_addr(struct hantro_dev *vpu,
> +                                    unsigned long offset,
> +                                    dma_addr_t addr)
> +{
> +       vdpu_write(vpu, addr & 0xffffffff, offset);
> +}
> +
> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> +       unsigned int max_log2_ctb_size, ctb_size;
> +       bool tiles_enabled, uniform_spacing;
> +       u32 no_chroma = 0;
> +
> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> +
> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> +
> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> +                           sps->log2_diff_max_min_luma_coding_block_size;
> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> +                            >> max_log2_ctb_size;
> +       ctb_size = 1 << max_log2_ctb_size;
> +
> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> +
> +       if (tiles_enabled) {
> +               unsigned int i, j, h;
> +
> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> +
> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> +
> +               /* write width + height for each tile in pic */
> +               if (!uniform_spacing) {
> +                       u32 tmp_w = 0, tmp_h = 0;
> +
> +                       for (i = 0; i < num_tile_rows; i++) {
> +                               if (i == num_tile_rows - 1)
> +                                       h = pic_height_in_ctbs - tmp_h;
> +                               else
> +                                       h = pps->row_height_minus1[i] + 1;
> +                               tmp_h += h;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> +                                       tmp_w += pps->column_width_minus1[j] + 1;
> +                                       *p++ = pps->column_width_minus1[j + 1];
> +                                       *p++ = h;
> +                                       if (i == 0 && h == 1 && ctb_size == 16)
> +                                               no_chroma = 1;
> +                               }
> +                               /* last column */
> +                               *p++ = pic_width_in_ctbs - tmp_w;
> +                               *p++ = h;
> +                       }
> +               } else { /* uniform spacing */
> +                       u32 tmp, prev_h, prev_w;
> +
> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> +                               h = tmp - prev_h;
> +                               prev_h = tmp;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> +                                       *p++ = tmp - prev_w;
> +                                       *p++ = h;
> +                                       if (j == 0 &&
> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> +                                           ctb_size == 16)
> +                                               no_chroma = 1;
> +                                       prev_w = tmp;
> +                               }
> +                       }
> +               }
> +       } else {
> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> +
> +               /* There's one tile, with dimensions equal to pic size. */
> +               p[0] = pic_width_in_ctbs;
> +               p[1] = pic_height_in_ctbs;
> +       }
> +
> +       if (no_chroma)
> +               vpu_debug(1, "%s: no chroma!\n", __func__);
> +}
> +
> +static void set_params(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       struct hantro_dev *vpu = ctx->dev;
> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> +       u32 pic_width_aligned, pic_height_aligned;
> +       u32 partial_ctb_x, partial_ctb_y;
> +
> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> +
> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> +
> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> +
> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> +
> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> +
> +       min_cb_size = 1 << min_log2_cb_size;
> +       max_ctb_size = 1 << max_log2_ctb_size;
> +
> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> +
> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> +
> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> +
> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> +                        sps->max_transform_hierarchy_depth_inter);
> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> +                        sps->max_transform_hierarchy_depth_intra);
> +       hantro_reg_write(vpu, hevc_min_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_max_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> +                        sps->log2_diff_max_min_luma_transform_block_size);
> +
> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> +       hantro_reg_write(vpu, hevc_asym_pred_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> +       hantro_reg_write(vpu, hevc_sao_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> +       hantro_reg_write(vpu, hevc_sign_data_hide,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> +       }
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> +       hantro_reg_write(vpu, hevc_transq_bypass,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> +       hantro_reg_write(vpu, hevc_list_mod_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_idr_pic_e,
> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> +       hantro_reg_write(vpu, hevc_parallel_merge,
> +                        pps->log2_parallel_merge_level_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> +       hantro_reg_write(vpu, hevc_pcm_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> +               hantro_reg_write(vpu, hevc_max_pcm_size,
> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_min_pcm_size,
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> +       } else {
> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> +       hantro_reg_write(vpu, hevc_weight_pred_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_const_intra_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> +       hantro_reg_write(vpu, hevc_transform_skip,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> +       hantro_reg_write(vpu, hevc_dependent_slice,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> +       hantro_reg_write(vpu, hevc_filter_override,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> +       hantro_reg_write(vpu, hevc_refidx0_active,
> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_refidx1_active,
> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> +}
> +
> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> +{
> +       int i;
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> +                       return i;
> +       }
> +
> +       return 0x0;
> +}
> +
> +static void set_ref_pic_list(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       const struct hantro_reg *ref_pic_regs0[] = {
> +               hevc_rlist_f0,
> +               hevc_rlist_f1,
> +               hevc_rlist_f2,
> +               hevc_rlist_f3,
> +               hevc_rlist_f4,
> +               hevc_rlist_f5,
> +               hevc_rlist_f6,
> +               hevc_rlist_f7,
> +               hevc_rlist_f8,
> +               hevc_rlist_f9,
> +               hevc_rlist_f10,
> +               hevc_rlist_f11,
> +               hevc_rlist_f12,
> +               hevc_rlist_f13,
> +               hevc_rlist_f14,
> +               hevc_rlist_f15,
> +       };
> +       const struct hantro_reg *ref_pic_regs1[] = {
> +               hevc_rlist_b0,
> +               hevc_rlist_b1,
> +               hevc_rlist_b2,
> +               hevc_rlist_b3,
> +               hevc_rlist_b4,
> +               hevc_rlist_b5,
> +               hevc_rlist_b6,
> +               hevc_rlist_b7,
> +               hevc_rlist_b8,
> +               hevc_rlist_b9,
> +               hevc_rlist_b10,
> +               hevc_rlist_b11,
> +               hevc_rlist_b12,
> +               hevc_rlist_b13,
> +               hevc_rlist_b14,
> +               hevc_rlist_b15,
> +       };
> +       unsigned int i, j;
> +
> +       /* List 0 contains: short term before, short term after and long term */
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       /* Fill the list, copying over and over */
> +       i = 0;
> +       while (j < ARRAY_SIZE(list0))
> +               list0[j++] = list0[i++];
> +
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       i = 0;
> +       while (j < ARRAY_SIZE(list1))
> +               list1[j++] = list1[i++];
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> +       }
> +}
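
For readers following the list construction quoted above, here is a minimal user-space sketch of the same idea: concatenate three groups of DPB indices in a given order, then pad the 16-entry list by repeating what was already written. The function name, the REF_LIST_LEN macro (standing in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX) and the zero initialisation are assumptions made for illustration; they are not part of the patch.

```c
#include <stddef.h>

#define REF_LIST_LEN 16	/* assumed stand-in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/*
 * Build one list from three source groups in the given order, then pad
 * it by repeating the entries already written.
 */
static void fill_ref_list(unsigned char list[REF_LIST_LEN],
			  const unsigned char *first, size_t n_first,
			  const unsigned char *second, size_t n_second,
			  const unsigned char *lt, size_t n_lt)
{
	size_t i, j = 0;

	/* Zero first so that an empty input yields a defined list. */
	for (i = 0; i < REF_LIST_LEN; i++)
		list[i] = 0;

	for (i = 0; i < n_first && j < REF_LIST_LEN; i++)
		list[j++] = first[i];
	for (i = 0; i < n_second && j < REF_LIST_LEN; i++)
		list[j++] = second[i];
	for (i = 0; i < n_lt && j < REF_LIST_LEN; i++)
		list[j++] = lt[i];

	/* Pad by cycling through the entries written so far. */
	i = 0;
	while (j < REF_LIST_LEN)
		list[j++] = list[i++];
}
```

List 0 would pass the "before" group first and the "after" group second; list 1 swaps the two. With indices {0, 1} before and {2} after, list 0 cycles 0, 1, 2 and list 1 cycles 2, 0, 1.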
> +
> +static int set_ref(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> +       struct hantro_dev *vpu = ctx->dev;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> +       u32 max_ref_frames;
> +       u16 dpb_longterm_e;
> +
> +       const struct hantro_reg *cur_poc[] = {
> +               hevc_cur_poc_00,
> +               hevc_cur_poc_01,
> +               hevc_cur_poc_02,
> +               hevc_cur_poc_03,
> +               hevc_cur_poc_04,
> +               hevc_cur_poc_05,
> +               hevc_cur_poc_06,
> +               hevc_cur_poc_07,
> +               hevc_cur_poc_08,
> +               hevc_cur_poc_09,
> +               hevc_cur_poc_10,
> +               hevc_cur_poc_11,
> +               hevc_cur_poc_12,
> +               hevc_cur_poc_13,
> +               hevc_cur_poc_14,
> +               hevc_cur_poc_15,
> +       };
> +       unsigned int i;
> +
> +       max_ref_frames = decode_params->num_poc_lt_curr +
> +               decode_params->num_poc_st_curr_before +
> +               decode_params->num_poc_st_curr_after;
> +       /*
> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> +        * badly marked I-frames.
> +        */
> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> +       hantro_reg_write(vpu, hevc_filter_over_slices,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> +
> +       /*
> +        * Write POC count diff from current pic. For frame decoding only compute
> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> +        */
> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> +
> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> +       }
> +
> +       if (i < ARRAY_SIZE(cur_poc)) {
> +               /*
> +                * After the references, fill one entry pointing to itself,
> +                * i.e. difference is zero.
> +                */
> +               hantro_reg_write(vpu, cur_poc[i], 0);
> +               i++;
> +       }
> +
> +       /* Fill the rest with the current picture */
> +       for (; i < ARRAY_SIZE(cur_poc); i++)
> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> +
> +       set_ref_pic_list(ctx);
> +
> +       /* We will only keep the references picture that are still used */
> +       ctx->hevc_dec.ref_bufs_used = 0;
> +
> +       /* Set up addresses of DPB buffers */
> +       dpb_longterm_e = 0;
> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> +               if (!luma_addr)
> +                       return -ENOMEM;
> +
> +               chroma_addr = luma_addr + cr_offset;
> +               mv_addr = luma_addr + mv_offset;
> +
> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> +
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> +       }
> +
> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> +       if (!luma_addr)
> +               return -ENOMEM;
> +
> +       chroma_addr = luma_addr + cr_offset;
> +       mv_addr = luma_addr + mv_offset;
> +
> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> +
> +       hantro_hevc_ref_remove_unused(ctx);
> +
> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> +
> +       return 0;
> +}
> +
> +static void set_buffers(struct hantro_ctx *ctx)
> +{
> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       dma_addr_t src_dma, dst_dma;
> +       u32 src_len, src_buf_len;
> +
> +       src_buf = hantro_get_src_buf(ctx);
> +       dst_buf = hantro_get_dst_buf(ctx);
> +
> +       /* Source (stream) buffer. */
> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> +
> +       /* Destination (decoded frame) buffer. */
> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> +
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> +}
> +
> +void hantro_g2_check_idle(struct hantro_dev *vpu)
> +{
> +       int i;
> +
> +       for (i = 0; i < 3; i++) {
> +               u32 status;
> +
> +               /* Make sure the VPU is idle */
> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);

How about we clean up this pr_warn: use either v4l2_warn or dev_warn, and make
the warning "device still running, aborting" (I personally dislike the abort
metaphor, but I guess it's OK here).
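
A rough user-space model of what the reworded warning could look like, for illustration only. struct fake_vpu and the fake_dev_warn() macro are hypothetical stand-ins for the driver's device structure and for the kernel's dev_warn(); the bit positions are copied from the register header in this patch.

```c
#include <stdint.h>
#include <stdio.h>

#define BIT(n)			(1u << (n))
#define INT_DEC_ABORT_E		BIT(5)
#define INT_DEC_IRQ_DIS		BIT(4)
#define INT_DEC_E		BIT(0)

/* Hypothetical stand-in for the device: a name and one status register. */
struct fake_vpu {
	const char *name;
	uint32_t interrupt_reg;
};

/* User-space stand-in for the kernel's dev_warn(dev, fmt, ...). */
#define fake_dev_warn(vpu, msg) \
	fprintf(stderr, "%s: %s", (vpu)->name, (msg))

/* Returns how many times the abort bits had to be written. */
static int check_idle(struct fake_vpu *vpu)
{
	int i, aborts = 0;

	for (i = 0; i < 3; i++) {
		uint32_t status = vpu->interrupt_reg;

		if (status & INT_DEC_E) {
			fake_dev_warn(vpu, "device still running, aborting\n");
			status |= INT_DEC_ABORT_E | INT_DEC_IRQ_DIS;
			vpu->interrupt_reg = status;
			aborts++;
		}
	}

	return aborts;
}
```

In this model the register is plain memory, so the enable bit never clears and a busy device warns on all three passes; real hardware is expected to behave differently after an abort.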

> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> +               }
> +       }
> +}
> +
> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       int ret;
> +
> +       hantro_g2_check_idle(vpu);
> +
> +       /* Prepare HEVC decoder context. */
> +       ret = hantro_hevc_dec_prepare_run(ctx);
> +       if (ret)
> +               return ret;
> +
> +       /* Configure hardware registers. */
> +       set_params(ctx);
> +
> +       /* set reference pictures */
> +       ret = set_ref(ctx);
> +       if (ret)
> +               return ret;
> +
> +       set_buffers(ctx);
> +       prepare_tile_info_buffer(ctx);
> +
> +       hantro_end_prepare_run(ctx);
> +
> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> +
> +       /* Don't disable output */
> +       hantro_reg_write(vpu, hevc_out_dis, 0);
> +
> +       /* Don't compress buffers */
> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> +
> +       /* use NV12 as output format */
> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> +
> +       /* Bus width and max burst */
> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> +       hantro_reg_write(vpu, hevc_max_burst, 16);
> +
> +       /* Swap */
> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> +
> +       /* Start decoding! */
> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> +
> +       return 0;
> +}
> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> new file mode 100644
> index 000000000000..a361c9ba911d
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> @@ -0,0 +1,198 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2021, Collabora
> + *
> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> + */
> +
> +#ifndef HANTRO_G2_REGS_H_
> +#define HANTRO_G2_REGS_H_
> +
> +#include "hantro.h"
> +
> +#define G2_SWREG(nr)   ((nr) * 4)
> +
> +#define HEVC_DEC_REG(name, base, shift, mask) \
> +       static const struct hantro_reg _hevc_##name[] = { \
> +               { G2_SWREG(base), (shift), (mask) } \
> +       }; \
> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
> +
> +#define HEVC_REG_VERSION               G2_SWREG(0)
> +
> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> +
> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> +
> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> +
> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> +
> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> +
> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> +
> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> +
> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> +
> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> +
> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> +
> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> +
> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> +
> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
> +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
> +
> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> +
> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> +
> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> +
> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
> +
> +#define HEVC_REG_CONFIG                                G2_SWREG(58)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> +
> +#define HEVC_ADDR_DST          (G2_SWREG(65))
> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> +#define HEVC_ADDR_STR          (G2_SWREG(169))
> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> +#define HEVC_TILE_SAO          (G2_SWREG(181))
> +#define HEVC_TILE_BSD          (G2_SWREG(183))
> +
> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
> +
> +#endif
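
As a reading aid for the field table quoted above: each HEVC_DEC_REG() entry carries a byte offset, a shift and a mask. The sketch below shows how such a triplet is typically applied as a read-modify-write on a 32-bit word. It assumes that is how the driver's register write helper consumes these fields; struct reg_field and field_update() are illustrative names, not driver API.

```c
#include <stdint.h>

/* Same three fields the HEVC_DEC_REG() macro initialises. */
struct reg_field {
	uint32_t base;	/* byte offset, i.e. G2_SWREG(nr) = nr * 4 */
	uint32_t shift;
	uint32_t mask;
};

/* Return 'word' with the field replaced by 'val'; other bits are kept. */
static uint32_t field_update(uint32_t word, const struct reg_field *f,
			     uint32_t val)
{
	word &= ~(f->mask << f->shift);
	word |= (val & f->mask) << f->shift;
	return word;
}
```

This is why rlist_f0 and rlist_b0 can share swreg 14: they occupy bits 0-4 and 5-9 respectively, and updating one leaves the other untouched. Values wider than the mask are truncated rather than rejected.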
> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> new file mode 100644
> index 000000000000..8e319a837ff3
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_hevc.c
> @@ -0,0 +1,321 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include <linux/types.h>
> +#include <media/v4l2-mem2mem.h>
> +
> +#include "hantro.h"
> +#include "hantro_hw.h"
> +
> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> +/*
> + * BSD control data of current picture at tile border
> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> + */
> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> +/* tile border coefficients of filter */
> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> +
> +#define MAX_TILE_COLS 20
> +#define MAX_TILE_ROWS 22
> +
> +#define UNUSED_REF     -1
> +
> +#define G2_ALIGN               16
> +#define MC_WORD_SIZE           32
> +
> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> +
> +       return sps->pic_width_in_luma_samples *
> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> +}
> +
> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +
> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> +}
> +
> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> +       size_t mv_size;
> +
> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> +                 (1 << (2 * (8 - 4))) * 16) + 32;
> +
> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> +
> +       return mv_size;
> +}
> +
> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +
> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> +}
> +
> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       struct hantro_dev *vpu = ctx->dev;
> +       int i;
> +
> +       /* Just tag buffer as unused, do not free them */

This comment seems wrong: the loop below does free the buffers, via dma_free_coherent().

> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs[i].cpu) {
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));

Is this memset clearing the buffer required? If we're getting artifacts
from previous decodes, then that would be more of a bug somewhere.

> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> +                                         hevc_dec->ref_bufs[i].cpu,
> +                                         hevc_dec->ref_bufs[i].dma);
> +               }
> +       }
> +}
> +
> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> +}
> +
> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> +                                  int poc)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       /* Find the reference buffer in already know ones */
> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       return hevc_dec->ref_bufs[i].dma;
> +               }
> +       }
> +
> +       /* Allocate a new reference buffer */
> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> +                       if (!hevc_dec->ref_bufs[i].cpu) {
> +                               struct hantro_dev *vpu = ctx->dev;
> +
> +                               hevc_dec->ref_bufs[i].cpu =
> +                                       dma_alloc_coherent(vpu->dev,
> +                                                          hantro_hevc_ref_size(ctx),
> +                                                          &hevc_dec->ref_bufs[i].dma,
> +                                                          GFP_KERNEL);

Is there any reason why we need to allocate reference buffers and MV contiguously?
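
To make the question concrete, here is a small user-space sketch of the layout of that single allocation, using the same arithmetic as hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and hantro_hevc_mv_size() in this patch. struct ref_layout, ref_buf_layout() and ALIGN_UP() are names invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN	16
#define MC_WORD_SIZE	32
#define ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))

struct ref_layout {
	size_t chroma_offset;	/* equals the luma plane size */
	size_t mv_offset;	/* where the motion vectors start */
	size_t mv_size;
	size_t total;		/* size of the single allocation */
};

/* bytes_per_pixel is 1 for 8-bit luma and 2 otherwise, as in the patch. */
static struct ref_layout ref_buf_layout(uint32_t width, uint32_t height,
					unsigned int bytes_per_pixel)
{
	struct ref_layout l;
	uint32_t w_units = (width + (1 << 8) - 1) >> 8;
	uint32_t h_units = (height + (1 << 8) - 1) >> 8;

	l.chroma_offset = (size_t)width * height * bytes_per_pixel;
	l.mv_offset = ALIGN_UP(l.chroma_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
	l.mv_size = (size_t)w_units * h_units * (1 << (2 * (8 - 4))) * 16 + 32;
	l.total = l.mv_offset + l.mv_size;
	return l;
}
```

For an 8-bit 1920x1080 stream this gives a motion vector offset of 3110432 bytes and a motion vector area of 163872 bytes, so the MV data is about 5% of the 3274304-byte buffer.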

> +                               if (!hevc_dec->ref_bufs[i].cpu)
> +                                       return 0;
> +
> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> +                       }
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));


I believe the coherent allocation is there to be able to clear each reference, but is this
really needed? I recall that maybe only the MV buffer needs clearing; maybe you can try that?

Also, if that's the case, then allocating the MV buffer separately would allow
the reference buffers to be allocated non-coherently (note that we use NO_MAPPING
in the vb2_queue, so the vb2_buffers shouldn't be coherent).

Thanks,
Ezequiel



* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 18:46     ` Ezequiel Garcia
  0 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-16 18:46 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

Hi Benjamin,

The series is looking really good. Some comments below.

On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> Implement all the logic to get G2 hardware decoding HEVC frames.
> It support up level 5.1 HEVC stream.
> It doesn't support yet 10 bits formats or scaling feature.
> 
> Add HANTRO HEVC dedicated control to skip some bits at the beginning
> of the slice header. That is very specific to this hardware so can't
> go into uapi structures. Compute the needed value is complex and require
> information from the stream that only the userland knows so let it
> provide the correct value to the driver.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 4:
> - fix Ezequiel comments
> - use dedicated control as an integer
> - change hantro_g2_hevc_dec_run prototype to return errors
> 
> version 2:
> - squash multiple commits in this one.
> - fix the comments done by Ezequiel about dma_alloc_coherent usage
> - fix Dan's comments about control copy, reverse the test logic
> in tile_buffer_reallocate, rework some goto and return cases.
> 
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  18 +
>  drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>  .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>  drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>  7 files changed, 1228 insertions(+)
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>  create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> 
> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> index 743ce08eb184..0357f1772267 100644
> --- a/drivers/staging/media/hantro/Makefile
> +++ b/drivers/staging/media/hantro/Makefile
> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>                 hantro_h1_jpeg_enc.o \
>                 hantro_g1_h264_dec.o \
>                 hantro_g1_mpeg2_dec.o \
> +               hantro_g2_hevc_dec.o \
>                 hantro_g1_vp8_dec.o \
>                 rk3399_vpu_hw_jpeg_enc.o \
>                 rk3399_vpu_hw_mpeg2_dec.o \
>                 rk3399_vpu_hw_vp8_dec.o \
>                 hantro_jpeg.o \
>                 hantro_h264.o \
> +               hantro_hevc.o \
>                 hantro_mpeg2.o \
>                 hantro_vp8.o
>  
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index 05876e426419..a9b80b2c9124 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -225,6 +225,7 @@ struct hantro_dev {
>   * @jpeg_enc:          JPEG-encoding context.
>   * @mpeg2_dec:         MPEG-2-decoding context.
>   * @vp8_dec:           VP8-decoding context.
> + * @hevc_dec:          HEVC-decoding context.
>   */
>  struct hantro_ctx {
>         struct hantro_dev *dev;
> @@ -251,6 +252,7 @@ struct hantro_ctx {
>                 struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>                 struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>                 struct hantro_vp8_dec_hw_ctx vp8_dec;
> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>         };
>  };
>  
> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>         return vb2_dma_contig_plane_dma_addr(vb, 0);
>  }
>  
> +static inline size_t
> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].size;
> +       return vb2_plane_size(vb, 0);
> +}
> +
> +static inline void *
> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> +{
> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> +               return ctx->postproc.dec_q[vb->index].cpu;
> +       return vb2_plane_vaddr(vb, 0);
> +}
> +

Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

>  void hantro_postproc_disable(struct hantro_ctx *ctx);
>  void hantro_postproc_enable(struct hantro_ctx *ctx);
>  void hantro_postproc_free(struct hantro_ctx *ctx);
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index e3e6df28f470..bc90a52f4d3d 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -30,6 +30,13 @@
>  
>  #define DRIVER_NAME "hantro-vpu"
>  
> +/*
> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> + * the number of data (in bits) to skip in the
> + * slice segment header syntax after 'slice type' token
> + */

I think we need to document this better, so applications can
correctly use the control. From i.MX reference code, it seems
this needs to be used as follows:

If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
to before syntax element "slice_temporal_mvp_enabled_flag".

If IDR, the skipped bits are just "pic_output_flag"
(separate_colour_plane_flag is not supported).

And it seems this value only needs to be computed by parsing the first slice,
given that this syntax remains invariant across all the slices.
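
A sketch of how an application might derive the value under the reading above. It is an assumption-laden illustration, not taken from the i.MX reference code: struct slice_hdr_bits and its fields are hypothetical outputs of a user-space slice header parser, holding bit positions within the slice segment header.

```c
#include <stdbool.h>

/*
 * Hypothetical parser output: bit positions inside the slice segment
 * header, as recorded by a user-space bitstream parser.
 */
struct slice_hdr_bits {
	bool is_idr;
	bool output_flag_present;	  /* PPS output_flag_present_flag */
	unsigned int pic_output_flag_pos; /* first bit after slice_type */
	unsigned int temporal_mvp_pos;	  /* where slice_temporal_mvp_enabled_flag
					   * is, or would be */
};

/* Number of bits to hand to V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP. */
static unsigned int slice_header_skip_bits(const struct slice_hdr_bits *h)
{
	/* IDR: only pic_output_flag, which is one bit when present. */
	if (h->is_idr)
		return h->output_flag_present ? 1 : 0;

	/* Non-IDR: everything from pic_output_flag up to the MVP flag. */
	return h->temporal_mvp_pos - h->pic_output_flag_pos;
}
```

The example assumes separate_colour_plane_flag is 0, as noted above, and that the parser records temporal_mvp_pos even when the flag itself is absent from the stream.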

> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> +
>  int hantro_debug;
>  module_param_named(debug, hantro_debug, int, 0644);
>  MODULE_PARM_DESC(debug,
> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>         return 0;
>  }
>  
> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> +{
> +       struct hantro_ctx *ctx;
> +
> +       ctx = container_of(ctrl->handler,
> +                          struct hantro_ctx, ctrl_handler);
> +
> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> +
> +       switch (ctrl->id) {
> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
>  static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>         .try_ctrl = hantro_try_ctrl,
>  };
> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>         .s_ctrl = hantro_jpeg_s_ctrl,
>  };
>  
> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> +       .s_ctrl = hantro_hevc_s_ctrl,
> +};
> +
>  static const struct hantro_ctrl controls[] = {
>         {
>                 .codec = HANTRO_JPEG_ENCODER,
> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>                 .cfg = {
>                         .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>                 },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> +                       .name = "Hantro HEVC slice header skip bytes",
> +                       .type = V4L2_CTRL_TYPE_INTEGER,
> +                       .min = 0,
> +                       .def = 0,
> +                       .max = 0x7fffffff,
> +                       .step = 1,
> +                       .ops = &hantro_hevc_ctrl_ops,
> +               },
> +       }, {
> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> +                        HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_USER_CLASS,

This shouldn't be here. Is this V4L2_CID_USER_CLASS required by v4l2-compliance
or by the spec?

> +                       .name = "HANTRO controls",
> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> +               },
>         },
>  };
>  
> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> new file mode 100644
> index 000000000000..5d75b36bc40c
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> @@ -0,0 +1,587 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include "hantro_hw.h"
> +#include "hantro_g2_regs.h"
> +
> +#define HEVC_DEC_MODE  0xC
> +
> +#define BUS_WIDTH_32           0
> +#define BUS_WIDTH_64           1
> +#define BUS_WIDTH_128          2
> +#define BUS_WIDTH_256          3
> +
> +static inline void hantro_write_addr(struct hantro_dev *vpu,
> +                                    unsigned long offset,
> +                                    dma_addr_t addr)
> +{
> +       vdpu_write(vpu, addr & 0xffffffff, offset);
> +}
> +
> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> +       unsigned int max_log2_ctb_size, ctb_size;
> +       bool tiles_enabled, uniform_spacing;
> +       u32 no_chroma = 0;
> +
> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> +
> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> +
> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> +                           sps->log2_diff_max_min_luma_coding_block_size;
> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> +                            >> max_log2_ctb_size;
> +       ctb_size = 1 << max_log2_ctb_size;
> +
> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> +
> +       if (tiles_enabled) {
> +               unsigned int i, j, h;
> +
> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> +
> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> +
> +               /* write width + height for each tile in pic */
> +               if (!uniform_spacing) {
> +                       u32 tmp_w = 0, tmp_h = 0;
> +
> +                       for (i = 0; i < num_tile_rows; i++) {
> +                               if (i == num_tile_rows - 1)
> +                                       h = pic_height_in_ctbs - tmp_h;
> +                               else
> +                                       h = pps->row_height_minus1[i] + 1;
> +                               tmp_h += h;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> +                                       tmp_w += pps->column_width_minus1[j] + 1;
> +                                       *p++ = pps->column_width_minus1[j + 1];
> +                                       *p++ = h;
> +                                       if (i == 0 && h == 1 && ctb_size == 16)
> +                                               no_chroma = 1;
> +                               }
> +                               /* last column */
> +                               *p++ = pic_width_in_ctbs - tmp_w;
> +                               *p++ = h;
> +                       }
> +               } else { /* uniform spacing */
> +                       u32 tmp, prev_h, prev_w;
> +
> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> +                               h = tmp - prev_h;
> +                               prev_h = tmp;
> +                               if (i == 0 && h == 1 && ctb_size == 16)
> +                                       no_chroma = 1;
> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> +                                       *p++ = tmp - prev_w;
> +                                       *p++ = h;
> +                                       if (j == 0 &&
> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> +                                           ctb_size == 16)
> +                                               no_chroma = 1;
> +                                       prev_w = tmp;
> +                               }
> +                       }
> +               }
> +       } else {
> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> +
> +               /* There's one tile, with dimensions equal to pic size. */
> +               p[0] = pic_width_in_ctbs;
> +               p[1] = pic_height_in_ctbs;
> +       }
> +
> +       if (no_chroma)
> +               vpu_debug(1, "%s: no chroma!\n", __func__);
> +}
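
For reference while reading the uniform-spacing branch, here is a small
userspace sketch of how the tile sizes come out. The helper name
uniform_tile_sizes() is hypothetical and is not part of the driver; it only
mirrors the "(i + 1) * pic / num" boundary arithmetic quoted above.

```c
#include <stdint.h>

/*
 * Userspace model of the uniform-spacing arithmetic: split pic_ctbs CTBs
 * into num_tiles tiles. Hypothetical helper, not a driver function.
 */
void uniform_tile_sizes(unsigned int pic_ctbs, unsigned int num_tiles,
			uint16_t *sizes)
{
	unsigned int i, prev = 0;

	for (i = 0; i < num_tiles; i++) {
		/* End boundary of tile i, rounded down as in the patch. */
		unsigned int end = (i + 1) * pic_ctbs / num_tiles;

		sizes[i] = (uint16_t)(end - prev);
		prev = end;
	}
}
```

With 10 CTBs and 3 tiles the boundaries land at 3, 6 and 10, so the sizes
are 3, 3 and 4: the remainder ends up in the last tiles.
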
> +
> +static void set_params(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       struct hantro_dev *vpu = ctx->dev;
> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> +       u32 pic_width_aligned, pic_height_aligned;
> +       u32 partial_ctb_x, partial_ctb_y;
> +
> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> +
> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> +
> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> +
> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> +
> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> +
> +       min_cb_size = 1 << min_log2_cb_size;
> +       max_ctb_size = 1 << max_log2_ctb_size;
> +
> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> +
> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> +
> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> +
> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> +
> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> +                        sps->max_transform_hierarchy_depth_inter);
> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> +                        sps->max_transform_hierarchy_depth_intra);
> +       hantro_reg_write(vpu, hevc_min_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_max_trb_size,
> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> +                        sps->log2_diff_max_min_luma_transform_block_size);
> +
> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> +       hantro_reg_write(vpu, hevc_asym_pred_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> +       hantro_reg_write(vpu, hevc_sao_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> +       hantro_reg_write(vpu, hevc_sign_data_hide,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> +       }
> +
> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> +       } else {
> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> +       hantro_reg_write(vpu, hevc_transq_bypass,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> +       hantro_reg_write(vpu, hevc_list_mod_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_idr_pic_e,
> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> +       hantro_reg_write(vpu, hevc_parallel_merge,
> +                        pps->log2_parallel_merge_level_minus2 + 2);
> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> +       hantro_reg_write(vpu, hevc_pcm_e,
> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> +               hantro_reg_write(vpu, hevc_max_pcm_size,
> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_min_pcm_size,
> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> +       } else {
> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> +       hantro_reg_write(vpu, hevc_weight_pred_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> +       hantro_reg_write(vpu, hevc_cabac_init_present,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> +       hantro_reg_write(vpu, hevc_const_intra_e,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> +       hantro_reg_write(vpu, hevc_transform_skip,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> +       hantro_reg_write(vpu, hevc_dependent_slice,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> +       hantro_reg_write(vpu, hevc_filter_override,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> +       hantro_reg_write(vpu, hevc_refidx0_active,
> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_refidx1_active,
> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> +}
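
To make the CTB arithmetic in set_params() easier to check by hand, here is
a userspace sketch under stated assumptions: hevc_ctb_geometry() and its
struct are hypothetical names, and the inputs stand in for the SPS fields
used above.

```c
#include <stdint.h>

struct ctb_geometry {
	uint32_t ctb_size;       /* 1 << max_log2_ctb_size */
	uint32_t width_in_ctbs;  /* picture width, rounded up to whole CTBs */
	uint32_t height_in_ctbs; /* picture height, rounded up to whole CTBs */
	int partial_x;           /* last CTB column is only partly covered */
	int partial_y;           /* last CTB row is only partly covered */
};

/* Hypothetical helper mirroring the size derivation in the patch. */
struct ctb_geometry hevc_ctb_geometry(uint32_t width, uint32_t height,
				      uint32_t log2_min_cb_minus3,
				      uint32_t log2_diff_max_min)
{
	struct ctb_geometry g;
	uint32_t log2_ctb = log2_min_cb_minus3 + 3 + log2_diff_max_min;

	g.ctb_size = 1u << log2_ctb;
	g.width_in_ctbs = (width + g.ctb_size - 1) >> log2_ctb;
	g.height_in_ctbs = (height + g.ctb_size - 1) >> log2_ctb;
	/* Equivalent to comparing against ALIGN(size, ctb_size). */
	g.partial_x = (width % g.ctb_size) != 0;
	g.partial_y = (height % g.ctb_size) != 0;
	return g;
}
```

For a 1920x1080 stream with 64x64 CTBs this gives 30x17 CTBs, with only the
vertical partial flag set, since 1080 is not a multiple of 64.
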
> +
> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> +{
> +       int i;
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> +                       return i;
> +       }
> +
> +       return 0x0;
> +}
> +
> +static void set_ref_pic_list(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> +       const struct hantro_reg *ref_pic_regs0[] = {
> +               hevc_rlist_f0,
> +               hevc_rlist_f1,
> +               hevc_rlist_f2,
> +               hevc_rlist_f3,
> +               hevc_rlist_f4,
> +               hevc_rlist_f5,
> +               hevc_rlist_f6,
> +               hevc_rlist_f7,
> +               hevc_rlist_f8,
> +               hevc_rlist_f9,
> +               hevc_rlist_f10,
> +               hevc_rlist_f11,
> +               hevc_rlist_f12,
> +               hevc_rlist_f13,
> +               hevc_rlist_f14,
> +               hevc_rlist_f15,
> +       };
> +       const struct hantro_reg *ref_pic_regs1[] = {
> +               hevc_rlist_b0,
> +               hevc_rlist_b1,
> +               hevc_rlist_b2,
> +               hevc_rlist_b3,
> +               hevc_rlist_b4,
> +               hevc_rlist_b5,
> +               hevc_rlist_b6,
> +               hevc_rlist_b7,
> +               hevc_rlist_b8,
> +               hevc_rlist_b9,
> +               hevc_rlist_b10,
> +               hevc_rlist_b11,
> +               hevc_rlist_b12,
> +               hevc_rlist_b13,
> +               hevc_rlist_b14,
> +               hevc_rlist_b15,
> +       };
> +       unsigned int i, j;
> +
> +       /* List 0 contains: short term before, short term after and long term */
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       /* Fill the list, copying over and over */
> +       i = 0;
> +       while (j < ARRAY_SIZE(list0))
> +               list0[j++] = list0[i++];
> +
> +       j = 0;
> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> +
> +       i = 0;
> +       while (j < ARRAY_SIZE(list1))
> +               list1[j++] = list1[i++];
> +
> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> +       }
> +}
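
The "copying over and over" step is easy to misread, so here is a minimal
userspace sketch of it. fill_ref_list() and DPB_MAX are hypothetical names;
DPB_MAX is assumed to match V4L2_HEVC_DPB_ENTRIES_NUM_MAX (16 registers per
list in the tables above).

```c
#include <stddef.h>
#include <stdint.h>

#define DPB_MAX 16 /* assumed equal to V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/*
 * Concatenate up to three groups of DPB indices, then repeat the
 * gathered entries until all DPB_MAX slots are filled.
 */
void fill_ref_list(uint32_t list[DPB_MAX],
		   const uint32_t *first, size_t n_first,
		   const uint32_t *second, size_t n_second,
		   const uint32_t *lterm, size_t n_lterm)
{
	size_t i, j = 0;

	for (i = 0; i < DPB_MAX; i++)
		list[i] = 0;

	for (i = 0; i < n_first && j < DPB_MAX; i++)
		list[j++] = first[i];
	for (i = 0; i < n_second && j < DPB_MAX; i++)
		list[j++] = second[i];
	for (i = 0; i < n_lterm && j < DPB_MAX; i++)
		list[j++] = lterm[i];

	/* Wrap around: slot j takes the value of slot j - (entries so far). */
	i = 0;
	while (j < DPB_MAX)
		list[j++] = list[i++];
}
```

With one "before" entry {2} and two "after" entries {5, 7}, list 0 becomes
2, 5, 7, 2, 5, 7, ... and list 1 is built the same way with the first two
groups swapped.
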
> +
> +static int set_ref(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> +       struct hantro_dev *vpu = ctx->dev;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> +       u32 max_ref_frames;
> +       u16 dpb_longterm_e;
> +
> +       const struct hantro_reg *cur_poc[] = {
> +               hevc_cur_poc_00,
> +               hevc_cur_poc_01,
> +               hevc_cur_poc_02,
> +               hevc_cur_poc_03,
> +               hevc_cur_poc_04,
> +               hevc_cur_poc_05,
> +               hevc_cur_poc_06,
> +               hevc_cur_poc_07,
> +               hevc_cur_poc_08,
> +               hevc_cur_poc_09,
> +               hevc_cur_poc_10,
> +               hevc_cur_poc_11,
> +               hevc_cur_poc_12,
> +               hevc_cur_poc_13,
> +               hevc_cur_poc_14,
> +               hevc_cur_poc_15,
> +       };
> +       unsigned int i;
> +
> +       max_ref_frames = decode_params->num_poc_lt_curr +
> +               decode_params->num_poc_st_curr_before +
> +               decode_params->num_poc_st_curr_after;
> +       /*
> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> +        * badly marked I-frames.
> +        */
> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> +       hantro_reg_write(vpu, hevc_filter_over_slices,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> +
> +       /*
> +        * Write POC count diff from current pic. For frame decoding only compute
> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> +        */
> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> +
> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> +       }
> +
> +       if (i < ARRAY_SIZE(cur_poc)) {
> +               /*
> +                * After the references, fill one entry pointing to itself,
> +                * i.e. difference is zero.
> +                */
> +               hantro_reg_write(vpu, cur_poc[i], 0);
> +               i++;
> +       }
> +
> +       /* Fill the rest with the current picture */
> +       for (; i < ARRAY_SIZE(cur_poc); i++)
> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> +
> +       set_ref_pic_list(ctx);
> +
> +       /* We will only keep the references picture that are still used */
> +       ctx->hevc_dec.ref_bufs_used = 0;
> +
> +       /* Set up addresses of DPB buffers */
> +       dpb_longterm_e = 0;
> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> +               if (!luma_addr)
> +                       return -ENOMEM;
> +
> +               chroma_addr = luma_addr + cr_offset;
> +               mv_addr = luma_addr + mv_offset;
> +
> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> +
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> +       }
> +
> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> +       if (!luma_addr)
> +               return -ENOMEM;
> +
> +       chroma_addr = luma_addr + cr_offset;
> +       mv_addr = luma_addr + mv_offset;
> +
> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> +
> +       hantro_hevc_ref_remove_unused(ctx);
> +
> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> +       }
> +
> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> +
> +       return 0;
> +}
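
A note on the long-term mask built in set_ref(): the bit order is reversed
relative to the DPB index. A small userspace sketch, with hypothetical names
(longterm_mask(), DPB_MAX) and DPB_MAX assumed to be 16 as in the register
tables:

```c
#include <stdint.h>

#define DPB_MAX 16 /* assumed equal to V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/*
 * Build the 16-bit long-term reference mask: DPB entry i maps to
 * bit (DPB_MAX - 1 - i), so entry 0 is the most significant bit.
 */
uint16_t longterm_mask(const int *is_longterm, unsigned int n)
{
	uint16_t mask = 0;
	unsigned int i;

	/* The patch stops one short of DPB_MAX; this model does the same. */
	for (i = 0; i < n && i < DPB_MAX - 1; i++)
		if (is_longterm[i])
			mask |= (uint16_t)(1u << (DPB_MAX - 1 - i));

	return mask;
}
```

So long-term entries at DPB indices 1 and 3 give 0x5000, and a single
long-term entry at index 0 gives 0x8000.
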
> +
> +static void set_buffers(struct hantro_ctx *ctx)
> +{
> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       dma_addr_t src_dma, dst_dma;
> +       u32 src_len, src_buf_len;
> +
> +       src_buf = hantro_get_src_buf(ctx);
> +       dst_buf = hantro_get_dst_buf(ctx);
> +
> +       /* Source (stream) buffer. */
> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> +
> +       /* Destination (decoded frame) buffer. */
> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> +
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> +}
> +
> +void hantro_g2_check_idle(struct hantro_dev *vpu)
> +{
> +       int i;
> +
> +       for (i = 0; i < 3; i++) {
> +               u32 status;
> +
> +               /* Make sure the VPU is idle */
> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);

How about we clean up this pr_warn? Use either v4l2_warn or dev_warn, and make
the warning "device still running, aborting" (I personally dislike the abort
metaphor, but I guess it's OK here).
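
Roughly what I have in mind, as a userspace model rather than a drop-in
replacement: the bit values are copied from hantro_g2_regs.h in this patch,
check_idle() is a hypothetical name, and fprintf() stands in for the kernel
call. In the driver I would expect something like
dev_warn(vpu->dev, "device still running, aborting\n"), assuming
struct hantro_dev exposes its struct device as vpu->dev.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions as defined for HEVC_REG_INTERRUPT in this patch. */
#define DEC_E		(1u << 0)
#define DEC_IRQ_DIS	(1u << 4)
#define DEC_ABORT_E	(1u << 5)

/*
 * Model of one idle check on the interrupt register.
 * Returns 1 if the decoder was still enabled and an abort was requested.
 */
int check_idle(uint32_t *irq_reg)
{
	if (!(*irq_reg & DEC_E))
		return 0;

	/* Stand-in for dev_warn() or v4l2_warn() in the driver. */
	fprintf(stderr, "device still running, aborting\n");
	*irq_reg |= DEC_ABORT_E | DEC_IRQ_DIS;
	return 1;
}
```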

> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> +               }
> +       }
> +}
> +
> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +       int ret;
> +
> +       hantro_g2_check_idle(vpu);
> +
> +       /* Prepare HEVC decoder context. */
> +       ret = hantro_hevc_dec_prepare_run(ctx);
> +       if (ret)
> +               return ret;
> +
> +       /* Configure hardware registers. */
> +       set_params(ctx);
> +
> +       /* set reference pictures */
> +       ret = set_ref(ctx);
> +       if (ret)
> +               return ret;
> +
> +       set_buffers(ctx);
> +       prepare_tile_info_buffer(ctx);
> +
> +       hantro_end_prepare_run(ctx);
> +
> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> +
> +       /* Don't disable output */
> +       hantro_reg_write(vpu, hevc_out_dis, 0);
> +
> +       /* Don't compress buffers */
> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> +
> +       /* use NV12 as output format */
> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> +
> +       /* Bus width and max burst */
> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> +       hantro_reg_write(vpu, hevc_max_burst, 16);
> +
> +       /* Swap */
> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> +
> +       /* Start decoding! */
> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> +
> +       return 0;
> +}
> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> new file mode 100644
> index 000000000000..a361c9ba911d
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> @@ -0,0 +1,198 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2021, Collabora
> + *
> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> + */
> +
> +#ifndef HANTRO_G2_REGS_H_
> +#define HANTRO_G2_REGS_H_
> +
> +#include "hantro.h"
> +
> +#define G2_SWREG(nr)   ((nr) * 4)
> +
> +#define HEVC_DEC_REG(name, base, shift, mask) \
> +       static const struct hantro_reg _hevc_##name[] = { \
> +               { G2_SWREG(base), (shift), (mask) } \
> +       }; \
> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
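
For readers following the (base, shift, mask) triplets below, here is a
userspace sketch of a masked field update. reg_field_set() is a hypothetical
helper, and I am assuming hantro_reg_write() does a read-modify-write along
these lines. One consequence worth keeping in mind: any value wider than the
mask is silently truncated, e.g. hdr_skip_length is declared below with a
0x3fff mask while the control accepts values up to 0x7fffffff.

```c
#include <stdint.h>

/* Same layout as the triplets passed to HEVC_DEC_REG(). */
struct reg_field {
	uint32_t base;  /* byte offset of the 32-bit register */
	uint32_t shift; /* position of the field's lowest bit */
	uint32_t mask;  /* field mask, before shifting */
};

/* Return word with the field replaced by val, truncated to the mask. */
uint32_t reg_field_set(uint32_t word, const struct reg_field *f, uint32_t val)
{
	word &= ~(f->mask << f->shift);
	word |= (val & f->mask) << f->shift;
	return word;
}
```

Writing 0x4001 to a 14-bit field at shift 0 stores 0x0001, with no error
reported to the caller.
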
> +
> +#define HEVC_REG_VERSION               G2_SWREG(0)
> +
> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> +
> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> +
> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> +
> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> +
> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> +
> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> +
> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> +
> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> +
> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> +
> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> +
> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> +
> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> +
> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f10,        17, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f11,        17, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b10,        17, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b11,        17, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f12,        18, 0,  0x1f)
> +HEVC_DEC_REG(rlist_f13,        18, 10, 0x1f)
> +HEVC_DEC_REG(rlist_f14,        18, 20, 0x1f)
> +HEVC_DEC_REG(rlist_b12,        18, 5,  0x1f)
> +HEVC_DEC_REG(rlist_b13,        18, 15, 0x1f)
> +HEVC_DEC_REG(rlist_b14,        18, 25, 0x1f)
> +
> +HEVC_DEC_REG(rlist_f15,        19, 0,  0x1f)
> +HEVC_DEC_REG(rlist_b15,        19, 5,  0x1f)
> +
> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> +
> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> +
> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> +
> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> +
> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> +HEVC_DEC_REG(max_burst,        58, 0,  0xff)
> +
> +#define HEVC_REG_CONFIG                        G2_SWREG(58)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> +
> +#define HEVC_ADDR_DST          (G2_SWREG(65))
> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> +#define HEVC_ADDR_STR          (G2_SWREG(169))
> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> +#define HEVC_TILE_SAO          (G2_SWREG(181))
> +#define HEVC_TILE_BSD          (G2_SWREG(183))
> +
> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> +HEVC_DEC_REG(strm_start_offset, 259, 0, 0xffffffff)
> +
> +#endif
> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> new file mode 100644
> index 000000000000..8e319a837ff3
> --- /dev/null
> +++ b/drivers/staging/media/hantro/hantro_hevc.c
> @@ -0,0 +1,321 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Hantro VPU HEVC codec driver
> + *
> + * Copyright (C) 2020 Safran Passenger Innovations LLC
> + */
> +
> +#include <linux/types.h>
> +#include <media/v4l2-mem2mem.h>
> +
> +#include "hantro.h"
> +#include "hantro_hw.h"
> +
> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> +/*
> + * BSD control data of current picture at tile border
> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> + */
> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> +/* tile border coefficients of filter */
> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> +
> +#define MAX_TILE_COLS 20
> +#define MAX_TILE_ROWS 22
> +
> +#define UNUSED_REF     -1
> +
> +#define G2_ALIGN               16
> +#define MC_WORD_SIZE           32
> +
> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> +
> +       return sps->pic_width_in_luma_samples *
> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> +}
> +
> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +
> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> +}
> +
> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> +{
> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> +       size_t mv_size;
> +
> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> +                 (1 << (2 * (8 - 4))) * 16) + 32;
> +
> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> +
> +       return mv_size;
> +}
> +
> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> +{
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +
> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> +}
> +
> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       struct hantro_dev *vpu = ctx->dev;
> +       int i;
> +
> +       /* Just tag buffer as unused, do not free them */

This comment seems wrong: the loop below does free the buffers.

> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs[i].cpu) {
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));

Is this memset clearing the buffer required? If we're getting artifacts
from previous decodes, then that would be more of a bug somewhere.

> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> +                                         hevc_dec->ref_bufs[i].cpu,
> +                                         hevc_dec->ref_bufs[i].dma);
> +               }
> +       }
> +}
> +
> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> +}
> +
> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> +                                  int poc)
> +{
> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> +       int i;
> +
> +       /* Find the reference buffer in already know ones */
> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       return hevc_dec->ref_bufs[i].dma;
> +               }
> +       }
> +
> +       /* Allocate a new reference buffer */
> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> +                       if (!hevc_dec->ref_bufs[i].cpu) {
> +                               struct hantro_dev *vpu = ctx->dev;
> +
> +                               hevc_dec->ref_bufs[i].cpu =
> +                                       dma_alloc_coherent(vpu->dev,
> +                                                          hantro_hevc_ref_size(ctx),
> +                                                          &hevc_dec->ref_bufs[i].dma,
> +                                                          GFP_KERNEL);

Is there any reason why we need to allocate reference buffers and MV contiguously?

> +                               if (!hevc_dec->ref_bufs[i].cpu)
> +                                       return 0;
> +
> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> +                       }
> +                       hevc_dec->ref_bufs_used |= 1 << i;
> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));


I believe the coherent allocation is there so that each reference can be cleared, but is this
really needed? I recall that maybe only the MV buffer needs clearing; maybe you can try that?

Also, if that's the case, then allocating the MV buffer separately would make it possible
to avoid allocating the reference buffers coherently (note that we use NO_MAPPING
in the vb2_queue, so the vb2_buffers shouldn't be coherent).
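
To make the sizes under discussion concrete, here is a minimal userspace sketch that mirrors the arithmetic of hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and hantro_hevc_mv_size() from the patch. The names ref_layout, ref_layout_compute and ALIGN_UP are made up for illustration (ALIGN_UP stands in for the kernel's ALIGN()); nothing here is driver code.

```c
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN     16
#define MC_WORD_SIZE 32
/* Userspace stand-in for the kernel's ALIGN(); an assumption of this sketch. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/* Hypothetical container for the offsets inside one reference buffer. */
struct ref_layout {
	size_t chroma_offset; /* end of the luma plane */
	size_t mv_offset;     /* start of the motion vector area */
	size_t mv_size;       /* size of the motion vector area */
	size_t total_size;    /* what the patch allocates per reference */
};

static struct ref_layout ref_layout_compute(uint32_t width, uint32_t height,
					    uint32_t bit_depth_luma_minus8)
{
	struct ref_layout l;
	size_t bytes_per_pixel = bit_depth_luma_minus8 == 0 ? 1 : 2;
	/* Same rounding as the patch: round up to units of 1 << 8 samples. */
	uint32_t w_units = (width + (1u << 8) - 1) >> 8;
	uint32_t h_units = (height + (1u << 8) - 1) >> 8;

	l.chroma_offset = (size_t)width * height * bytes_per_pixel;
	l.mv_offset = ALIGN_UP(l.chroma_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
	l.mv_size = (size_t)w_units * h_units * (1u << (2 * (8 - 4))) * 16 + 32;
	l.total_size = l.mv_offset + l.mv_size;
	return l;
}
```

For a 1920x1080 8-bit stream, ref_layout_compute(1920, 1080, 0) gives a chroma offset of 2073600, an MV offset of 3110432 and an MV size of 163872 bytes. If only the MV area needs clearing, a separate allocation would only have to cover that last part.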

Thanks,
Ezequiel



* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
  2021-03-16 18:46     ` Ezequiel Garcia
  (?)
@ 2021-03-16 20:19       ` Benjamin Gaignard
  -1 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-16 20:19 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 16/03/2021 at 19:46, Ezequiel Garcia wrote:
> Hi Benjamin,
>
> The series is looking really good. Some comments below.
>
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Implement all the logic to get G2 hardware decoding HEVC frames.
>> It supports HEVC streams up to level 5.1.
>> It doesn't yet support 10-bit formats or the scaling feature.
>>
>> Add a HANTRO HEVC dedicated control to skip some bits at the beginning
>> of the slice header. That is very specific to this hardware so it can't
>> go into the uapi structures. Computing the needed value is complex and requires
>> information from the stream that only userland knows, so let it
>> provide the correct value to the driver.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 4:
>> - fix Ezequiel comments
>> - use dedicated control as an integer
>> - change hantro_g2_hevc_dec_run prototype to return errors
>>
>> version 2:
>> - squash multiple commits in this one.
>> - fix the comments done by Ezequiel about dma_alloc_coherent usage
>> - fix Dan's comments about control copy, reverse the test logic
>> in tile_buffer_reallocate, rework some goto and return cases.
>>
>>   drivers/staging/media/hantro/Makefile         |   2 +
>>   drivers/staging/media/hantro/hantro.h         |  18 +
>>   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>>   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>>   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>>   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>>   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>>   7 files changed, 1228 insertions(+)
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>>   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
>>
>> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
>> index 743ce08eb184..0357f1772267 100644
>> --- a/drivers/staging/media/hantro/Makefile
>> +++ b/drivers/staging/media/hantro/Makefile
>> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>>                  hantro_h1_jpeg_enc.o \
>>                  hantro_g1_h264_dec.o \
>>                  hantro_g1_mpeg2_dec.o \
>> +               hantro_g2_hevc_dec.o \
>>                  hantro_g1_vp8_dec.o \
>>                  rk3399_vpu_hw_jpeg_enc.o \
>>                  rk3399_vpu_hw_mpeg2_dec.o \
>>                  rk3399_vpu_hw_vp8_dec.o \
>>                  hantro_jpeg.o \
>>                  hantro_h264.o \
>> +               hantro_hevc.o \
>>                  hantro_mpeg2.o \
>>                  hantro_vp8.o
>>   
>> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
>> index 05876e426419..a9b80b2c9124 100644
>> --- a/drivers/staging/media/hantro/hantro.h
>> +++ b/drivers/staging/media/hantro/hantro.h
>> @@ -225,6 +225,7 @@ struct hantro_dev {
>>    * @jpeg_enc:          JPEG-encoding context.
>>    * @mpeg2_dec:         MPEG-2-decoding context.
>>    * @vp8_dec:           VP8-decoding context.
>> + * @hevc_dec:          HEVC-decoding context.
>>    */
>>   struct hantro_ctx {
>>          struct hantro_dev *dev;
>> @@ -251,6 +252,7 @@ struct hantro_ctx {
>>                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>>                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>>                  struct hantro_vp8_dec_hw_ctx vp8_dec;
>> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>>          };
>>   };
>>   
>> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>>          return vb2_dma_contig_plane_dma_addr(vb, 0);
>>   }
>>   
>> +static inline size_t
>> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].size;
>> +       return vb2_plane_size(vb, 0);
>> +}
>> +
>> +static inline void *
>> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].cpu;
>> +       return vb2_plane_vaddr(vb, 0);
>> +}
>> +
> Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

You are right, I will remove them.

>
>>   void hantro_postproc_disable(struct hantro_ctx *ctx);
>>   void hantro_postproc_enable(struct hantro_ctx *ctx);
>>   void hantro_postproc_free(struct hantro_ctx *ctx);
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index e3e6df28f470..bc90a52f4d3d 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -30,6 +30,13 @@
>>   
>>   #define DRIVER_NAME "hantro-vpu"
>>   
>> +/*
>> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
>> + * the number of data (in bits) to skip in the
>> + * slice segment header syntax after 'slice type' token
>> + */
> I think we need to document this better, so applications can
> correctly use the control. From i.MX reference code, it seems
> this needs to be used as follows:
>
> If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> to before syntax element "slice_temporal_mvp_enabled_flag".
>
> If IDR, the skipped bits are just "pic_output_flag"
> (separate_colour_plane_flag is not supported).
>
> And it seems this only needs to be computed by parsing the first slice,
> given this syntax remains invariant across all the slices.

OK, I will add your description in the next version.
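
As a rough sketch of that rule from the application side: assuming the userspace parser records the bit offset at which pic_output_flag starts (or would start) and the bit offset at which slice_temporal_mvp_enabled_flag starts (or would start), the control value could be derived as below. The struct and function names are hypothetical, and the IDR case is my reading of "just pic_output_flag" (one bit when the flag is present, zero otherwise).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping filled in by an application's slice header parser. */
struct slice_hdr_marks {
	bool is_idr;                  /* IDR picture */
	bool output_flag_present;     /* pic_output_flag is coded in the header */
	uint32_t pic_output_flag_pos; /* bit offset where pic_output_flag starts */
	uint32_t temporal_mvp_pos;    /* bit offset where slice_temporal_mvp_enabled_flag starts */
};

/* Number of bits to pass in V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (sketch). */
static uint32_t hevc_slice_header_skip_bits(const struct slice_hdr_marks *m)
{
	/* IDR: only pic_output_flag is skipped, if it is coded at all. */
	if (m->is_idr)
		return m->output_flag_present ? 1 : 0;

	/* Non-IDR: everything from pic_output_flag up to, but not including,
	 * slice_temporal_mvp_enabled_flag.
	 */
	return m->temporal_mvp_pos - m->pic_output_flag_pos;
}
```

Since the syntax is the same for every slice of the picture, the value would be computed once from the first slice and set before queuing the request.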

>
>> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
>> +
>>   int hantro_debug;
>>   module_param_named(debug, hantro_debug, int, 0644);
>>   MODULE_PARM_DESC(debug,
>> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>>          return 0;
>>   }
>>   
>> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
>> +{
>> +       struct hantro_ctx *ctx;
>> +
>> +       ctx = container_of(ctrl->handler,
>> +                          struct hantro_ctx, ctrl_handler);
>> +
>> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
>> +
>> +       switch (ctrl->id) {
>> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
>> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
>> +               break;
>> +       default:
>> +               return -EINVAL;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>>          .try_ctrl = hantro_try_ctrl,
>>   };
>> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>>          .s_ctrl = hantro_jpeg_s_ctrl,
>>   };
>>   
>> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
>> +       .s_ctrl = hantro_hevc_s_ctrl,
>> +};
>> +
>>   static const struct hantro_ctrl controls[] = {
>>          {
>>                  .codec = HANTRO_JPEG_ENCODER,
>> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>>                  .cfg = {
>>                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>>                  },
>> +       }, {
>> +               .codec = HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
>> +                       .name = "Hantro HEVC slice header skip bytes",
>> +                       .type = V4L2_CTRL_TYPE_INTEGER,
>> +                       .min = 0,
>> +                       .def = 0,
>> +                       .max = 0x7fffffff,
>> +                       .step = 1,
>> +                       .ops = &hantro_hevc_ctrl_ops,
>> +               },
>> +       }, {
>> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
>> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
>> +                        HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_USER_CLASS,
> This shouldn't be here. Is this V4L2_CID_USER_CLASS required by v4l2-compliance
> or by the spec?

It is required by v4l2-compliance.

>
>> +                       .name = "HANTRO controls",
>> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
>> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
>> +               },
>>          },
>>   };
>>   
>> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> new file mode 100644
>> index 000000000000..5d75b36bc40c
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> @@ -0,0 +1,587 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include "hantro_hw.h"
>> +#include "hantro_g2_regs.h"
>> +
>> +#define HEVC_DEC_MODE  0xC
>> +
>> +#define BUS_WIDTH_32           0
>> +#define BUS_WIDTH_64           1
>> +#define BUS_WIDTH_128          2
>> +#define BUS_WIDTH_256          3
>> +
>> +static inline void hantro_write_addr(struct hantro_dev *vpu,
>> +                                    unsigned long offset,
>> +                                    dma_addr_t addr)
>> +{
>> +       vdpu_write(vpu, addr & 0xffffffff, offset);
>> +}
>> +
>> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
>> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
>> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
>> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
>> +       unsigned int max_log2_ctb_size, ctb_size;
>> +       bool tiles_enabled, uniform_spacing;
>> +       u32 no_chroma = 0;
>> +
>> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
>> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
>> +
>> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
>> +
>> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
>> +                           sps->log2_diff_max_min_luma_coding_block_size;
>> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
>> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
>> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
>> +                            >> max_log2_ctb_size;
>> +       ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
>> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
>> +
>> +       if (tiles_enabled) {
>> +               unsigned int i, j, h;
>> +
>> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
>> +
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
>> +
>> +               /* write width + height for each tile in pic */
>> +               if (!uniform_spacing) {
>> +                       u32 tmp_w = 0, tmp_h = 0;
>> +
>> +                       for (i = 0; i < num_tile_rows; i++) {
>> +                               if (i == num_tile_rows - 1)
>> +                                       h = pic_height_in_ctbs - tmp_h;
>> +                               else
>> +                                       h = pps->row_height_minus1[i] + 1;
>> +                               tmp_h += h;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
>> +                                       tmp_w += pps->column_width_minus1[j] + 1;
>> +                                       *p++ = pps->column_width_minus1[j + 1];
>> +                                       *p++ = h;
>> +                                       if (i == 0 && h == 1 && ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                               }
>> +                               /* last column */
>> +                               *p++ = pic_width_in_ctbs - tmp_w;
>> +                               *p++ = h;
>> +                       }
>> +               } else { /* uniform spacing */
>> +                       u32 tmp, prev_h, prev_w;
>> +
>> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
>> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
>> +                               h = tmp - prev_h;
>> +                               prev_h = tmp;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
>> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
>> +                                       *p++ = tmp - prev_w;
>> +                                       *p++ = h;
>> +                                       if (j == 0 &&
>> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
>> +                                           ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                                       prev_w = tmp;
>> +                               }
>> +                       }
>> +               }
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
>> +
>> +               /* There's one tile, with dimensions equal to pic size. */
>> +               p[0] = pic_width_in_ctbs;
>> +               p[1] = pic_height_in_ctbs;
>> +       }
>> +
>> +       if (no_chroma)
>> +               vpu_debug(1, "%s: no chroma!\n", __func__);
>> +}
>> +
>> +static void set_params(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
>> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
>> +       u32 pic_width_aligned, pic_height_aligned;
>> +       u32 partial_ctb_x, partial_ctb_y;
>> +
>> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
>> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
>> +
>> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
>> +
>> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
>> +
>> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
>> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
>> +
>> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
>> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
>> +
>> +       min_cb_size = 1 << min_log2_cb_size;
>> +       max_ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
>> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
>> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
>> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
>> +
>> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
>> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
>> +
>> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
>> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
>> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
>> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
>> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
>> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
>> +
>> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_inter);
>> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_intra);
>> +       hantro_reg_write(vpu, hevc_min_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_max_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
>> +                        sps->log2_diff_max_min_luma_transform_block_size);
>> +
>> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
>> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
>> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
>> +       hantro_reg_write(vpu, hevc_asym_pred_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_sao_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
>> +       hantro_reg_write(vpu, hevc_sign_data_hide,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
>> +       }
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
>> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
>> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
>> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
>> +       hantro_reg_write(vpu, hevc_transq_bypass,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
>> +       hantro_reg_write(vpu, hevc_list_mod_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_idr_pic_e,
>> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
>> +       hantro_reg_write(vpu, hevc_parallel_merge,
>> +                        pps->log2_parallel_merge_level_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
>> +       hantro_reg_write(vpu, hevc_pcm_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
>> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size,
>> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size,
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
>> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
>> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
>> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
>> +       hantro_reg_write(vpu, hevc_weight_pred_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_const_intra_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
>> +       hantro_reg_write(vpu, hevc_transform_skip,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
>> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
>> +       hantro_reg_write(vpu, hevc_dependent_slice,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
>> +       hantro_reg_write(vpu, hevc_filter_override,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
>> +       hantro_reg_write(vpu, hevc_refidx0_active,
>> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_refidx1_active,
>> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
>> +}
>> +
>> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
>> +                       return i;
>> +       }
>> +
>> +       return 0x0;
>> +}
>> +
>> +static void set_ref_pic_list(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       const struct hantro_reg *ref_pic_regs0[] = {
>> +               hevc_rlist_f0,
>> +               hevc_rlist_f1,
>> +               hevc_rlist_f2,
>> +               hevc_rlist_f3,
>> +               hevc_rlist_f4,
>> +               hevc_rlist_f5,
>> +               hevc_rlist_f6,
>> +               hevc_rlist_f7,
>> +               hevc_rlist_f8,
>> +               hevc_rlist_f9,
>> +               hevc_rlist_f10,
>> +               hevc_rlist_f11,
>> +               hevc_rlist_f12,
>> +               hevc_rlist_f13,
>> +               hevc_rlist_f14,
>> +               hevc_rlist_f15,
>> +       };
>> +       const struct hantro_reg *ref_pic_regs1[] = {
>> +               hevc_rlist_b0,
>> +               hevc_rlist_b1,
>> +               hevc_rlist_b2,
>> +               hevc_rlist_b3,
>> +               hevc_rlist_b4,
>> +               hevc_rlist_b5,
>> +               hevc_rlist_b6,
>> +               hevc_rlist_b7,
>> +               hevc_rlist_b8,
>> +               hevc_rlist_b9,
>> +               hevc_rlist_b10,
>> +               hevc_rlist_b11,
>> +               hevc_rlist_b12,
>> +               hevc_rlist_b13,
>> +               hevc_rlist_b14,
>> +               hevc_rlist_b15,
>> +       };
>> +       unsigned int i, j;
>> +
>> +       /* List 0 contains: short term before, short term after and long term */
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       /* Fill the list, copying over and over */
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list0))
>> +               list0[j++] = list0[i++];
>> +
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list1))
>> +               list1[j++] = list1[i++];
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
>> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
>> +       }
>> +}
>> +
>> +static int set_ref(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
>> +       u32 max_ref_frames;
>> +       u16 dpb_longterm_e;
>> +
>> +       const struct hantro_reg *cur_poc[] = {
>> +               hevc_cur_poc_00,
>> +               hevc_cur_poc_01,
>> +               hevc_cur_poc_02,
>> +               hevc_cur_poc_03,
>> +               hevc_cur_poc_04,
>> +               hevc_cur_poc_05,
>> +               hevc_cur_poc_06,
>> +               hevc_cur_poc_07,
>> +               hevc_cur_poc_08,
>> +               hevc_cur_poc_09,
>> +               hevc_cur_poc_10,
>> +               hevc_cur_poc_11,
>> +               hevc_cur_poc_12,
>> +               hevc_cur_poc_13,
>> +               hevc_cur_poc_14,
>> +               hevc_cur_poc_15,
>> +       };
>> +       unsigned int i;
>> +
>> +       max_ref_frames = decode_params->num_poc_lt_curr +
>> +               decode_params->num_poc_st_curr_before +
>> +               decode_params->num_poc_st_curr_after;
>> +       /*
>> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
>> +        * badly marked I-frames.
>> +        */
>> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
>> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
>> +       hantro_reg_write(vpu, hevc_filter_over_slices,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
>> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
>> +
>> +       /*
>> +        * Write POC count diff from current pic. For frame decoding only compute
>> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
>> +        */
>> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
>> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
>> +
>> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
>> +       }
>> +
>> +       if (i < ARRAY_SIZE(cur_poc)) {
>> +               /*
>> +                * After the references, fill one entry pointing to itself,
>> +                * i.e. difference is zero.
>> +                */
>> +               hantro_reg_write(vpu, cur_poc[i], 0);
>> +               i++;
>> +       }
>> +
>> +       /* Fill the rest with the current picture */
>> +       for (; i < ARRAY_SIZE(cur_poc); i++)
>> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
>> +
>> +       set_ref_pic_list(ctx);
>> +
>> +       /* We will only keep the reference pictures that are still used */
>> +       ctx->hevc_dec.ref_bufs_used = 0;
>> +
>> +       /* Set up addresses of DPB buffers */
>> +       dpb_longterm_e = 0;
>> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
>> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
>> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
>> +               if (!luma_addr)
>> +                       return -ENOMEM;
>> +
>> +               chroma_addr = luma_addr + cr_offset;
>> +               mv_addr = luma_addr + mv_offset;
>> +
>> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
>> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
>> +
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
>> +       }
>> +
>> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
>> +       if (!luma_addr)
>> +               return -ENOMEM;
>> +
>> +       chroma_addr = luma_addr + cr_offset;
>> +       mv_addr = luma_addr + mv_offset;
>> +
>> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
>> +
>> +       hantro_hevc_ref_remove_unused(ctx);
>> +
>> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
>> +
>> +       return 0;
>> +}
>> +
>> +static void set_buffers(struct hantro_ctx *ctx)
>> +{
>> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       dma_addr_t src_dma, dst_dma;
>> +       u32 src_len, src_buf_len;
>> +
>> +       src_buf = hantro_get_src_buf(ctx);
>> +       dst_buf = hantro_get_dst_buf(ctx);
>> +
>> +       /* Source (stream) buffer. */
>> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
>> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
>> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
>> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
>> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
>> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
>> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
>> +
>> +       /* Destination (decoded frame) buffer. */
>> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
>> +
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
>> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
>> +}
>> +
>> +void hantro_g2_check_idle(struct hantro_dev *vpu)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < 3; i++) {
>> +               u32 status;
>> +
>> +               /* Make sure the VPU is idle */
>> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
>> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> How about we clean this pr_warn: use either v4l2_warn or dev_warn and make
> the warning "device still running, aborting" (I personally dislike the abort
> metaphor, but guess it's OK here).

Ok

>
>> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
>> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
>> +               }
>> +       }
>> +}
>> +
>> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int ret;
>> +
>> +       hantro_g2_check_idle(vpu);
>> +
>> +       /* Prepare HEVC decoder context. */
>> +       ret = hantro_hevc_dec_prepare_run(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Configure hardware registers. */
>> +       set_params(ctx);
>> +
>> +       /* set reference pictures */
>> +       ret = set_ref(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       set_buffers(ctx);
>> +       prepare_tile_info_buffer(ctx);
>> +
>> +       hantro_end_prepare_run(ctx);
>> +
>> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
>> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
>> +
>> +       /* Don't disable output */
>> +       hantro_reg_write(vpu, hevc_out_dis, 0);
>> +
>> +       /* Don't compress buffers */
>> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
>> +
>> +       /* use NV12 as output format */
>> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
>> +
>> +       /* Bus width and max burst */
>> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
>> +       hantro_reg_write(vpu, hevc_max_burst, 16);
>> +
>> +       /* Swap */
>> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
>> +
>> +       /* Start decoding! */
>> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
>> +
>> +       return 0;
>> +}
>> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
>> new file mode 100644
>> index 000000000000..a361c9ba911d
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
>> @@ -0,0 +1,198 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2021, Collabora
>> + *
>> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> + */
>> +
>> +#ifndef HANTRO_G2_REGS_H_
>> +#define HANTRO_G2_REGS_H_
>> +
>> +#include "hantro.h"
>> +
>> +#define G2_SWREG(nr)   ((nr) * 4)
>> +
>> +#define HEVC_DEC_REG(name, base, shift, mask) \
>> +       static const struct hantro_reg _hevc_##name[] = { \
>> +               { G2_SWREG(base), (shift), (mask) } \
>> +       }; \
>> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
>> +
>> +#define HEVC_REG_VERSION               G2_SWREG(0)
>> +
>> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
>> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
>> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
>> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
>> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
>> +
>> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
>> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
>> +
>> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
>> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
>> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
>> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
>> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
>> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
>> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
>> +
>> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
>> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
>> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
>> +
>> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
>> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
>> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
>> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
>> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
>> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
>> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
>> +
>> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
>> +
>> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
>> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
>> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
>> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
>> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
>> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
>> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
>> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
>> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
>> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
>> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
>> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
>> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
>> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
>> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
>> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
>> +
>> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
>> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
>> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
>> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
>> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
>> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
>> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
>> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
>> +
>> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
>> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
>> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
>> +
>> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
>> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
>> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
>> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
>> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
>> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
>> +
>> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
>> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
>> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
>> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
>> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
>> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
>> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
>> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
>> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
>> +
>> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
>> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
>> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
>> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
>> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
>> +
>> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
>> +
>> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
>> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
>> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
>> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
>> +
>> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
>> +
>> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
>> +
>> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
>> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
>> +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
>> +
>> +#define HEVC_REG_CONFIG                                G2_SWREG(58)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
>> +
>> +#define HEVC_ADDR_DST          (G2_SWREG(65))
>> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
>> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
>> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
>> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
>> +#define HEVC_ADDR_STR          (G2_SWREG(169))
>> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
>> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
>> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
>> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
>> +#define HEVC_TILE_SAO          (G2_SWREG(181))
>> +#define HEVC_TILE_BSD          (G2_SWREG(183))
>> +
>> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
>> +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
>> +
>> +#endif
>> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
>> new file mode 100644
>> index 000000000000..8e319a837ff3
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_hevc.c
>> @@ -0,0 +1,321 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <media/v4l2-mem2mem.h>
>> +
>> +#include "hantro.h"
>> +#include "hantro_hw.h"
>> +
>> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
>> +/*
>> + * BSD control data of current picture at tile border
>> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
>> + */
>> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
>> +/* tile border coefficients of filter */
>> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
>> +
>> +#define MAX_TILE_COLS 20
>> +#define MAX_TILE_ROWS 22
>> +
>> +#define UNUSED_REF     -1
>> +
>> +#define G2_ALIGN               16
>> +#define MC_WORD_SIZE           32
>> +
>> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
>> +
>> +       return sps->pic_width_in_luma_samples *
>> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
>> +}
>> +
>> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +
>> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
>> +}
>> +
>> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
>> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
>> +       size_t mv_size;
>> +
>> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
>> +                 (1 << (2 * (8 - 4))) * 16) + 32;
>> +
>> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
>> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
>> +
>> +       return mv_size;
>> +}
>> +
>> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +
>> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
>> +}
>> +
>> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int i;
>> +
>> +       /* Just tag buffer as unused, do not free them */
> This comment seems wrong.

You are right, I will remove it.

>
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs[i].cpu) {
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> Is this memset clearing the buffer required? If we're getting artifacts
> from previous decodes, then that would be more of a bug somewhere.

The buffer is already cleared after it is allocated or reused, so I can remove this one.

>
>> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
>> +                                         hevc_dec->ref_bufs[i].cpu,
>> +                                         hevc_dec->ref_bufs[i].dma);
>> +               }
>> +       }
>> +}
>> +
>> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
>> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
>> +}
>> +
>> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
>> +                                  int poc)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       /* Find the reference buffer among the already known ones */
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       return hevc_dec->ref_bufs[i].dma;
>> +               }
>> +       }
>> +
>> +       /* Allocate a new reference buffer */
>> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
>> +                       if (!hevc_dec->ref_bufs[i].cpu) {
>> +                               struct hantro_dev *vpu = ctx->dev;
>> +
>> +                               hevc_dec->ref_bufs[i].cpu =
>> +                                       dma_alloc_coherent(vpu->dev,
>> +                                                          hantro_hevc_ref_size(ctx),
>> +                                                          &hevc_dec->ref_bufs[i].dma,
>> +                                                          GFP_KERNEL);
> Is there any reason why we need to allocate reference buffers and MV contiguously?

It is done like that in the IMX reference code, and it makes managing the
reference frames and MVs simpler.
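
To make the layout concrete, here is a minimal userspace sketch of how one
contiguous reference buffer is carved up (luma, then chroma, then motion
vectors). It only mirrors the arithmetic of hantro_hevc_chroma_offset(),
hantro_hevc_motion_vectors_offset() and hantro_hevc_mv_size() from the patch;
struct ref_layout, ref_layout_compute() and ALIGN_UP() are hypothetical names
used for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN      16u
#define MC_WORD_SIZE  32u
/* Userspace stand-in for the kernel's ALIGN() macro (assumption). */
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical container for the offsets inside one reference buffer. */
struct ref_layout {
	size_t chroma_offset; /* luma plane occupies [0, chroma_offset) */
	size_t mv_offset;     /* start of the motion vector area */
	size_t mv_size;       /* size of the motion vector area */
	size_t total_size;    /* size of the whole contiguous allocation */
};

static struct ref_layout ref_layout_compute(uint32_t width, uint32_t height,
					    unsigned int bit_depth_luma_minus8)
{
	struct ref_layout l;
	size_t bytes_per_pixel = bit_depth_luma_minus8 == 0 ? 1 : 2;
	/* Same rounding as the patch: picture size in units of 256 samples. */
	size_t w_units = ((size_t)width + (1 << 8) - 1) >> 8;
	size_t h_units = ((size_t)height + (1 << 8) - 1) >> 8;

	assert(width > 0 && height > 0);

	l.chroma_offset = (size_t)width * height * bytes_per_pixel;
	/* 4:2:0 chroma adds half of the luma size before the MV area. */
	l.mv_offset = ALIGN_UP(l.chroma_offset * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
	l.mv_size = w_units * h_units * (1u << (2 * (8 - 4))) * 16 + 32;
	l.total_size = l.mv_offset + l.mv_size;

	return l;
}
```

With a 1920x1080 8-bit stream this gives a chroma offset of 2073600 bytes, an
MV offset of 3110432 bytes and an MV area of 163872 bytes, so each reference
buffer is a single allocation of 3274304 bytes.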

>
>> +                               if (!hevc_dec->ref_bufs[i].cpu)
>> +                                       return 0;
>> +
>> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
>> +                       }
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
>
> I believe the coherent allocation is to be able to clear each reference, but is this
> really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
>
> Also, if that's the case, then allocating the MV buffer separately will allow
> to not allocate the reference buffers coherently (note that we use NO_MAPPING
> in the vb2_queue, so the vb2_buffers shouldn't be coherent).

Those sound like good possible optimizations, but I'm not at that stage yet.
I would rather keep the driver in its current, fairly functional state and improve it later.
I think the patches are already large and complex enough as they are.
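
For the record, the MV-only clear suggested above would amount to something
like the sketch below. This is untested on hardware and rests on the
assumption that only the motion vector area needs to be zeroed, which has not
been verified; clear_mv_region() is a hypothetical helper, and mv_offset and
mv_size are assumed to come from hantro_hevc_motion_vectors_offset() and
hantro_hevc_mv_size().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero only the motion vector tail of a reference
 * buffer and leave the luma/chroma planes in front of it untouched.
 */
static void clear_mv_region(unsigned char *buf, size_t mv_offset, size_t mv_size)
{
	assert(buf != NULL);

	memset(buf + mv_offset, 0, mv_size);
}
```

In hantro_hevc_get_ref_buf() this would replace the memset() over the whole
hantro_hevc_ref_size(ctx) range.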

Benjamin

>
> Thanks,
> Ezequiel
>
>


* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 20:19       ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-16 20:19 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 16/03/2021 at 19:46, Ezequiel Garcia wrote:
> Hi Benjamin,
>
> The series is looking really good. Some comments below.
>
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Implement all the logic to get G2 hardware decoding HEVC frames.
>> It supports HEVC streams up to level 5.1.
>> It does not yet support 10-bit formats or the scaling feature.
>>
>> Add a HANTRO HEVC dedicated control to skip some bits at the beginning
>> of the slice header. That is very specific to this hardware so it can't
>> go into the uapi structures. Computing the needed value is complex and
>> requires information from the stream that only userland knows, so let it
>> provide the correct value to the driver.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 4:
>> - fix Ezequiel comments
>> - use dedicated control as an integer
>> - change hantro_g2_hevc_dec_run prototype to return errors
>>
>> version 2:
>> - squash multiple commits in this one.
>> - fix the comments done by Ezequiel about dma_alloc_coherent usage
>> - fix Dan's comments about control copy, reverse the test logic
>> in tile_buffer_reallocate, rework some goto and return cases.
>>
>>   drivers/staging/media/hantro/Makefile         |   2 +
>>   drivers/staging/media/hantro/hantro.h         |  18 +
>>   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>>   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>>   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>>   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>>   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>>   7 files changed, 1228 insertions(+)
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>>   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
>>
>> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
>> index 743ce08eb184..0357f1772267 100644
>> --- a/drivers/staging/media/hantro/Makefile
>> +++ b/drivers/staging/media/hantro/Makefile
>> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>>                  hantro_h1_jpeg_enc.o \
>>                  hantro_g1_h264_dec.o \
>>                  hantro_g1_mpeg2_dec.o \
>> +               hantro_g2_hevc_dec.o \
>>                  hantro_g1_vp8_dec.o \
>>                  rk3399_vpu_hw_jpeg_enc.o \
>>                  rk3399_vpu_hw_mpeg2_dec.o \
>>                  rk3399_vpu_hw_vp8_dec.o \
>>                  hantro_jpeg.o \
>>                  hantro_h264.o \
>> +               hantro_hevc.o \
>>                  hantro_mpeg2.o \
>>                  hantro_vp8.o
>>   
>> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
>> index 05876e426419..a9b80b2c9124 100644
>> --- a/drivers/staging/media/hantro/hantro.h
>> +++ b/drivers/staging/media/hantro/hantro.h
>> @@ -225,6 +225,7 @@ struct hantro_dev {
>>    * @jpeg_enc:          JPEG-encoding context.
>>    * @mpeg2_dec:         MPEG-2-decoding context.
>>    * @vp8_dec:           VP8-decoding context.
>> + * @hevc_dec:          HEVC-decoding context.
>>    */
>>   struct hantro_ctx {
>>          struct hantro_dev *dev;
>> @@ -251,6 +252,7 @@ struct hantro_ctx {
>>                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>>                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>>                  struct hantro_vp8_dec_hw_ctx vp8_dec;
>> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>>          };
>>   };
>>   
>> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>>          return vb2_dma_contig_plane_dma_addr(vb, 0);
>>   }
>>   
>> +static inline size_t
>> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].size;
>> +       return vb2_plane_size(vb, 0);
>> +}
>> +
>> +static inline void *
>> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].cpu;
>> +       return vb2_plane_vaddr(vb, 0);
>> +}
>> +
> Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

You are right, I will remove them.

>
>>   void hantro_postproc_disable(struct hantro_ctx *ctx);
>>   void hantro_postproc_enable(struct hantro_ctx *ctx);
>>   void hantro_postproc_free(struct hantro_ctx *ctx);
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index e3e6df28f470..bc90a52f4d3d 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -30,6 +30,13 @@
>>   
>>   #define DRIVER_NAME "hantro-vpu"
>>   
>> +/*
>> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
>> + * the number of data (in bits) to skip in the
>> + * slice segment header syntax after 'slice type' token
>> + */
> I think we need to document this better, so applications can
> correctly use the control. From i.MX reference code, it seems
> this needs to be used as follows:
>
> If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> to before syntax element "slice_temporal_mvp_enabled_flag".
>
> If IDR, the skipped bits are just "pic_output_flag"
> (separate_colour_plane_flag is not supported).
>
> And it seems this needs to be passed parsing only the first slice,
> given this syntax remains invariant across all the slices.

OK, I will add your description in the next version.
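
To make the rule described above concrete, here is a minimal userspace
sketch of how an application might derive the value for
V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP. It is only an illustration: the
struct, its fields and the function name are hypothetical and come from
neither this series nor the i.MX reference code. It assumes the
application's own slice header parser records the bit offsets at which
"pic_output_flag" and "slice_temporal_mvp_enabled_flag" would start, and
that "pic_output_flag" occupies one bit only when the PPS sets
output_flag_present_flag. How the offsets should be counted relative to
emulation prevention bytes is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bookkeeping kept by a userspace slice header parser.
 * None of these names exist in the patch or in the V4L2 uAPI.
 */
struct slice_hdr_bits {
	bool is_idr;                    /* slice belongs to an IDR picture */
	bool output_flag_present;       /* PPS output_flag_present_flag */
	uint32_t pic_output_flag_pos;   /* bit offset where pic_output_flag would start */
	uint32_t temporal_mvp_flag_pos; /* bit offset where slice_temporal_mvp_enabled_flag would start */
};

/*
 * Number of bits to skip, following the description in this thread:
 * - IDR: only pic_output_flag (one bit, and only if it is present);
 * - non-IDR: everything from pic_output_flag up to, but not including,
 *   slice_temporal_mvp_enabled_flag.
 */
static uint32_t hevc_slice_header_skip_bits(const struct slice_hdr_bits *h)
{
	if (h->is_idr)
		return h->output_flag_present ? 1 : 0;

	return h->temporal_mvp_flag_pos - h->pic_output_flag_pos;
}
```

Since this part of the syntax is said to be invariant across slices, an
application would presumably compute the value once, from the first slice
of the picture, and set the control before queuing the request.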

>
>> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
>> +
>>   int hantro_debug;
>>   module_param_named(debug, hantro_debug, int, 0644);
>>   MODULE_PARM_DESC(debug,
>> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>>          return 0;
>>   }
>>   
>> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
>> +{
>> +       struct hantro_ctx *ctx;
>> +
>> +       ctx = container_of(ctrl->handler,
>> +                          struct hantro_ctx, ctrl_handler);
>> +
>> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
>> +
>> +       switch (ctrl->id) {
>> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
>> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
>> +               break;
>> +       default:
>> +               return -EINVAL;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>>          .try_ctrl = hantro_try_ctrl,
>>   };
>> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>>          .s_ctrl = hantro_jpeg_s_ctrl,
>>   };
>>   
>> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
>> +       .s_ctrl = hantro_hevc_s_ctrl,
>> +};
>> +
>>   static const struct hantro_ctrl controls[] = {
>>          {
>>                  .codec = HANTRO_JPEG_ENCODER,
>> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>>                  .cfg = {
>>                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>>                  },
>> +       }, {
>> +               .codec = HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
>> +                       .name = "Hantro HEVC slice header skip bytes",
>> +                       .type = V4L2_CTRL_TYPE_INTEGER,
>> +                       .min = 0,
>> +                       .def = 0,
>> +                       .max = 0x7fffffff,
>> +                       .step = 1,
>> +                       .ops = &hantro_hevc_ctrl_ops,
>> +               },
>> +       }, {
>> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
>> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
>> +                        HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_USER_CLASS,
> This shouldn't be here, is this V4L2_CID_USER_CLASS required by v4l2-compliance
> or by the spec?

It is required by v4l2-compliance.

>
>> +                       .name = "HANTRO controls",
>> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
>> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
>> +               },
>>          },
>>   };
>>   
>> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> new file mode 100644
>> index 000000000000..5d75b36bc40c
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> @@ -0,0 +1,587 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include "hantro_hw.h"
>> +#include "hantro_g2_regs.h"
>> +
>> +#define HEVC_DEC_MODE  0xC
>> +
>> +#define BUS_WIDTH_32           0
>> +#define BUS_WIDTH_64           1
>> +#define BUS_WIDTH_128          2
>> +#define BUS_WIDTH_256          3
>> +
>> +static inline void hantro_write_addr(struct hantro_dev *vpu,
>> +                                    unsigned long offset,
>> +                                    dma_addr_t addr)
>> +{
>> +       vdpu_write(vpu, addr & 0xffffffff, offset);
>> +}
>> +
>> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
>> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
>> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
>> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
>> +       unsigned int max_log2_ctb_size, ctb_size;
>> +       bool tiles_enabled, uniform_spacing;
>> +       u32 no_chroma = 0;
>> +
>> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
>> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
>> +
>> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
>> +
>> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
>> +                           sps->log2_diff_max_min_luma_coding_block_size;
>> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
>> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
>> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
>> +                            >> max_log2_ctb_size;
>> +       ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
>> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
>> +
>> +       if (tiles_enabled) {
>> +               unsigned int i, j, h;
>> +
>> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
>> +
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
>> +
>> +               /* write width + height for each tile in pic */
>> +               if (!uniform_spacing) {
>> +                       u32 tmp_w = 0, tmp_h = 0;
>> +
>> +                       for (i = 0; i < num_tile_rows; i++) {
>> +                               if (i == num_tile_rows - 1)
>> +                                       h = pic_height_in_ctbs - tmp_h;
>> +                               else
>> +                                       h = pps->row_height_minus1[i] + 1;
>> +                               tmp_h += h;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
>> +                                       tmp_w += pps->column_width_minus1[j] + 1;
>> +                                       *p++ = pps->column_width_minus1[j + 1];
>> +                                       *p++ = h;
>> +                                       if (i == 0 && h == 1 && ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                               }
>> +                               /* last column */
>> +                               *p++ = pic_width_in_ctbs - tmp_w;
>> +                               *p++ = h;
>> +                       }
>> +               } else { /* uniform spacing */
>> +                       u32 tmp, prev_h, prev_w;
>> +
>> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
>> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
>> +                               h = tmp - prev_h;
>> +                               prev_h = tmp;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
>> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
>> +                                       *p++ = tmp - prev_w;
>> +                                       *p++ = h;
>> +                                       if (j == 0 &&
>> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
>> +                                           ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                                       prev_w = tmp;
>> +                               }
>> +                       }
>> +               }
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
>> +
>> +               /* There's one tile, with dimensions equal to pic size. */
>> +               p[0] = pic_width_in_ctbs;
>> +               p[1] = pic_height_in_ctbs;
>> +       }
>> +
>> +       if (no_chroma)
>> +               vpu_debug(1, "%s: no chroma!\n", __func__);
>> +}
>> +
>> +static void set_params(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
>> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
>> +       u32 pic_width_aligned, pic_height_aligned;
>> +       u32 partial_ctb_x, partial_ctb_y;
>> +
>> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
>> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
>> +
>> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
>> +
>> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
>> +
>> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
>> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
>> +
>> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
>> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
>> +
>> +       min_cb_size = 1 << min_log2_cb_size;
>> +       max_ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
>> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
>> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
>> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
>> +
>> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
>> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
>> +
>> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
>> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
>> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
>> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
>> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
>> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
>> +
>> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_inter);
>> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_intra);
>> +       hantro_reg_write(vpu, hevc_min_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_max_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
>> +                        sps->log2_diff_max_min_luma_transform_block_size);
>> +
>> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
>> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
>> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
>> +       hantro_reg_write(vpu, hevc_asym_pred_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_sao_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
>> +       hantro_reg_write(vpu, hevc_sign_data_hide,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
>> +       }
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
>> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
>> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
>> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
>> +       hantro_reg_write(vpu, hevc_transq_bypass,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
>> +       hantro_reg_write(vpu, hevc_list_mod_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_idr_pic_e,
>> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
>> +       hantro_reg_write(vpu, hevc_parallel_merge,
>> +                        pps->log2_parallel_merge_level_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
>> +       hantro_reg_write(vpu, hevc_pcm_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
>> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size,
>> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size,
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
>> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
>> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
>> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
>> +       hantro_reg_write(vpu, hevc_weight_pred_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_const_intra_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
>> +       hantro_reg_write(vpu, hevc_transform_skip,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
>> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
>> +       hantro_reg_write(vpu, hevc_dependent_slice,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
>> +       hantro_reg_write(vpu, hevc_filter_override,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
>> +       hantro_reg_write(vpu, hevc_refidx0_active,
>> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_refidx1_active,
>> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
>> +}
>> +
>> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
>> +                       return i;
>> +       }
>> +
>> +       return 0x0;
>> +}
>> +
>> +static void set_ref_pic_list(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       const struct hantro_reg *ref_pic_regs0[] = {
>> +               hevc_rlist_f0,
>> +               hevc_rlist_f1,
>> +               hevc_rlist_f2,
>> +               hevc_rlist_f3,
>> +               hevc_rlist_f4,
>> +               hevc_rlist_f5,
>> +               hevc_rlist_f6,
>> +               hevc_rlist_f7,
>> +               hevc_rlist_f8,
>> +               hevc_rlist_f9,
>> +               hevc_rlist_f10,
>> +               hevc_rlist_f11,
>> +               hevc_rlist_f12,
>> +               hevc_rlist_f13,
>> +               hevc_rlist_f14,
>> +               hevc_rlist_f15,
>> +       };
>> +       const struct hantro_reg *ref_pic_regs1[] = {
>> +               hevc_rlist_b0,
>> +               hevc_rlist_b1,
>> +               hevc_rlist_b2,
>> +               hevc_rlist_b3,
>> +               hevc_rlist_b4,
>> +               hevc_rlist_b5,
>> +               hevc_rlist_b6,
>> +               hevc_rlist_b7,
>> +               hevc_rlist_b8,
>> +               hevc_rlist_b9,
>> +               hevc_rlist_b10,
>> +               hevc_rlist_b11,
>> +               hevc_rlist_b12,
>> +               hevc_rlist_b13,
>> +               hevc_rlist_b14,
>> +               hevc_rlist_b15,
>> +       };
>> +       unsigned int i, j;
>> +
>> +       /* List 0 contains: short term before, short term after and long term */
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       /* Fill the list, copying over and over */
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list0))
>> +               list0[j++] = list0[i++];
>> +
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list1))
>> +               list1[j++] = list1[i++];
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
>> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
>> +       }
>> +}
>> +
>> +static int set_ref(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
>> +       u32 max_ref_frames;
>> +       u16 dpb_longterm_e;
>> +
>> +       const struct hantro_reg *cur_poc[] = {
>> +               hevc_cur_poc_00,
>> +               hevc_cur_poc_01,
>> +               hevc_cur_poc_02,
>> +               hevc_cur_poc_03,
>> +               hevc_cur_poc_04,
>> +               hevc_cur_poc_05,
>> +               hevc_cur_poc_06,
>> +               hevc_cur_poc_07,
>> +               hevc_cur_poc_08,
>> +               hevc_cur_poc_09,
>> +               hevc_cur_poc_10,
>> +               hevc_cur_poc_11,
>> +               hevc_cur_poc_12,
>> +               hevc_cur_poc_13,
>> +               hevc_cur_poc_14,
>> +               hevc_cur_poc_15,
>> +       };
>> +       unsigned int i;
>> +
>> +       max_ref_frames = decode_params->num_poc_lt_curr +
>> +               decode_params->num_poc_st_curr_before +
>> +               decode_params->num_poc_st_curr_after;
>> +       /*
>> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
>> +        * badly marked I-frames.
>> +        */
>> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
>> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
>> +       hantro_reg_write(vpu, hevc_filter_over_slices,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
>> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
>> +
>> +       /*
>> +        * Write POC count diff from current pic. For frame decoding only compute
>> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
>> +        */
>> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
>> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
>> +
>> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
>> +       }
>> +
>> +       if (i < ARRAY_SIZE(cur_poc)) {
>> +               /*
>> +                * After the references, fill one entry pointing to itself,
>> +                * i.e. difference is zero.
>> +                */
>> +               hantro_reg_write(vpu, cur_poc[i], 0);
>> +               i++;
>> +       }
>> +
>> +       /* Fill the rest with the current picture */
>> +       for (; i < ARRAY_SIZE(cur_poc); i++)
>> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
>> +
>> +       set_ref_pic_list(ctx);
>> +
>> +       /* We will only keep the references picture that are still used */
>> +       ctx->hevc_dec.ref_bufs_used = 0;
>> +
>> +       /* Set up addresses of DPB buffers */
>> +       dpb_longterm_e = 0;
>> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
>> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
>> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
>> +               if (!luma_addr)
>> +                       return -ENOMEM;
>> +
>> +               chroma_addr = luma_addr + cr_offset;
>> +               mv_addr = luma_addr + mv_offset;
>> +
>> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
>> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
>> +
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
>> +       }
>> +
>> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
>> +       if (!luma_addr)
>> +               return -ENOMEM;
>> +
>> +       chroma_addr = luma_addr + cr_offset;
>> +       mv_addr = luma_addr + mv_offset;
>> +
>> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
>> +
>> +       hantro_hevc_ref_remove_unused(ctx);
>> +
>> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
>> +
>> +       return 0;
>> +}
>> +
>> +static void set_buffers(struct hantro_ctx *ctx)
>> +{
>> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       dma_addr_t src_dma, dst_dma;
>> +       u32 src_len, src_buf_len;
>> +
>> +       src_buf = hantro_get_src_buf(ctx);
>> +       dst_buf = hantro_get_dst_buf(ctx);
>> +
>> +       /* Source (stream) buffer. */
>> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
>> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
>> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
>> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
>> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
>> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
>> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
>> +
>> +       /* Destination (decoded frame) buffer. */
>> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
>> +
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
>> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
>> +}
>> +
>> +void hantro_g2_check_idle(struct hantro_dev *vpu)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < 3; i++) {
>> +               u32 status;
>> +
>> +               /* Make sure the VPU is idle */
>> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
>> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> How about we clean this pr_warn: use either v4l2_warn or dev_warn and make
> the warning "device still running, aborting" (I personally dislike the abort
> metaphor, but guess it's OK here).

Ok

>
>> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
>> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
>> +               }
>> +       }
>> +}
>> +
>> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int ret;
>> +
>> +       hantro_g2_check_idle(vpu);
>> +
>> +       /* Prepare HEVC decoder context. */
>> +       ret = hantro_hevc_dec_prepare_run(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Configure hardware registers. */
>> +       set_params(ctx);
>> +
>> +       /* set reference pictures */
>> +       ret = set_ref(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       set_buffers(ctx);
>> +       prepare_tile_info_buffer(ctx);
>> +
>> +       hantro_end_prepare_run(ctx);
>> +
>> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
>> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
>> +
>> +       /* Don't disable output */
>> +       hantro_reg_write(vpu, hevc_out_dis, 0);
>> +
>> +       /* Don't compress buffers */
>> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
>> +
>> +       /* use NV12 as output format */
>> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
>> +
>> +       /* Bus width and max burst */
>> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
>> +       hantro_reg_write(vpu, hevc_max_burst, 16);
>> +
>> +       /* Swap */
>> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
>> +
>> +       /* Start decoding! */
>> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
>> +
>> +       return 0;
>> +}
>> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
>> new file mode 100644
>> index 000000000000..a361c9ba911d
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
>> @@ -0,0 +1,198 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2021, Collabora
>> + *
>> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> + */
>> +
>> +#ifndef HANTRO_G2_REGS_H_
>> +#define HANTRO_G2_REGS_H_
>> +
>> +#include "hantro.h"
>> +
>> +#define G2_SWREG(nr)   ((nr) * 4)
>> +
>> +#define HEVC_DEC_REG(name, base, shift, mask) \
>> +       static const struct hantro_reg _hevc_##name[] = { \
>> +               { G2_SWREG(base), (shift), (mask) } \
>> +       }; \
>> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
>> +
>> +#define HEVC_REG_VERSION               G2_SWREG(0)
>> +
>> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
>> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
>> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
>> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
>> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
>> +
>> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
>> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
>> +
>> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
>> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
>> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
>> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
>> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
>> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
>> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
>> +
>> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
>> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
>> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
>> +
>> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
>> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
>> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
>> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
>> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
>> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
>> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
>> +
>> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
>> +
>> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
>> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
>> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
>> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
>> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
>> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
>> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
>> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
>> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
>> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
>> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
>> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
>> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
>> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
>> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
>> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
>> +
>> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
>> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
>> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
>> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
>> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
>> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
>> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
>> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
>> +
>> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
>> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
>> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
>> +
>> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
>> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
>> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
>> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
>> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
>> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
>> +
>> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
>> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
>> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
>> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
>> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
>> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
>> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
>> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
>> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
>> +
>> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
>> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
>> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
>> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
>> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
>> +
>> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
>> +
>> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
>> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
>> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
>> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
>> +
>> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
>> +
>> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
>> +
>> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
>> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
>> +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
>> +
>> +#define HEVC_REG_CONFIG                                G2_SWREG(58)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
>> +
>> +#define HEVC_ADDR_DST          (G2_SWREG(65))
>> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
>> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
>> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
>> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
>> +#define HEVC_ADDR_STR          (G2_SWREG(169))
>> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
>> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
>> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
>> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
>> +#define HEVC_TILE_SAO          (G2_SWREG(181))
>> +#define HEVC_TILE_BSD          (G2_SWREG(183))
>> +
>> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
>> +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
>> +
>> +#endif
>> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
>> new file mode 100644
>> index 000000000000..8e319a837ff3
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_hevc.c
>> @@ -0,0 +1,321 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <media/v4l2-mem2mem.h>
>> +
>> +#include "hantro.h"
>> +#include "hantro_hw.h"
>> +
>> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
>> +/*
>> + * BSD control data of current picture at tile border
>> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
>> + */
>> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
>> +/* tile border coefficients of filter */
>> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
>> +
>> +#define MAX_TILE_COLS 20
>> +#define MAX_TILE_ROWS 22
>> +
>> +#define UNUSED_REF     -1
>> +
>> +#define G2_ALIGN               16
>> +#define MC_WORD_SIZE           32
>> +
>> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
>> +
>> +       return sps->pic_width_in_luma_samples *
>> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
>> +}
>> +
>> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +
>> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
>> +}
>> +
>> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
>> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
>> +       size_t mv_size;
>> +
>> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
>> +                 (1 << (2 * (8 - 4))) * 16) + 32;
>> +
>> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
>> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
>> +
>> +       return mv_size;
>> +}
>> +
>> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +
>> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
>> +}
>> +
>> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int i;
>> +
>> +       /* Just tag buffer as unused, do not free them */
> This comment seems wrong.

You are right, I will remove it.

>
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs[i].cpu) {
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> Is this memset clearing the buffer required? If we're getting artifacts
> from previous decodes, then that would be more of a bug somewhere.

The buffer is already cleared after it is allocated or reused, so I can remove this one.

>
>> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
>> +                                         hevc_dec->ref_bufs[i].cpu,
>> +                                         hevc_dec->ref_bufs[i].dma);
>> +               }
>> +       }
>> +}
>> +
>> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
>> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
>> +}
>> +
>> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
>> +                                  int poc)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       /* Find the reference buffer in already know ones */
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       return hevc_dec->ref_bufs[i].dma;
>> +               }
>> +       }
>> +
>> +       /* Allocate a new reference buffer */
>> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
>> +                       if (!hevc_dec->ref_bufs[i].cpu) {
>> +                               struct hantro_dev *vpu = ctx->dev;
>> +
>> +                               hevc_dec->ref_bufs[i].cpu =
>> +                                       dma_alloc_coherent(vpu->dev,
>> +                                                          hantro_hevc_ref_size(ctx),
>> +                                                          &hevc_dec->ref_bufs[i].dma,
>> +                                                          GFP_KERNEL);
> Is there any reason why we need to allocate reference buffers and MV contiguously?

It is done like that in the i.MX reference code, and it makes managing the
reference frames and MVs simpler.
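
To make the contiguous layout concrete, here is a minimal userspace sketch of
how the offsets inside one reference buffer appear to be derived by the helpers
in hantro_hevc.c (luma first, then chroma, then the motion vectors). The struct
sps_dims type and the ALIGN_UP macro are stand-ins invented for this
illustration; only the arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define G2_ALIGN      16
#define MC_WORD_SIZE  32

/* Userspace stand-in for the kernel ALIGN() macro (assumption). */
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical subset of the SPS fields read by the patch. */
struct sps_dims {
	unsigned int width;                 /* pic_width_in_luma_samples */
	unsigned int height;                /* pic_height_in_luma_samples */
	unsigned int bit_depth_luma_minus8;
};

/* Chroma plane starts right after the luma plane. */
static size_t chroma_offset(const struct sps_dims *sps)
{
	size_t bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;

	assert(sps->width > 0 && sps->height > 0);
	return (size_t)sps->width * sps->height * bytes_per_pixel;
}

/* Motion vectors start after luma + chroma (4:2:0), aligned and padded. */
static size_t mv_offset(const struct sps_dims *sps)
{
	return ALIGN_UP(chroma_offset(sps) * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Size of the motion vector area, mirroring hantro_hevc_mv_size(). */
static size_t mv_size(const struct sps_dims *sps)
{
	size_t w = (sps->width + (1 << 8) - 1) >> 8;
	size_t h = (sps->height + (1 << 8) - 1) >> 8;

	return w * h * (1 << (2 * (8 - 4))) * 16 + 32;
}

/* Total size of one contiguous reference buffer. */
static size_t ref_size(const struct sps_dims *sps)
{
	return mv_offset(sps) + mv_size(sps);
}
```

With these formulas, an 8-bit 1920x1080 stream gives a chroma offset of
2073600, an MV offset of 3110432 and a total buffer size of 3274304 bytes.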

>
>> +                               if (!hevc_dec->ref_bufs[i].cpu)
>> +                                       return 0;
>> +
>> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
>> +                       }
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
>
> I believe the coherent allocation is to be able to clear each reference, but is this
> really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
>
> Also, if that's the case, then allocating the MV buffer separately will allow
> to not allocate the reference buffers coherently (note that we use NO_MAPPING
> in the vb2_queue, so the vb2_buffers shouldn't be coherent).

That sounds like a good possible optimization, but I'm not at that stage yet.
I would rather keep the driver in this fairly functional state and improve it later.
I think the patches are already large and complex enough as they are.

Benjamin

>
> Thanks,
> Ezequiel
>
>



* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 20:19       ` Benjamin Gaignard
  0 siblings, 0 replies; 66+ messages in thread
From: Benjamin Gaignard @ 2021-03-16 20:19 UTC (permalink / raw)
  To: Ezequiel Garcia, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel


On 16/03/2021 at 19:46, Ezequiel Garcia wrote:
> Hi Benjamin,
>
> The series is looking really good. Some comments below.
>
> On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
>> Implement all the logic to get G2 hardware decoding HEVC frames.
>> It supports HEVC streams up to level 5.1.
>> It does not yet support 10-bit formats or the scaling feature.
>>
>> Add a HANTRO HEVC dedicated control to skip some bits at the beginning
>> of the slice header. That is very specific to this hardware so it can't
>> go into the uapi structures. Computing the needed value is complex and
>> requires information from the stream that only userland knows, so let it
>> provide the correct value to the driver.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 4:
>> - fix Ezequiel comments
>> - use dedicated control as an integer
>> - change hantro_g2_hevc_dec_run prototype to return errors
>>
>> version 2:
>> - squash multiple commits in this one.
>> - fix the comments done by Ezequiel about dma_alloc_coherent usage
>> - fix Dan's comments about control copy, reverse the test logic
>> in tile_buffer_reallocate, rework some goto and return cases.
>>
>>   drivers/staging/media/hantro/Makefile         |   2 +
>>   drivers/staging/media/hantro/hantro.h         |  18 +
>>   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
>>   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>>   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>>   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>>   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
>>   7 files changed, 1228 insertions(+)
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>>   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
>>
>> diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
>> index 743ce08eb184..0357f1772267 100644
>> --- a/drivers/staging/media/hantro/Makefile
>> +++ b/drivers/staging/media/hantro/Makefile
>> @@ -9,12 +9,14 @@ hantro-vpu-y += \
>>                  hantro_h1_jpeg_enc.o \
>>                  hantro_g1_h264_dec.o \
>>                  hantro_g1_mpeg2_dec.o \
>> +               hantro_g2_hevc_dec.o \
>>                  hantro_g1_vp8_dec.o \
>>                  rk3399_vpu_hw_jpeg_enc.o \
>>                  rk3399_vpu_hw_mpeg2_dec.o \
>>                  rk3399_vpu_hw_vp8_dec.o \
>>                  hantro_jpeg.o \
>>                  hantro_h264.o \
>> +               hantro_hevc.o \
>>                  hantro_mpeg2.o \
>>                  hantro_vp8.o
>>   
>> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
>> index 05876e426419..a9b80b2c9124 100644
>> --- a/drivers/staging/media/hantro/hantro.h
>> +++ b/drivers/staging/media/hantro/hantro.h
>> @@ -225,6 +225,7 @@ struct hantro_dev {
>>    * @jpeg_enc:          JPEG-encoding context.
>>    * @mpeg2_dec:         MPEG-2-decoding context.
>>    * @vp8_dec:           VP8-decoding context.
>> + * @hevc_dec:          HEVC-decoding context.
>>    */
>>   struct hantro_ctx {
>>          struct hantro_dev *dev;
>> @@ -251,6 +252,7 @@ struct hantro_ctx {
>>                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
>>                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
>>                  struct hantro_vp8_dec_hw_ctx vp8_dec;
>> +               struct hantro_hevc_dec_hw_ctx hevc_dec;
>>          };
>>   };
>>   
>> @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>>          return vb2_dma_contig_plane_dma_addr(vb, 0);
>>   }
>>   
>> +static inline size_t
>> +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].size;
>> +       return vb2_plane_size(vb, 0);
>> +}
>> +
>> +static inline void *
>> +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
>> +{
>> +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
>> +               return ctx->postproc.dec_q[vb->index].cpu;
>> +       return vb2_plane_vaddr(vb, 0);
>> +}
>> +
> Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?

You are right, I will remove them.

>
>>   void hantro_postproc_disable(struct hantro_ctx *ctx);
>>   void hantro_postproc_enable(struct hantro_ctx *ctx);
>>   void hantro_postproc_free(struct hantro_ctx *ctx);
>> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
>> index e3e6df28f470..bc90a52f4d3d 100644
>> --- a/drivers/staging/media/hantro/hantro_drv.c
>> +++ b/drivers/staging/media/hantro/hantro_drv.c
>> @@ -30,6 +30,13 @@
>>   
>>   #define DRIVER_NAME "hantro-vpu"
>>   
>> +/*
>> + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
>> + * the number of data (in bits) to skip in the
>> + * slice segment header syntax after 'slice type' token
>> + */
> I think we need to document this better, so applications can
> correctly use the control. From i.MX reference code, it seems
> this needs to be used as follows:
>
> If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> to before syntax element "slice_temporal_mvp_enabled_flag".
>
> If IDR, the skipped bits are just "pic_output_flag"
> (separate_colour_plane_flag is not supported).
>
> And it seems this needs to be passed parsing only the first slice,
> given this syntax remains invariant across all the slices.

OK, I will add your description in the next version.
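
As a rough illustration of the rule described above, an application could
derive the control value from the bit positions its slice header parser has
already recorded. This is only a sketch: struct slice_hdr_bits and
slice_header_skip_bits() are hypothetical names, and the assumption that
pic_output_flag takes one bit only when the PPS sets output_flag_present_flag
comes from the HEVC syntax, not from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical summary of what a userspace parser recorded while reading
 * the first slice segment header of a picture.
 */
struct slice_hdr_bits {
	bool is_idr;                        /* IDR_W_RADL or IDR_N_LP slice */
	bool output_flag_present;           /* PPS output_flag_present_flag */
	unsigned int pic_output_flag_pos;   /* bit where pic_output_flag starts
					     * (or would start if absent) */
	unsigned int temporal_mvp_flag_pos; /* bit where
					     * slice_temporal_mvp_enabled_flag
					     * starts (or would start) */
};

/* Number of bits to pass to V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP. */
static unsigned int slice_header_skip_bits(const struct slice_hdr_bits *h)
{
	/* IDR: only pic_output_flag is skipped, when it is present at all. */
	if (h->is_idr)
		return h->output_flag_present ? 1 : 0;

	/* Non-IDR: everything from pic_output_flag up to, but not including,
	 * slice_temporal_mvp_enabled_flag. */
	assert(h->temporal_mvp_flag_pos >= h->pic_output_flag_pos);
	return h->temporal_mvp_flag_pos - h->pic_output_flag_pos;
}
```

Since this part of the syntax seems to stay the same across all slices of a
picture, the value would only need to be computed once, from the first slice.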

>
>> +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
>> +
>>   int hantro_debug;
>>   module_param_named(debug, hantro_debug, int, 0644);
>>   MODULE_PARM_DESC(debug,
>> @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
>>          return 0;
>>   }
>>   
>> +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
>> +{
>> +       struct hantro_ctx *ctx;
>> +
>> +       ctx = container_of(ctrl->handler,
>> +                          struct hantro_ctx, ctrl_handler);
>> +
>> +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
>> +
>> +       switch (ctrl->id) {
>> +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
>> +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
>> +               break;
>> +       default:
>> +               return -EINVAL;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
>>          .try_ctrl = hantro_try_ctrl,
>>   };
>> @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
>>          .s_ctrl = hantro_jpeg_s_ctrl,
>>   };
>>   
>> +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
>> +       .s_ctrl = hantro_hevc_s_ctrl,
>> +};
>> +
>>   static const struct hantro_ctrl controls[] = {
>>          {
>>                  .codec = HANTRO_JPEG_ENCODER,
>> @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
>>                  .cfg = {
>>                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
>>                  },
>> +       }, {
>> +               .codec = HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
>> +                       .name = "Hantro HEVC slice header skip bytes",
>> +                       .type = V4L2_CTRL_TYPE_INTEGER,
>> +                       .min = 0,
>> +                       .def = 0,
>> +                       .max = 0x7fffffff,
>> +                       .step = 1,
>> +                       .ops = &hantro_hevc_ctrl_ops,
>> +               },
>> +       }, {
>> +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
>> +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
>> +                        HANTRO_HEVC_DECODER,
>> +               .cfg = {
>> +                       .id = V4L2_CID_USER_CLASS,
> This shouldn't be here, is this V4L2_CID_USER_CLASS required by v4l2-compliance
> or by the spec?

It is required by v4l2-compliance.

>
>> +                       .name = "HANTRO controls",
>> +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
>> +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
>> +               },
>>          },
>>   };
>>   
>> diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> new file mode 100644
>> index 000000000000..5d75b36bc40c
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>> @@ -0,0 +1,587 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include "hantro_hw.h"
>> +#include "hantro_g2_regs.h"
>> +
>> +#define HEVC_DEC_MODE  0xC
>> +
>> +#define BUS_WIDTH_32           0
>> +#define BUS_WIDTH_64           1
>> +#define BUS_WIDTH_128          2
>> +#define BUS_WIDTH_256          3
>> +
>> +static inline void hantro_write_addr(struct hantro_dev *vpu,
>> +                                    unsigned long offset,
>> +                                    dma_addr_t addr)
>> +{
>> +       vdpu_write(vpu, addr & 0xffffffff, offset);
>> +}
>> +
>> +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
>> +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
>> +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
>> +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
>> +       unsigned int max_log2_ctb_size, ctb_size;
>> +       bool tiles_enabled, uniform_spacing;
>> +       u32 no_chroma = 0;
>> +
>> +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
>> +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
>> +
>> +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
>> +
>> +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
>> +                           sps->log2_diff_max_min_luma_coding_block_size;
>> +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
>> +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
>> +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
>> +                            >> max_log2_ctb_size;
>> +       ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
>> +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
>> +
>> +       if (tiles_enabled) {
>> +               unsigned int i, j, h;
>> +
>> +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
>> +
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
>> +
>> +               /* write width + height for each tile in pic */
>> +               if (!uniform_spacing) {
>> +                       u32 tmp_w = 0, tmp_h = 0;
>> +
>> +                       for (i = 0; i < num_tile_rows; i++) {
>> +                               if (i == num_tile_rows - 1)
>> +                                       h = pic_height_in_ctbs - tmp_h;
>> +                               else
>> +                                       h = pps->row_height_minus1[i] + 1;
>> +                               tmp_h += h;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
>> +                                       tmp_w += pps->column_width_minus1[j] + 1;
>> +                                       *p++ = pps->column_width_minus1[j + 1];
>> +                                       *p++ = h;
>> +                                       if (i == 0 && h == 1 && ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                               }
>> +                               /* last column */
>> +                               *p++ = pic_width_in_ctbs - tmp_w;
>> +                               *p++ = h;
>> +                       }
>> +               } else { /* uniform spacing */
>> +                       u32 tmp, prev_h, prev_w;
>> +
>> +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
>> +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
>> +                               h = tmp - prev_h;
>> +                               prev_h = tmp;
>> +                               if (i == 0 && h == 1 && ctb_size == 16)
>> +                                       no_chroma = 1;
>> +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
>> +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
>> +                                       *p++ = tmp - prev_w;
>> +                                       *p++ = h;
>> +                                       if (j == 0 &&
>> +                                           (pps->column_width_minus1[0] + 1) == 1 &&
>> +                                           ctb_size == 16)
>> +                                               no_chroma = 1;
>> +                                       prev_w = tmp;
>> +                               }
>> +                       }
>> +               }
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
>> +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
>> +
>> +               /* There's one tile, with dimensions equal to pic size. */
>> +               p[0] = pic_width_in_ctbs;
>> +               p[1] = pic_height_in_ctbs;
>> +       }
>> +
>> +       if (no_chroma)
>> +               vpu_debug(1, "%s: no chroma!\n", __func__);
>> +}
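
A side note on the uniform-spacing branch above: tile boundaries sit at
(i + 1) * size / count, and each tile takes the difference between two
consecutive boundaries, so the sizes always add up to the picture size in
CTBs. Below is a small stand-alone sketch of that arithmetic. The function
name uniform_tile_sizes() is made up for illustration and is not part of
the driver.

```c
#include <stddef.h>

/*
 * Split 'size_in_ctbs' into 'count' tiles the same way the uniform-spacing
 * branch does: tile i ends at (i + 1) * size_in_ctbs / count.
 * 'out' must have room for 'count' entries.
 * Hypothetical helper, for illustration only.
 */
static void uniform_tile_sizes(unsigned int size_in_ctbs, unsigned int count,
			       unsigned int *out)
{
	unsigned int i, prev = 0;

	for (i = 0; i < count; i++) {
		unsigned int end = (i + 1) * size_in_ctbs / count;

		out[i] = end - prev;	/* width or height of tile i */
		prev = end;
	}
}
```

For example, 10 CTBs split into 3 tiles gives sizes 3, 3 and 4.
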
>> +
>> +static void set_params(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
>> +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
>> +       u32 pic_width_aligned, pic_height_aligned;
>> +       u32 partial_ctb_x, partial_ctb_y;
>> +
>> +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
>> +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
>> +
>> +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
>> +
>> +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
>> +
>> +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
>> +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
>> +
>> +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
>> +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
>> +
>> +       min_cb_size = 1 << min_log2_cb_size;
>> +       max_ctb_size = 1 << max_log2_ctb_size;
>> +
>> +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
>> +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
>> +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
>> +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
>> +
>> +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
>> +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
>> +
>> +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
>> +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
>> +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
>> +
>> +       hantro_reg_write(vpu, hevc_pic_width_4x4,
>> +                        (pic_width_in_min_cbs * min_cb_size) / 4);
>> +       hantro_reg_write(vpu, hevc_pic_height_4x4,
>> +                        (pic_height_in_min_cbs * min_cb_size) / 4);
>> +
>> +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_inter);
>> +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
>> +                        sps->max_transform_hierarchy_depth_intra);
>> +       hantro_reg_write(vpu, hevc_min_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_max_trb_size,
>> +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
>> +                        sps->log2_diff_max_min_luma_transform_block_size);
>> +
>> +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
>> +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
>> +       hantro_reg_write(vpu, hevc_strong_smooth_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
>> +       hantro_reg_write(vpu, hevc_asym_pred_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_sao_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
>> +       hantro_reg_write(vpu, hevc_sign_data_hide,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
>> +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
>> +       }
>> +
>> +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
>> +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
>> +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
>> +       hantro_reg_write(vpu, hevc_slice_chqp_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
>> +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
>> +       hantro_reg_write(vpu, hevc_transq_bypass,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
>> +       hantro_reg_write(vpu, hevc_list_mod_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
>> +       hantro_reg_write(vpu, hevc_entropy_sync_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_idr_pic_e,
>> +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
>> +       hantro_reg_write(vpu, hevc_parallel_merge,
>> +                        pps->log2_parallel_merge_level_minus2 + 2);
>> +       hantro_reg_write(vpu, hevc_pcm_filt_d,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
>> +       hantro_reg_write(vpu, hevc_pcm_e,
>> +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
>> +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size,
>> +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size,
>> +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
>> +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
>> +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
>> +       } else {
>> +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
>> +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_start_code_e, 1);
>> +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
>> +       hantro_reg_write(vpu, hevc_weight_pred_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
>> +       hantro_reg_write(vpu, hevc_cabac_init_present,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
>> +       hantro_reg_write(vpu, hevc_const_intra_e,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
>> +       hantro_reg_write(vpu, hevc_transform_skip,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
>> +       hantro_reg_write(vpu, hevc_out_filtering_dis,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
>> +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
>> +       hantro_reg_write(vpu, hevc_dependent_slice,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
>> +       hantro_reg_write(vpu, hevc_filter_override,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
>> +       hantro_reg_write(vpu, hevc_refidx0_active,
>> +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_refidx1_active,
>> +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
>> +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
>> +}
>> +
>> +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
>> +                       return i;
>> +       }
>> +
>> +       return 0x0;
>> +}
>> +
>> +static void set_ref_pic_list(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
>> +       const struct hantro_reg *ref_pic_regs0[] = {
>> +               hevc_rlist_f0,
>> +               hevc_rlist_f1,
>> +               hevc_rlist_f2,
>> +               hevc_rlist_f3,
>> +               hevc_rlist_f4,
>> +               hevc_rlist_f5,
>> +               hevc_rlist_f6,
>> +               hevc_rlist_f7,
>> +               hevc_rlist_f8,
>> +               hevc_rlist_f9,
>> +               hevc_rlist_f10,
>> +               hevc_rlist_f11,
>> +               hevc_rlist_f12,
>> +               hevc_rlist_f13,
>> +               hevc_rlist_f14,
>> +               hevc_rlist_f15,
>> +       };
>> +       const struct hantro_reg *ref_pic_regs1[] = {
>> +               hevc_rlist_b0,
>> +               hevc_rlist_b1,
>> +               hevc_rlist_b2,
>> +               hevc_rlist_b3,
>> +               hevc_rlist_b4,
>> +               hevc_rlist_b5,
>> +               hevc_rlist_b6,
>> +               hevc_rlist_b7,
>> +               hevc_rlist_b8,
>> +               hevc_rlist_b9,
>> +               hevc_rlist_b10,
>> +               hevc_rlist_b11,
>> +               hevc_rlist_b12,
>> +               hevc_rlist_b13,
>> +               hevc_rlist_b14,
>> +               hevc_rlist_b15,
>> +       };
>> +       unsigned int i, j;
>> +
>> +       /* List 0 contains: short term before, short term after and long term */
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
>> +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       /* Fill the list, copying over and over */
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list0))
>> +               list0[j++] = list0[i++];
>> +
>> +       j = 0;
>> +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
>> +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
>> +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
>> +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
>> +
>> +       i = 0;
>> +       while (j < ARRAY_SIZE(list1))
>> +               list1[j++] = list1[i++];
>> +
>> +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
>> +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
>> +       }
>> +}
>> +
>> +static int set_ref(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
>> +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
>> +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
>> +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
>> +       u32 max_ref_frames;
>> +       u16 dpb_longterm_e;
>> +
>> +       const struct hantro_reg *cur_poc[] = {
>> +               hevc_cur_poc_00,
>> +               hevc_cur_poc_01,
>> +               hevc_cur_poc_02,
>> +               hevc_cur_poc_03,
>> +               hevc_cur_poc_04,
>> +               hevc_cur_poc_05,
>> +               hevc_cur_poc_06,
>> +               hevc_cur_poc_07,
>> +               hevc_cur_poc_08,
>> +               hevc_cur_poc_09,
>> +               hevc_cur_poc_10,
>> +               hevc_cur_poc_11,
>> +               hevc_cur_poc_12,
>> +               hevc_cur_poc_13,
>> +               hevc_cur_poc_14,
>> +               hevc_cur_poc_15,
>> +       };
>> +       unsigned int i;
>> +
>> +       max_ref_frames = decode_params->num_poc_lt_curr +
>> +               decode_params->num_poc_st_curr_before +
>> +               decode_params->num_poc_st_curr_after;
>> +       /*
>> +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
>> +        * badly marked I-frames.
>> +        */
>> +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
>> +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
>> +       hantro_reg_write(vpu, hevc_filter_over_slices,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
>> +       hantro_reg_write(vpu, hevc_filter_over_tiles,
>> +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
>> +
>> +       /*
>> +        * Write POC count diff from current pic. For frame decoding only compute
>> +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
>> +        */
>> +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
>> +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
>> +
>> +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
>> +       }
>> +
>> +       if (i < ARRAY_SIZE(cur_poc)) {
>> +               /*
>> +                * After the references, fill one entry pointing to itself,
>> +                * i.e. difference is zero.
>> +                */
>> +               hantro_reg_write(vpu, cur_poc[i], 0);
>> +               i++;
>> +       }
>> +
>> +       /* Fill the rest with the current picture */
>> +       for (; i < ARRAY_SIZE(cur_poc); i++)
>> +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
>> +
>> +       set_ref_pic_list(ctx);
>> +
>> +       /* We will only keep the reference pictures that are still used */
>> +       ctx->hevc_dec.ref_bufs_used = 0;
>> +
>> +       /* Set up addresses of DPB buffers */
>> +       dpb_longterm_e = 0;
>> +       for (i = 0; i < decode_params->num_active_dpb_entries &&
>> +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
>> +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
>> +               if (!luma_addr)
>> +                       return -ENOMEM;
>> +
>> +               chroma_addr = luma_addr + cr_offset;
>> +               mv_addr = luma_addr + mv_offset;
>> +
>> +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
>> +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
>> +
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
>> +       }
>> +
>> +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
>> +       if (!luma_addr)
>> +               return -ENOMEM;
>> +
>> +       chroma_addr = luma_addr + cr_offset;
>> +       mv_addr = luma_addr + mv_offset;
>> +
>> +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
>> +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
>> +
>> +       hantro_hevc_ref_remove_unused(ctx);
>> +
>> +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
>> +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
>> +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
>> +       }
>> +
>> +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
>> +
>> +       return 0;
>> +}
>> +
>> +static void set_buffers(struct hantro_ctx *ctx)
>> +{
>> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       dma_addr_t src_dma, dst_dma;
>> +       u32 src_len, src_buf_len;
>> +
>> +       src_buf = hantro_get_src_buf(ctx);
>> +       dst_buf = hantro_get_dst_buf(ctx);
>> +
>> +       /* Source (stream) buffer. */
>> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
>> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
>> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
>> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
>> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
>> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
>> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
>> +
>> +       /* Destination (decoded frame) buffer. */
>> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
>> +
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
>> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
>> +}
>> +
>> +void hantro_g2_check_idle(struct hantro_dev *vpu)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < 3; i++) {
>> +               u32 status;
>> +
>> +               /* Make sure the VPU is idle */
>> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
>> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> How about we clean up this pr_warn: use either v4l2_warn or dev_warn, and
> make the warning "device still running, aborting" (I personally dislike the
> abort metaphor, but I guess it's OK here).

Ok
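
For reference, here is a rough user-space sketch of what the reworked check
could look like. The register is mocked with a plain variable and
fake_dev_warn() stands in for dev_warn(); both are placeholders made up for
illustration, not the driver's actual helpers. The bit positions are the
ones from hantro_g2_regs.h in this patch.

```c
#include <stdint.h>
#include <stdio.h>

#define BIT(n)			(1u << (n))
#define INT_DEC_ABORT_E		BIT(5)
#define INT_DEC_IRQ_DIS		BIT(4)
#define INT_DEC_E		BIT(0)

/* Hypothetical stand-in for the HEVC_REG_INTERRUPT register. */
static uint32_t fake_irq_reg;
static int warn_count;

/* Hypothetical stand-in for dev_warn(vpu->dev, ...). */
static void fake_dev_warn(const char *msg)
{
	warn_count++;
	fprintf(stderr, "hantro-vpu: %s\n", msg);
}

/* Returns how many abort requests were written to the mocked register. */
static int check_idle(void)
{
	int i, aborts = 0;

	for (i = 0; i < 3; i++) {
		uint32_t status = fake_irq_reg;

		if (status & INT_DEC_E) {
			fake_dev_warn("device still running, aborting");
			status |= INT_DEC_ABORT_E | INT_DEC_IRQ_DIS;
			fake_irq_reg = status;
			aborts++;
		}
	}

	return aborts;
}
```

In this mock nothing ever clears the enable bit, so a busy device triggers
the warning on all three iterations. Whether the hardware drops the bit
after the first abort request is not something the sketch tries to model.
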

>
>> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
>> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
>> +               }
>> +       }
>> +}
>> +
>> +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int ret;
>> +
>> +       hantro_g2_check_idle(vpu);
>> +
>> +       /* Prepare HEVC decoder context. */
>> +       ret = hantro_hevc_dec_prepare_run(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Configure hardware registers. */
>> +       set_params(ctx);
>> +
>> +       /* set reference pictures */
>> +       ret = set_ref(ctx);
>> +       if (ret)
>> +               return ret;
>> +
>> +       set_buffers(ctx);
>> +       prepare_tile_info_buffer(ctx);
>> +
>> +       hantro_end_prepare_run(ctx);
>> +
>> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
>> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
>> +
>> +       /* Don't disable output */
>> +       hantro_reg_write(vpu, hevc_out_dis, 0);
>> +
>> +       /* Don't compress buffers */
>> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
>> +
>> +       /* use NV12 as output format */
>> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
>> +
>> +       /* Bus width and max burst */
>> +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
>> +       hantro_reg_write(vpu, hevc_max_burst, 16);
>> +
>> +       /* Swap */
>> +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
>> +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
>> +
>> +       /* Start decoding! */
>> +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
>> +
>> +       return 0;
>> +}
>> diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
>> new file mode 100644
>> index 000000000000..a361c9ba911d
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
>> @@ -0,0 +1,198 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2021, Collabora
>> + *
>> + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> + */
>> +
>> +#ifndef HANTRO_G2_REGS_H_
>> +#define HANTRO_G2_REGS_H_
>> +
>> +#include "hantro.h"
>> +
>> +#define G2_SWREG(nr)   ((nr) * 4)
>> +
>> +#define HEVC_DEC_REG(name, base, shift, mask) \
>> +       static const struct hantro_reg _hevc_##name[] = { \
>> +               { G2_SWREG(base), (shift), (mask) } \
>> +       }; \
>> +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
>> +
>> +#define HEVC_REG_VERSION               G2_SWREG(0)
>> +
>> +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
>> +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
>> +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
>> +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
>> +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
>> +
>> +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
>> +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
>> +
>> +HEVC_DEC_REG(mode,               3, 27, 0x1f)
>> +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
>> +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
>> +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
>> +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
>> +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
>> +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
>> +
>> +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
>> +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
>> +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
>> +
>> +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
>> +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
>> +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
>> +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
>> +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
>> +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
>> +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
>> +
>> +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
>> +
>> +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
>> +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
>> +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
>> +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
>> +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
>> +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
>> +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
>> +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
>> +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
>> +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
>> +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
>> +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
>> +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
>> +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
>> +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
>> +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
>> +
>> +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
>> +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
>> +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
>> +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
>> +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
>> +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
>> +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
>> +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
>> +
>> +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
>> +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
>> +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
>> +
>> +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
>> +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
>> +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
>> +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
>> +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
>> +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
>> +
>> +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
>> +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
>> +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
>> +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
>> +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
>> +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
>> +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
>> +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
>> +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
>> +
>> +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
>> +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
>> +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
>> +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
>> +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
>> +
>> +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
>> +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
>> +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
>> +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
>> +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
>> +
>> +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
>> +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
>> +
>> +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
>> +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
>> +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
>> +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
>> +
>> +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
>> +
>> +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
>> +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
>> +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
>> +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
>> +
>> +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
>> +
>> +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
>> +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
>> +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
>> +
>> +#define HEVC_REG_CONFIG                                G2_SWREG(58)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
>> +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
>> +
>> +#define HEVC_ADDR_DST          (G2_SWREG(65))
>> +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
>> +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
>> +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
>> +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
>> +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
>> +#define HEVC_ADDR_STR          (G2_SWREG(169))
>> +#define HEVC_SCALING_LIST      (G2_SWREG(171))
>> +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
>> +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
>> +#define HEVC_TILE_FILTER       (G2_SWREG(179))
>> +#define HEVC_TILE_SAO          (G2_SWREG(181))
>> +#define HEVC_TILE_BSD          (G2_SWREG(183))
>> +
>> +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
>> +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
>> +
>> +#endif
>> diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
>> new file mode 100644
>> index 000000000000..8e319a837ff3
>> --- /dev/null
>> +++ b/drivers/staging/media/hantro/hantro_hevc.c
>> @@ -0,0 +1,321 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Hantro VPU HEVC codec driver
>> + *
>> + * Copyright (C) 2020 Safran Passenger Innovations LLC
>> + */
>> +
>> +#include <linux/types.h>
>> +#include <media/v4l2-mem2mem.h>
>> +
>> +#include "hantro.h"
>> +#include "hantro_hw.h"
>> +
>> +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
>> +/*
>> + * BSD control data of current picture at tile border
>> + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
>> + */
>> +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
>> +/* tile border coefficients of filter */
>> +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
>> +
>> +#define MAX_TILE_COLS 20
>> +#define MAX_TILE_ROWS 22
>> +
>> +#define UNUSED_REF     -1
>> +
>> +#define G2_ALIGN               16
>> +#define MC_WORD_SIZE           32
>> +
>> +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
>> +
>> +       return sps->pic_width_in_luma_samples *
>> +               sps->pic_height_in_luma_samples * bytes_per_pixel;
>> +}
>> +
>> +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +
>> +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
>> +}
>> +
>> +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
>> +{
>> +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
>> +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
>> +       size_t mv_size;
>> +
>> +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
>> +                 (1 << (2 * (8 - 4))) * 16) + 32;
>> +
>> +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
>> +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
>> +
>> +       return mv_size;
>> +}
>> +
>> +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
>> +{
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +
>> +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
>> +}
>> +
>> +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       int i;
>> +
>> +       /* Just tag buffer as unused, do not free them */
> This comment seems wrong.

You are right, I will remove it.

>
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs[i].cpu) {
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> Is this memset clearing the buffer required? If we're getting artifacts
> from previous decodes, then that would be more of a bug somewhere.

The buffer is already cleared after it is allocated or reused, so I can remove this memset.

>
>> +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
>> +                                         hevc_dec->ref_bufs[i].cpu,
>> +                                         hevc_dec->ref_bufs[i].dma);
>> +               }
>> +       }
>> +}
>> +
>> +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++)
>> +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
>> +}
>> +
>> +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
>> +                                  int poc)
>> +{
>> +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> +       int i;
>> +
>> +       /* Find the reference buffer in already know ones */
>> +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == poc) {
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       return hevc_dec->ref_bufs[i].dma;
>> +               }
>> +       }
>> +
>> +       /* Allocate a new reference buffer */
>> +       for (i = 0; i < NUM_REF_PICTURES; i++) {
>> +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
>> +                       if (!hevc_dec->ref_bufs[i].cpu) {
>> +                               struct hantro_dev *vpu = ctx->dev;
>> +
>> +                               hevc_dec->ref_bufs[i].cpu =
>> +                                       dma_alloc_coherent(vpu->dev,
>> +                                                          hantro_hevc_ref_size(ctx),
>> +                                                          &hevc_dec->ref_bufs[i].dma,
>> +                                                          GFP_KERNEL);
> Is there any reason why we need to allocate reference buffers and MV contiguously?

It is done like that in the i.MX reference code, and it makes the management of
reference frames and MVs simpler.
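
For reference, here is a minimal userspace sketch of the contiguous layout that
hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and
hantro_hevc_mv_size() from this patch compute. It only mirrors the arithmetic
quoted above. The names ref_layout, compute_ref_layout and ALIGN_UP are made up
for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same constants as in hantro_hevc.c from this patch. */
#define G2_ALIGN      16
#define MC_WORD_SIZE  32

/* Userspace stand-in for the kernel ALIGN() macro; 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical helper type: offsets and sizes inside one reference buffer. */
struct ref_layout {
	size_t chroma_offset; /* start of the chroma plane */
	size_t mv_offset;     /* start of the motion vector area */
	size_t mv_size;       /* size of the motion vector area */
	size_t total_size;    /* size of the whole contiguous allocation */
};

static struct ref_layout compute_ref_layout(uint32_t width, uint32_t height,
					    unsigned int bit_depth_luma_minus8)
{
	struct ref_layout l;
	size_t bytes_per_pixel = bit_depth_luma_minus8 == 0 ? 1 : 2;
	/* Picture size in 256x256 blocks, rounded up (the patch shifts by 8). */
	size_t blocks_w = ((size_t)width + (1u << 8) - 1) >> 8;
	size_t blocks_h = ((size_t)height + (1u << 8) - 1) >> 8;

	assert(width > 0 && height > 0);

	/* Luma plane comes first, so chroma starts right after it. */
	l.chroma_offset = (size_t)width * height * bytes_per_pixel;
	/* Luma plus chroma is 3/2 of the luma size; MVs follow, aligned and padded. */
	l.mv_offset = ALIGN_UP((l.chroma_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
	/* 16 bytes for each of the 256 sub-blocks of a 256x256 block, plus 32. */
	l.mv_size = blocks_w * blocks_h * ((size_t)1 << (2 * (8 - 4))) * 16 + 32;
	l.total_size = l.mv_offset + l.mv_size;

	return l;
}
```

With a 1920x1080 8-bit stream, compute_ref_layout(1920, 1080, 0) puts chroma at
byte 2073600 and the MV area at byte 3110432, so a single allocation covers
picture and MV data for each reference.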

>
>> +                               if (!hevc_dec->ref_bufs[i].cpu)
>> +                                       return 0;
>> +
>> +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
>> +                       }
>> +                       hevc_dec->ref_bufs_used |= 1 << i;
>> +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
>
> I believe the coherent allocation is to be able to clear each reference, but is this
> really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
>
> Also, if that's the case, then allocating the MV buffer separately would let
> us avoid allocating the reference buffers coherently (note that we use NO_MAPPING
> in the vb2_queue, so the vb2_buffers shouldn't be coherent).

Those sound like good possible optimizations, but I'm not at that stage yet.
I would rather keep the driver in this fairly functional state and improve it later.
I think the patches are already large and complex enough as they are.

Benjamin

>
> Thanks,
> Ezequiel
>
>


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
  2021-03-16 20:19       ` Benjamin Gaignard
  (?)
@ 2021-03-16 20:35         ` Ezequiel Garcia
  -1 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-16 20:35 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Tue, 2021-03-16 at 21:19 +0100, Benjamin Gaignard wrote:
> 
> Le 16/03/2021 à 19:46, Ezequiel Garcia a écrit :
> > Hi Benjamin,
> > 
> > The series is looking really good. Some comments below.
> > 
> > On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> > > Implement all the logic to get G2 hardware decoding HEVC frames.
> > > It supports HEVC streams up to level 5.1.
> > > It doesn't yet support 10-bit formats or the scaling feature.
> > > 
> > > Add a HANTRO HEVC dedicated control to skip some bits at the beginning
> > > of the slice header. That is very specific to this hardware so it can't
> > > go into the uapi structures. Computing the needed value is complex and requires
> > > information from the stream that only userland knows, so let it
> > > provide the correct value to the driver.
> > > 
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > version 4:
> > > - fix Ezequiel comments
> > > - use dedicated control as an integer
> > > - change hantro_g2_hevc_dec_run prototype to return errors
> > > 
> > > version 2:
> > > - squash multiple commits in this one.
> > > - fix the comments done by Ezequiel about dma_alloc_coherent usage
> > > - fix Dan's comments about control copy, reverse the test logic
> > > in tile_buffer_reallocate, rework some goto and return cases.
> > > 
> > >   drivers/staging/media/hantro/Makefile         |   2 +
> > >   drivers/staging/media/hantro/hantro.h         |  18 +
> > >   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
> > >   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
> > >   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
> > >   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
> > >   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
> > >   7 files changed, 1228 insertions(+)
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
> > >   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> > > 
> > > diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> > > index 743ce08eb184..0357f1772267 100644
> > > --- a/drivers/staging/media/hantro/Makefile
> > > +++ b/drivers/staging/media/hantro/Makefile
> > > @@ -9,12 +9,14 @@ hantro-vpu-y += \
> > >                  hantro_h1_jpeg_enc.o \
> > >                  hantro_g1_h264_dec.o \
> > >                  hantro_g1_mpeg2_dec.o \
> > > +               hantro_g2_hevc_dec.o \
> > >                  hantro_g1_vp8_dec.o \
> > >                  rk3399_vpu_hw_jpeg_enc.o \
> > >                  rk3399_vpu_hw_mpeg2_dec.o \
> > >                  rk3399_vpu_hw_vp8_dec.o \
> > >                  hantro_jpeg.o \
> > >                  hantro_h264.o \
> > > +               hantro_hevc.o \
> > >                  hantro_mpeg2.o \
> > >                  hantro_vp8.o
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> > > index 05876e426419..a9b80b2c9124 100644
> > > --- a/drivers/staging/media/hantro/hantro.h
> > > +++ b/drivers/staging/media/hantro/hantro.h
> > > @@ -225,6 +225,7 @@ struct hantro_dev {
> > >    * @jpeg_enc:          JPEG-encoding context.
> > >    * @mpeg2_dec:         MPEG-2-decoding context.
> > >    * @vp8_dec:           VP8-decoding context.
> > > + * @hevc_dec:          HEVC-decoding context.
> > >    */
> > >   struct hantro_ctx {
> > >          struct hantro_dev *dev;
> > > @@ -251,6 +252,7 @@ struct hantro_ctx {
> > >                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
> > >                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
> > >                  struct hantro_vp8_dec_hw_ctx vp8_dec;
> > > +               struct hantro_hevc_dec_hw_ctx hevc_dec;
> > >          };
> > >   };
> > >   
> > > @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > >          return vb2_dma_contig_plane_dma_addr(vb, 0);
> > >   }
> > >   
> > > +static inline size_t
> > > +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].size;
> > > +       return vb2_plane_size(vb, 0);
> > > +}
> > > +
> > > +static inline void *
> > > +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].cpu;
> > > +       return vb2_plane_vaddr(vb, 0);
> > > +}
> > > +
> > Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?
> 
> You are right, I will remove them.
> 
> > 
> > >   void hantro_postproc_disable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_enable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_free(struct hantro_ctx *ctx);
> > > diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> > > index e3e6df28f470..bc90a52f4d3d 100644
> > > --- a/drivers/staging/media/hantro/hantro_drv.c
> > > +++ b/drivers/staging/media/hantro/hantro_drv.c
> > > @@ -30,6 +30,13 @@
> > >   
> > >   #define DRIVER_NAME "hantro-vpu"
> > >   
> > > +/*
> > > + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> > > + * the number of data (in bits) to skip in the
> > > + * slice segment header syntax after 'slice type' token
> > > + */
> > I think we need to document this better, so applications can
> > correctly use the control. From i.MX reference code, it seems
> > this needs to be used as follows:
> > 
> > If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> > to before syntax element "slice_temporal_mvp_enabled_flag".
> > 
> > If IDR, the skipped bits are just "pic_output_flag"
> > (separate_colour_plane_flag is not supported).
> > 
> > And it seems this needs to be passed parsing only the first slice,
> > given this syntax remains invariant across all the slices.
> 
> Ok I will add your description in the next version.
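
The rule described above could be sketched in userspace roughly as follows. This
is only an illustration under stated assumptions: struct slice_hdr_marks and
hevc_slice_header_skip_bits are hypothetical names, the bit positions are assumed
to be recorded by the application's own slice header parser, and the IDR case
assumes pic_output_flag is only present when the PPS sets
output_flag_present_flag. How emulation prevention bytes should be counted is not
covered here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record filled in by a userspace slice segment header parser. */
struct slice_hdr_marks {
	bool is_idr;                      /* slice belongs to an IDR picture */
	bool output_flag_present;         /* PPS output_flag_present_flag */
	unsigned int pos_pic_output_flag; /* bit position right after slice_type */
	unsigned int pos_temporal_mvp;    /* bit position of (or where the parser
					   * would read) slice_temporal_mvp_enabled_flag */
};

/* Number of bits to hand to the slice header skip control. */
static unsigned int hevc_slice_header_skip_bits(const struct slice_hdr_marks *m)
{
	/* IDR: only pic_output_flag sits in the skipped range, if present at all. */
	if (m->is_idr)
		return m->output_flag_present ? 1 : 0;

	/* Non-IDR: everything from pic_output_flag up to the temporal MVP flag. */
	assert(m->pos_temporal_mvp >= m->pos_pic_output_flag);
	return m->pos_temporal_mvp - m->pos_pic_output_flag;
}
```

Since this syntax is said to be the same for every slice of a picture, the value
would only need to be computed while parsing the first slice.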
> 
> > 
> > > +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> > > +
> > >   int hantro_debug;
> > >   module_param_named(debug, hantro_debug, int, 0644);
> > >   MODULE_PARM_DESC(debug,
> > > @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
> > >          return 0;
> > >   }
> > >   
> > > +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> > > +{
> > > +       struct hantro_ctx *ctx;
> > > +
> > > +       ctx = container_of(ctrl->handler,
> > > +                          struct hantro_ctx, ctrl_handler);
> > > +
> > > +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> > > +
> > > +       switch (ctrl->id) {
> > > +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> > > +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> > > +               break;
> > > +       default:
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > >   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> > >          .try_ctrl = hantro_try_ctrl,
> > >   };
> > > @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
> > >          .s_ctrl = hantro_jpeg_s_ctrl,
> > >   };
> > >   
> > > +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> > > +       .s_ctrl = hantro_hevc_s_ctrl,
> > > +};
> > > +
> > >   static const struct hantro_ctrl controls[] = {
> > >          {
> > >                  .codec = HANTRO_JPEG_ENCODER,
> > > @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
> > >                  .cfg = {
> > >                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
> > >                  },
> > > +       }, {
> > > +               .codec = HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> > > +                       .name = "Hantro HEVC slice header skip bytes",
> > > +                       .type = V4L2_CTRL_TYPE_INTEGER,
> > > +                       .min = 0,
> > > +                       .def = 0,
> > > +                       .max = 0x7fffffff,
> > > +                       .step = 1,
> > > +                       .ops = &hantro_hevc_ctrl_ops,
> > > +               },
> > > +       }, {
> > > +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> > > +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> > > +                        HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_USER_CLASS,
> > > This shouldn't be here. Is V4L2_CID_USER_CLASS required by v4l2-compliance
> > > or by the spec?
> 
> It is required by v4l2-compliance.
> 

Unless Hans says otherwise, I'd say drop this V4L2_CID_USER_CLASS control,
and we can figure out what's wrong with v4l2-compliance later.

> > 
> > > +                       .name = "HANTRO controls",
> > > +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> > > +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> > > +               },
> > >          },
> > >   };
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > new file mode 100644
> > > index 000000000000..5d75b36bc40c
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > @@ -0,0 +1,587 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include "hantro_hw.h"
> > > +#include "hantro_g2_regs.h"
> > > +
> > > +#define HEVC_DEC_MODE  0xC
> > > +
> > > +#define BUS_WIDTH_32           0
> > > +#define BUS_WIDTH_64           1
> > > +#define BUS_WIDTH_128          2
> > > +#define BUS_WIDTH_256          3
> > > +
> > > +static inline void hantro_write_addr(struct hantro_dev *vpu,
> > > +                                    unsigned long offset,
> > > +                                    dma_addr_t addr)
> > > +{
> > > +       vdpu_write(vpu, addr & 0xffffffff, offset);
> > > +}
> > > +
> > > +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> > > +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> > > +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> > > +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> > > +       unsigned int max_log2_ctb_size, ctb_size;
> > > +       bool tiles_enabled, uniform_spacing;
> > > +       u32 no_chroma = 0;
> > > +
> > > +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> > > +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> > > +
> > > +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> > > +                           sps->log2_diff_max_min_luma_coding_block_size;
> > > +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> > > +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> > > +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> > > +                            >> max_log2_ctb_size;
> > > +       ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> > > +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> > > +
> > > +       if (tiles_enabled) {
> > > +               unsigned int i, j, h;
> > > +
> > > +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> > > +
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> > > +
> > > +               /* write width + height for each tile in pic */
> > > +               if (!uniform_spacing) {
> > > +                       u32 tmp_w = 0, tmp_h = 0;
> > > +
> > > +                       for (i = 0; i < num_tile_rows; i++) {
> > > +                               if (i == num_tile_rows - 1)
> > > +                                       h = pic_height_in_ctbs - tmp_h;
> > > +                               else
> > > +                                       h = pps->row_height_minus1[i] + 1;
> > > +                               tmp_h += h;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> > > +                                       tmp_w += pps->column_width_minus1[j] + 1;
> > > +                                       *p++ = pps->column_width_minus1[j + 1];
> > > +                                       *p++ = h;
> > > +                                       if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                               }
> > > +                               /* last column */
> > > +                               *p++ = pic_width_in_ctbs - tmp_w;
> > > +                               *p++ = h;
> > > +                       }
> > > +               } else { /* uniform spacing */
> > > +                       u32 tmp, prev_h, prev_w;
> > > +
> > > +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> > > +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> > > +                               h = tmp - prev_h;
> > > +                               prev_h = tmp;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> > > +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> > > +                                       *p++ = tmp - prev_w;
> > > +                                       *p++ = h;
> > > +                                       if (j == 0 &&
> > > +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> > > +                                           ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                                       prev_w = tmp;
> > > +                               }
> > > +                       }
> > > +               }
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> > > +
> > > +               /* There's one tile, with dimensions equal to pic size. */
> > > +               p[0] = pic_width_in_ctbs;
> > > +               p[1] = pic_height_in_ctbs;
> > > +       }
> > > +
> > > +       if (no_chroma)
> > > +               vpu_debug(1, "%s: no chroma!\n", __func__);
> > > +}
> > > +
> > > +static void set_params(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> > > +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> > > +       u32 pic_width_aligned, pic_height_aligned;
> > > +       u32 partial_ctb_x, partial_ctb_y;
> > > +
> > > +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> > > +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> > > +
> > > +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> > > +
> > > +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> > > +
> > > +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> > > +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> > > +
> > > +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> > > +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> > > +
> > > +       min_cb_size = 1 << min_log2_cb_size;
> > > +       max_ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> > > +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> > > +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> > > +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> > > +
> > > +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> > > +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> > > +
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> > > +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> > > +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> > > +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> > > +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> > > +
> > > +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_inter);
> > > +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_intra);
> > > +       hantro_reg_write(vpu, hevc_min_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_max_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> > > +                        sps->log2_diff_max_min_luma_transform_block_size);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> > > +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> > > +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_asym_pred_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_sao_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> > > +       hantro_reg_write(vpu, hevc_sign_data_hide,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> > > +       }
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> > > +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> > > +       hantro_reg_write(vpu, hevc_transq_bypass,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_list_mod_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_idr_pic_e,
> > > +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> > > +       hantro_reg_write(vpu, hevc_parallel_merge,
> > > +                        pps->log2_parallel_merge_level_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> > > +       hantro_reg_write(vpu, hevc_pcm_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> > > +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size,
> > > +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size,
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> > > +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> > > +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> > > +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> > > +       hantro_reg_write(vpu, hevc_weight_pred_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_const_intra_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> > > +       hantro_reg_write(vpu, hevc_transform_skip,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> > > +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_dependent_slice,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> > > +       hantro_reg_write(vpu, hevc_filter_override,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_refidx0_active,
> > > +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_refidx1_active,
> > > +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> > > +}
> > > +
> > > +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> > > +                       return i;
> > > +       }
> > > +
> > > +       return 0x0;
> > > +}
> > > +
> > > +static void set_ref_pic_list(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       const struct hantro_reg *ref_pic_regs0[] = {
> > > +               hevc_rlist_f0,
> > > +               hevc_rlist_f1,
> > > +               hevc_rlist_f2,
> > > +               hevc_rlist_f3,
> > > +               hevc_rlist_f4,
> > > +               hevc_rlist_f5,
> > > +               hevc_rlist_f6,
> > > +               hevc_rlist_f7,
> > > +               hevc_rlist_f8,
> > > +               hevc_rlist_f9,
> > > +               hevc_rlist_f10,
> > > +               hevc_rlist_f11,
> > > +               hevc_rlist_f12,
> > > +               hevc_rlist_f13,
> > > +               hevc_rlist_f14,
> > > +               hevc_rlist_f15,
> > > +       };
> > > +       const struct hantro_reg *ref_pic_regs1[] = {
> > > +               hevc_rlist_b0,
> > > +               hevc_rlist_b1,
> > > +               hevc_rlist_b2,
> > > +               hevc_rlist_b3,
> > > +               hevc_rlist_b4,
> > > +               hevc_rlist_b5,
> > > +               hevc_rlist_b6,
> > > +               hevc_rlist_b7,
> > > +               hevc_rlist_b8,
> > > +               hevc_rlist_b9,
> > > +               hevc_rlist_b10,
> > > +               hevc_rlist_b11,
> > > +               hevc_rlist_b12,
> > > +               hevc_rlist_b13,
> > > +               hevc_rlist_b14,
> > > +               hevc_rlist_b15,
> > > +       };
> > > +       unsigned int i, j;
> > > +
> > > +       /* List 0 contains: short term before, short term after and long term */
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       /* Fill the list, copying over and over */
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list0))
> > > +               list0[j++] = list0[i++];
> > > +
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list1))
> > > +               list1[j++] = list1[i++];
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> > > +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> > > +       }
> > > +}
> > > +
> > > +static int set_ref(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> > > +       u32 max_ref_frames;
> > > +       u16 dpb_longterm_e;
> > > +
> > > +       const struct hantro_reg *cur_poc[] = {
> > > +               hevc_cur_poc_00,
> > > +               hevc_cur_poc_01,
> > > +               hevc_cur_poc_02,
> > > +               hevc_cur_poc_03,
> > > +               hevc_cur_poc_04,
> > > +               hevc_cur_poc_05,
> > > +               hevc_cur_poc_06,
> > > +               hevc_cur_poc_07,
> > > +               hevc_cur_poc_08,
> > > +               hevc_cur_poc_09,
> > > +               hevc_cur_poc_10,
> > > +               hevc_cur_poc_11,
> > > +               hevc_cur_poc_12,
> > > +               hevc_cur_poc_13,
> > > +               hevc_cur_poc_14,
> > > +               hevc_cur_poc_15,
> > > +       };
> > > +       unsigned int i;
> > > +
> > > +       max_ref_frames = decode_params->num_poc_lt_curr +
> > > +               decode_params->num_poc_st_curr_before +
> > > +               decode_params->num_poc_st_curr_after;
> > > +       /*
> > > +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> > > +        * badly marked I-frames.
> > > +        */
> > > +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> > > +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> > > +       hantro_reg_write(vpu, hevc_filter_over_slices,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> > > +
> > > +       /*
> > > +        * Write POC count diff from current pic. For frame decoding only compute
> > > +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> > > +        */
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> > > +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> > > +
> > > +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> > > +       }
> > > +
> > > +       if (i < ARRAY_SIZE(cur_poc)) {
> > > +               /*
> > > +                * After the references, fill one entry pointing to itself,
> > > +                * i.e. difference is zero.
> > > +                */
> > > +               hantro_reg_write(vpu, cur_poc[i], 0);
> > > +               i++;
> > > +       }
> > > +
> > > +       /* Fill the rest with the current picture */
> > > +       for (; i < ARRAY_SIZE(cur_poc); i++)
> > > +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> > > +
> > > +       set_ref_pic_list(ctx);
> > > +
> > > +       /* We will only keep the reference pictures that are still used */
> > > +       ctx->hevc_dec.ref_bufs_used = 0;
> > > +
> > > +       /* Set up addresses of DPB buffers */
> > > +       dpb_longterm_e = 0;
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> > > +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> > > +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> > > +               if (!luma_addr)
> > > +                       return -ENOMEM;
> > > +
> > > +               chroma_addr = luma_addr + cr_offset;
> > > +               mv_addr = luma_addr + mv_offset;
> > > +
> > > +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> > > +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> > > +
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> > > +       }
> > > +
> > > +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> > > +       if (!luma_addr)
> > > +               return -ENOMEM;
> > > +
> > > +       chroma_addr = luma_addr + cr_offset;
> > > +       mv_addr = luma_addr + mv_offset;
> > > +
> > > +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> > > +
> > > +       hantro_hevc_ref_remove_unused(ctx);
> > > +
> > > +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void set_buffers(struct hantro_ctx *ctx)
> > > +{
> > > +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       dma_addr_t src_dma, dst_dma;
> > > +       u32 src_len, src_buf_len;
> > > +
> > > +       src_buf = hantro_get_src_buf(ctx);
> > > +       dst_buf = hantro_get_dst_buf(ctx);
> > > +
> > > +       /* Source (stream) buffer. */
> > > +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> > > +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> > > +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> > > +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> > > +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> > > +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> > > +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> > > +
> > > +       /* Destination (decoded frame) buffer. */
> > > +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> > > +}
> > > +
> > > +void hantro_g2_check_idle(struct hantro_dev *vpu)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < 3; i++) {
> > > +               u32 status;
> > > +
> > > +               /* Make sure the VPU is idle */
> > > +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> > > +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> > > +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> > How about we clean this pr_warn: use either v4l2_warn or dev_warn and make
> > the warning "device still running, aborting" (I personally dislike the abort
> > metaphor, but guess it's OK here).
> 
> Ok
> 
> > 
> > > +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> > > +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int ret;
> > > +
> > > +       hantro_g2_check_idle(vpu);
> > > +
> > > +       /* Prepare HEVC decoder context. */
> > > +       ret = hantro_hevc_dec_prepare_run(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       /* Configure hardware registers. */
> > > +       set_params(ctx);
> > > +
> > > +       /* set reference pictures */
> > > +       ret = set_ref(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       set_buffers(ctx);
> > > +       prepare_tile_info_buffer(ctx);
> > > +
> > > +       hantro_end_prepare_run(ctx);
> > > +
> > > +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> > > +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> > > +
> > > +       /* Don't disable output */
> > > +       hantro_reg_write(vpu, hevc_out_dis, 0);
> > > +
> > > +       /* Don't compress buffers */
> > > +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> > > +
> > > +       /* use NV12 as output format */
> > > +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> > > +
> > > +       /* Bus width and max burst */
> > > +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> > > +       hantro_reg_write(vpu, hevc_max_burst, 16);
> > > +
> > > +       /* Swap */
> > > +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> > > +
> > > +       /* Start decoding! */
> > > +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> > > +
> > > +       return 0;
> > > +}
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > new file mode 100644
> > > index 000000000000..a361c9ba911d
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > @@ -0,0 +1,198 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * Copyright (c) 2021, Collabora
> > > + *
> > > + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > + */
> > > +
> > > +#ifndef HANTRO_G2_REGS_H_
> > > +#define HANTRO_G2_REGS_H_
> > > +
> > > +#include "hantro.h"
> > > +
> > > +#define G2_SWREG(nr)   ((nr) * 4)
> > > +
> > > +#define HEVC_DEC_REG(name, base, shift, mask) \
> > > +       static const struct hantro_reg _hevc_##name[] = { \
> > > +               { G2_SWREG(base), (shift), (mask) } \
> > > +       }; \
> > > +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
> > > +
> > > +#define HEVC_REG_VERSION               G2_SWREG(0)
> > > +
> > > +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> > > +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> > > +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> > > +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> > > +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> > > +
> > > +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> > > +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> > > +
> > > +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> > > +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> > > +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> > > +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> > > +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> > > +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> > > +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> > > +
> > > +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> > > +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> > > +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> > > +
> > > +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> > > +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> > > +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> > > +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> > > +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> > > +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> > > +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> > > +
> > > +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> > > +
> > > +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> > > +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> > > +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> > > +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> > > +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> > > +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> > > +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> > > +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> > > +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> > > +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> > > +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> > > +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> > > +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> > > +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> > > +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> > > +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> > > +
> > > +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> > > +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> > > +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> > > +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> > > +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> > > +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> > > +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> > > +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> > > +
> > > +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> > > +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> > > +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> > > +
> > > +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> > > +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> > > +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> > > +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> > > +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> > > +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> > > +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> > > +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> > > +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> > > +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> > > +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> > > +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> > > +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> > > +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> > > +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> > > +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> > > +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> > > +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> > > +
> > > +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
> > > +
> > > +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> > > +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> > > +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> > > +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> > > +
> > > +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> > > +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> > > +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
> > > +
> > > +#define HEVC_REG_CONFIG                                G2_SWREG(58)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> > > +
> > > +#define HEVC_ADDR_DST          (G2_SWREG(65))
> > > +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> > > +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> > > +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> > > +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> > > +#define HEVC_ADDR_STR          (G2_SWREG(169))
> > > +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> > > +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> > > +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> > > +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> > > +#define HEVC_TILE_SAO          (G2_SWREG(181))
> > > +#define HEVC_TILE_BSD          (G2_SWREG(183))
> > > +
> > > +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> > > +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
> > > +
> > > +#endif
> > > diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> > > new file mode 100644
> > > index 000000000000..8e319a837ff3
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_hevc.c
> > > @@ -0,0 +1,321 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include <linux/types.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +
> > > +#include "hantro.h"
> > > +#include "hantro_hw.h"
> > > +
> > > +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> > > +/*
> > > + * BSD control data of current picture at tile border
> > > + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> > > + */
> > > +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> > > +/* tile border coefficients of filter */
> > > +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> > > +
> > > +#define MAX_TILE_COLS 20
> > > +#define MAX_TILE_ROWS 22
> > > +
> > > +#define UNUSED_REF     -1
> > > +
> > > +#define G2_ALIGN               16
> > > +#define MC_WORD_SIZE           32
> > > +
> > > +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> > > +
> > > +       return sps->pic_width_in_luma_samples *
> > > +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> > > +}
> > > +
> > > +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +
> > > +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> > > +}
> > > +
> > > +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> > > +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> > > +       size_t mv_size;
> > > +
> > > +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> > > +                 (1 << (2 * (8 - 4))) * 16) + 32;
> > > +
> > > +       vpu_debug(4, "%ux%u (CTBs) %zu MV bytes\n",
> > > +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> > > +
> > > +       return mv_size;
> > > +}
> > > +
> > > +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +
> > > +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> > > +}
> > > +
> > > +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int i;
> > > +
> > > +       /* Just tag buffer as unused, do not free them */
> > This comment seems wrong.
> 
> You are right, I will remove it.
> 
> > 
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs[i].cpu) {
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > Is this memset clearing the buffer required? If we're getting artifacts
> > from previous decodes, then that would be more of a bug somewhere.
> 
> The clear is done after allocating/reusing the buffer, so I can remove this one.
> 
> > 
> > > +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> > > +                                         hevc_dec->ref_bufs[i].cpu,
> > > +                                         hevc_dec->ref_bufs[i].dma);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> > > +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> > > +}
> > > +
> > > +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> > > +                                  int poc)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       /* Find the reference buffer among the already known ones */
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       return hevc_dec->ref_bufs[i].dma;
> > > +               }
> > > +       }
> > > +
> > > +       /* Allocate a new reference buffer */
> > > +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> > > +                       if (!hevc_dec->ref_bufs[i].cpu) {
> > > +                               struct hantro_dev *vpu = ctx->dev;
> > > +
> > > +                               hevc_dec->ref_bufs[i].cpu =
> > > +                                       dma_alloc_coherent(vpu->dev,
> > > +                                                          hantro_hevc_ref_size(ctx),
> > > +                                                          &hevc_dec->ref_bufs[i].dma,
> > > +                                                          GFP_KERNEL);
> > Is there any reason why we need to allocate reference buffers and MV contiguously?
> 
> It is done like that in the IMX reference code, and it makes managing the
> reference frames and MVs simpler.
> 
> > 
> > > +                               if (!hevc_dec->ref_bufs[i].cpu)
> > > +                                       return 0;
> > > +
> > > +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> > > +                       }
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > 
> > I believe the coherent allocation is to be able to clear each reference, but is this
> > really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
> > 
> > Also, if that's the case, then allocating the MV buffer separately would let us
> > avoid allocating the reference buffers coherently (note that we use NO_MAPPING
> > in the vb2_queue, so the vb2_buffers shouldn't be coherent).
> 
> Those sound like good possible optimizations, but I'm not at that stage yet.
> I would rather keep it in this fairly functional state and improve it later.
> I think the patches are already large and complex enough as they are.
> 

Fair enough. I think it's great to have a first working
version :)

Could you add a comment for this, especially at the
memsets and the dma_alloc_coherent (or optionally
at the header of this .c file), in case someone
wants to revisit this topic?

Thanks a lot!
Ezequiel 



* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 20:35         ` Ezequiel Garcia
From: Ezequiel Garcia @ 2021-03-16 20:35 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Tue, 2021-03-16 at 21:19 +0100, Benjamin Gaignard wrote:
> 
> Le 16/03/2021 à 19:46, Ezequiel Garcia a écrit :
> > Hi Benjamin,
> > 
> > The series is looking really good. Some comments below.
> > 
> > On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> > > Implement all the logic to get G2 hardware decoding HEVC frames.
> > > > It supports HEVC streams up to level 5.1.
> > > > It doesn't yet support 10-bit formats or the scaling feature.
> > > 
> > > Add HANTRO HEVC dedicated control to skip some bits at the beginning
> > > of the slice header. That is very specific to this hardware so can't
> > > > go into uapi structures. Computing the needed value is complex and requires
> > > > information from the stream that only userland knows, so let it
> > > > provide the correct value to the driver.
> > > 
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > version 4:
> > > - fix Ezequiel comments
> > > - use dedicated control as an integer
> > > - change hantro_g2_hevc_dec_run prototype to return errors
> > > 
> > > version 2:
> > > - squash multiple commits in this one.
> > > - fix the comments done by Ezequiel about dma_alloc_coherent usage
> > > - fix Dan's comments about control copy, reverse the test logic
> > > in tile_buffer_reallocate, rework some goto and return cases.
> > > 
> > >   drivers/staging/media/hantro/Makefile         |   2 +
> > >   drivers/staging/media/hantro/hantro.h         |  18 +
> > >   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
> > >   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
> > >   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
> > >   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
> > >   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
> > >   7 files changed, 1228 insertions(+)
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
> > >   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> > > 
> > > diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> > > index 743ce08eb184..0357f1772267 100644
> > > --- a/drivers/staging/media/hantro/Makefile
> > > +++ b/drivers/staging/media/hantro/Makefile
> > > @@ -9,12 +9,14 @@ hantro-vpu-y += \
> > >                  hantro_h1_jpeg_enc.o \
> > >                  hantro_g1_h264_dec.o \
> > >                  hantro_g1_mpeg2_dec.o \
> > > +               hantro_g2_hevc_dec.o \
> > >                  hantro_g1_vp8_dec.o \
> > >                  rk3399_vpu_hw_jpeg_enc.o \
> > >                  rk3399_vpu_hw_mpeg2_dec.o \
> > >                  rk3399_vpu_hw_vp8_dec.o \
> > >                  hantro_jpeg.o \
> > >                  hantro_h264.o \
> > > +               hantro_hevc.o \
> > >                  hantro_mpeg2.o \
> > >                  hantro_vp8.o
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> > > index 05876e426419..a9b80b2c9124 100644
> > > --- a/drivers/staging/media/hantro/hantro.h
> > > +++ b/drivers/staging/media/hantro/hantro.h
> > > @@ -225,6 +225,7 @@ struct hantro_dev {
> > >    * @jpeg_enc:          JPEG-encoding context.
> > >    * @mpeg2_dec:         MPEG-2-decoding context.
> > >    * @vp8_dec:           VP8-decoding context.
> > > + * @hevc_dec:          HEVC-decoding context.
> > >    */
> > >   struct hantro_ctx {
> > >          struct hantro_dev *dev;
> > > @@ -251,6 +252,7 @@ struct hantro_ctx {
> > >                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
> > >                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
> > >                  struct hantro_vp8_dec_hw_ctx vp8_dec;
> > > +               struct hantro_hevc_dec_hw_ctx hevc_dec;
> > >          };
> > >   };
> > >   
> > > @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > >          return vb2_dma_contig_plane_dma_addr(vb, 0);
> > >   }
> > >   
> > > +static inline size_t
> > > +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].size;
> > > +       return vb2_plane_size(vb, 0);
> > > +}
> > > +
> > > +static inline void *
> > > +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].cpu;
> > > +       return vb2_plane_vaddr(vb, 0);
> > > +}
> > > +
> > Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?
> 
> You are right, I will remove them.
> 
> > 
> > >   void hantro_postproc_disable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_enable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_free(struct hantro_ctx *ctx);
> > > diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> > > index e3e6df28f470..bc90a52f4d3d 100644
> > > --- a/drivers/staging/media/hantro/hantro_drv.c
> > > +++ b/drivers/staging/media/hantro/hantro_drv.c
> > > @@ -30,6 +30,13 @@
> > >   
> > >   #define DRIVER_NAME "hantro-vpu"
> > >   
> > > +/*
> > > + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> > > + * the number of data (in bits) to skip in the
> > > + * slice segment header syntax after 'slice type' token
> > > + */
> > I think we need to document this better, so applications can
> > correctly use the control. From i.MX reference code, it seems
> > this needs to be used as follows:
> > 
> > If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> > to before syntax element "slice_temporal_mvp_enabled_flag".
> > 
> > If IDR, the skipped bits are just "pic_output_flag"
> > (separate_colour_plane_flag is not supported).
> > 
> > > And it seems this only needs to be computed once, by parsing the first slice,
> > > given this syntax remains invariant across all the slices.
> 
> Ok I will add your description in the next version.
> 
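
To make the description above concrete, here is a minimal userspace-side
sketch of the arithmetic, assuming the application's slice header parser
records a few bit offsets while it walks the first slice segment header.
The struct, its field names and the helper are hypothetical; they are not
part of the patch or of the V4L2 uAPI.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical bookkeeping kept by a userspace slice header parser.
 * All values are bit offsets from the start of the slice segment header.
 */
struct slice_hdr_bit_pos {
	/* where pic_output_flag starts (or would start if absent) */
	unsigned int pic_output_flag;
	/* first bit after pic_output_flag */
	unsigned int after_pic_output_flag;
	/* where slice_temporal_mvp_enabled_flag starts (non-IDR only) */
	unsigned int temporal_mvp_flag;
};

/*
 * Number of bits to skip, following the rule quoted above:
 * - IDR:     only pic_output_flag is skipped
 * - non-IDR: everything from pic_output_flag up to, but not including,
 *            slice_temporal_mvp_enabled_flag
 */
static unsigned int
hevc_slice_header_skip_bits(const struct slice_hdr_bit_pos *pos, bool is_idr)
{
	unsigned int end = is_idr ? pos->after_pic_output_flag
				  : pos->temporal_mvp_flag;

	assert(end >= pos->pic_output_flag);
	return end - pos->pic_output_flag;
}
```

The result would then be set through V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP
before queuing the frame.
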
> > 
> > > +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> > > +
> > >   int hantro_debug;
> > >   module_param_named(debug, hantro_debug, int, 0644);
> > >   MODULE_PARM_DESC(debug,
> > > @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
> > >          return 0;
> > >   }
> > >   
> > > +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> > > +{
> > > +       struct hantro_ctx *ctx;
> > > +
> > > +       ctx = container_of(ctrl->handler,
> > > +                          struct hantro_ctx, ctrl_handler);
> > > +
> > > +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> > > +
> > > +       switch (ctrl->id) {
> > > +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> > > +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> > > +               break;
> > > +       default:
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > >   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> > >          .try_ctrl = hantro_try_ctrl,
> > >   };
> > > @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
> > >          .s_ctrl = hantro_jpeg_s_ctrl,
> > >   };
> > >   
> > > +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> > > +       .s_ctrl = hantro_hevc_s_ctrl,
> > > +};
> > > +
> > >   static const struct hantro_ctrl controls[] = {
> > >          {
> > >                  .codec = HANTRO_JPEG_ENCODER,
> > > @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
> > >                  .cfg = {
> > >                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
> > >                  },
> > > +       }, {
> > > +               .codec = HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> > > +                       .name = "Hantro HEVC slice header skip bytes",
> > > +                       .type = V4L2_CTRL_TYPE_INTEGER,
> > > +                       .min = 0,
> > > +                       .def = 0,
> > > +                       .max = 0x7fffffff,
> > > +                       .step = 1,
> > > +                       .ops = &hantro_hevc_ctrl_ops,
> > > +               },
> > > +       }, {
> > > +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> > > +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> > > +                        HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_USER_CLASS,
> > > This shouldn't be here. Is this V4L2_CID_USER_CLASS required by v4l2-compliance
> > > or by the spec?
> 
> It is required by v4l2-compliance.
> 

Unless Hans says otherwise, I'd say drop this V4L2_CID_USER_CLASS control,
and we can figure out what's wrong with v4l2-compliance later.

> > 
> > > +                       .name = "HANTRO controls",
> > > +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> > > +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> > > +               },
> > >          },
> > >   };
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > new file mode 100644
> > > index 000000000000..5d75b36bc40c
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > @@ -0,0 +1,587 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include "hantro_hw.h"
> > > +#include "hantro_g2_regs.h"
> > > +
> > > +#define HEVC_DEC_MODE  0xC
> > > +
> > > +#define BUS_WIDTH_32           0
> > > +#define BUS_WIDTH_64           1
> > > +#define BUS_WIDTH_128          2
> > > +#define BUS_WIDTH_256          3
> > > +
> > > +static inline void hantro_write_addr(struct hantro_dev *vpu,
> > > +                                    unsigned long offset,
> > > +                                    dma_addr_t addr)
> > > +{
> > > +       vdpu_write(vpu, addr & 0xffffffff, offset);
> > > +}
> > > +
> > > +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> > > +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> > > +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> > > +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> > > +       unsigned int max_log2_ctb_size, ctb_size;
> > > +       bool tiles_enabled, uniform_spacing;
> > > +       u32 no_chroma = 0;
> > > +
> > > +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> > > +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> > > +
> > > +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> > > +                           sps->log2_diff_max_min_luma_coding_block_size;
> > > +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> > > +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> > > +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> > > +                            >> max_log2_ctb_size;
> > > +       ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> > > +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> > > +
> > > +       if (tiles_enabled) {
> > > +               unsigned int i, j, h;
> > > +
> > > +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> > > +
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> > > +
> > > +               /* write width + height for each tile in pic */
> > > +               if (!uniform_spacing) {
> > > +                       u32 tmp_w = 0, tmp_h = 0;
> > > +
> > > +                       for (i = 0; i < num_tile_rows; i++) {
> > > +                               if (i == num_tile_rows - 1)
> > > +                                       h = pic_height_in_ctbs - tmp_h;
> > > +                               else
> > > +                                       h = pps->row_height_minus1[i] + 1;
> > > +                               tmp_h += h;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> > > +                                       tmp_w += pps->column_width_minus1[j] + 1;
> > > +                                       *p++ = pps->column_width_minus1[j] + 1;
> > > +                                       *p++ = h;
> > > +                                       if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                               }
> > > +                               /* last column */
> > > +                               *p++ = pic_width_in_ctbs - tmp_w;
> > > +                               *p++ = h;
> > > +                       }
> > > +               } else { /* uniform spacing */
> > > +                       u32 tmp, prev_h, prev_w;
> > > +
> > > +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> > > +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> > > +                               h = tmp - prev_h;
> > > +                               prev_h = tmp;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> > > +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> > > +                                       *p++ = tmp - prev_w;
> > > +                                       *p++ = h;
> > > +                                       if (j == 0 &&
> > > +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> > > +                                           ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                                       prev_w = tmp;
> > > +                               }
> > > +                       }
> > > +               }
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> > > +
> > > +               /* There's one tile, with dimensions equal to pic size. */
> > > +               p[0] = pic_width_in_ctbs;
> > > +               p[1] = pic_height_in_ctbs;
> > > +       }
> > > +
> > > +       if (no_chroma)
> > > +               vpu_debug(1, "%s: no chroma!\n", __func__);
> > > +}
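
For reference, the uniform spacing branch above boils down to the
following arithmetic along one dimension. This is a standalone sketch,
not driver code; the helper name and signature are made up for
illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: fill sizes[] with the extent (in CTBs) of each of
 * the num_tiles uniformly spaced tiles along one picture dimension.
 * Tile i ends at boundary (i + 1) * pic_size / num_tiles (integer
 * division), so the rounding remainder is spread over the tiles.
 */
static void uniform_tile_sizes(unsigned int pic_size_in_ctbs,
			       unsigned int num_tiles,
			       unsigned short *sizes)
{
	unsigned int i, prev = 0;

	assert(num_tiles > 0);
	for (i = 0; i < num_tiles; i++) {
		unsigned int boundary = (i + 1) * pic_size_in_ctbs / num_tiles;

		sizes[i] = boundary - prev;
		prev = boundary;
	}
}
```

For a picture 10 CTBs wide split into 3 tile columns this gives widths
3, 3 and 4, which always sum to the picture size.
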
> > > +
> > > +static void set_params(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> > > +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> > > +       u32 pic_width_aligned, pic_height_aligned;
> > > +       u32 partial_ctb_x, partial_ctb_y;
> > > +
> > > +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> > > +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> > > +
> > > +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> > > +
> > > +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> > > +
> > > +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> > > +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> > > +
> > > +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> > > +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> > > +
> > > +       min_cb_size = 1 << min_log2_cb_size;
> > > +       max_ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> > > +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> > > +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> > > +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> > > +
> > > +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> > > +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> > > +
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> > > +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> > > +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> > > +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> > > +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> > > +
> > > +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_inter);
> > > +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_intra);
> > > +       hantro_reg_write(vpu, hevc_min_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_max_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> > > +                        sps->log2_diff_max_min_luma_transform_block_size);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> > > +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> > > +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_asym_pred_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_sao_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> > > +       hantro_reg_write(vpu, hevc_sign_data_hide,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> > > +       }
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> > > +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> > > +       hantro_reg_write(vpu, hevc_transq_bypass,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_list_mod_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_idr_pic_e,
> > > +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> > > +       hantro_reg_write(vpu, hevc_parallel_merge,
> > > +                        pps->log2_parallel_merge_level_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> > > +       hantro_reg_write(vpu, hevc_pcm_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> > > +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size,
> > > +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size,
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> > > +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> > > +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> > > +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> > > +       hantro_reg_write(vpu, hevc_weight_pred_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_const_intra_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> > > +       hantro_reg_write(vpu, hevc_transform_skip,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> > > +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_dependent_slice,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> > > +       hantro_reg_write(vpu, hevc_filter_override,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_refidx0_active,
> > > +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_refidx1_active,
> > > +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> > > +}
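
As a sanity check of the size computations in set_params() and
prepare_tile_info_buffer(), here is a standalone sketch of the same
arithmetic. The struct and function names are hypothetical, and the
partial CTB test is written with a modulo, which should be equivalent to
comparing against the ALIGN()ed size.

```c
#include <assert.h>

/* Hypothetical container for the derived picture dimensions. */
struct hevc_pic_geometry {
	unsigned int width_in_min_cbs;
	unsigned int height_in_min_cbs;
	unsigned int width_in_ctbs;
	unsigned int height_in_ctbs;
	unsigned int partial_ctb_x;
	unsigned int partial_ctb_y;
};

static struct hevc_pic_geometry
hevc_geometry(unsigned int width, unsigned int height,
	      unsigned int min_log2_cb_size, unsigned int max_log2_ctb_size)
{
	struct hevc_pic_geometry g;
	unsigned int min_cb_size = 1u << min_log2_cb_size;
	unsigned int max_ctb_size = 1u << max_log2_ctb_size;

	assert(min_log2_cb_size <= max_log2_ctb_size);

	/* Picture size in minimum coding blocks (truncating division). */
	g.width_in_min_cbs = width / min_cb_size;
	g.height_in_min_cbs = height / min_cb_size;

	/* Picture size in CTBs, rounded up to cover partial CTBs. */
	g.width_in_ctbs = (width + max_ctb_size - 1) >> max_log2_ctb_size;
	g.height_in_ctbs = (height + max_ctb_size - 1) >> max_log2_ctb_size;

	/* Set when the last CTB column/row is only partially covered. */
	g.partial_ctb_x = (width % max_ctb_size) != 0;
	g.partial_ctb_y = (height % max_ctb_size) != 0;

	return g;
}
```

With a 1920x1080 stream, 8x8 minimum coding blocks and 64x64 CTBs, this
gives 240x135 minimum coding blocks, 30x17 CTBs, and a partial CTB row at
the bottom only.
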
> > > +
> > > +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> > > +                       return i;
> > > +       }
> > > +
> > > +       return 0x0;
> > > +}
> > > +
> > > +static void set_ref_pic_list(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       const struct hantro_reg *ref_pic_regs0[] = {
> > > +               hevc_rlist_f0,
> > > +               hevc_rlist_f1,
> > > +               hevc_rlist_f2,
> > > +               hevc_rlist_f3,
> > > +               hevc_rlist_f4,
> > > +               hevc_rlist_f5,
> > > +               hevc_rlist_f6,
> > > +               hevc_rlist_f7,
> > > +               hevc_rlist_f8,
> > > +               hevc_rlist_f9,
> > > +               hevc_rlist_f10,
> > > +               hevc_rlist_f11,
> > > +               hevc_rlist_f12,
> > > +               hevc_rlist_f13,
> > > +               hevc_rlist_f14,
> > > +               hevc_rlist_f15,
> > > +       };
> > > +       const struct hantro_reg *ref_pic_regs1[] = {
> > > +               hevc_rlist_b0,
> > > +               hevc_rlist_b1,
> > > +               hevc_rlist_b2,
> > > +               hevc_rlist_b3,
> > > +               hevc_rlist_b4,
> > > +               hevc_rlist_b5,
> > > +               hevc_rlist_b6,
> > > +               hevc_rlist_b7,
> > > +               hevc_rlist_b8,
> > > +               hevc_rlist_b9,
> > > +               hevc_rlist_b10,
> > > +               hevc_rlist_b11,
> > > +               hevc_rlist_b12,
> > > +               hevc_rlist_b13,
> > > +               hevc_rlist_b14,
> > > +               hevc_rlist_b15,
> > > +       };
> > > +       unsigned int i, j;
> > > +
> > > +       /* List 0 contains: short term before, short term after and long term */
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       /* Fill the list, copying over and over */
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list0))
> > > +               list0[j++] = list0[i++];
> > > +
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list1))
> > > +               list1[j++] = list1[i++];
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> > > +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> > > +       }
> > > +}
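
The "copying over and over" padding used for both lists can be
illustrated in isolation. This is only a sketch with a made-up helper
name; like the code above, it assumes the array was zero-initialized so
that an empty list pads with zeros.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pad list[] up to size entries by repeating the
 * first 'used' entries cyclically. Reading list[i] while writing list[j]
 * with i < j means already-copied entries get copied again, which is
 * what produces the repetition.
 */
static void pad_ref_list(unsigned int *list, size_t used, size_t size)
{
	size_t i = 0, j = used;

	assert(used <= size);
	while (j < size)
		list[j++] = list[i++];
}
```

For example, a list holding DPB indexes 2, 5, 7 padded to 16 entries
becomes 2, 5, 7, 2, 5, 7, ... so every hardware register gets a valid
index.
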
> > > +
> > > +static int set_ref(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> > > +       u32 max_ref_frames;
> > > +       u16 dpb_longterm_e;
> > > +
> > > +       const struct hantro_reg *cur_poc[] = {
> > > +               hevc_cur_poc_00,
> > > +               hevc_cur_poc_01,
> > > +               hevc_cur_poc_02,
> > > +               hevc_cur_poc_03,
> > > +               hevc_cur_poc_04,
> > > +               hevc_cur_poc_05,
> > > +               hevc_cur_poc_06,
> > > +               hevc_cur_poc_07,
> > > +               hevc_cur_poc_08,
> > > +               hevc_cur_poc_09,
> > > +               hevc_cur_poc_10,
> > > +               hevc_cur_poc_11,
> > > +               hevc_cur_poc_12,
> > > +               hevc_cur_poc_13,
> > > +               hevc_cur_poc_14,
> > > +               hevc_cur_poc_15,
> > > +       };
> > > +       unsigned int i;
> > > +
> > > +       max_ref_frames = decode_params->num_poc_lt_curr +
> > > +               decode_params->num_poc_st_curr_before +
> > > +               decode_params->num_poc_st_curr_after;
> > > +       /*
> > > +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> > > +        * badly marked I-frames.
> > > +        */
> > > +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> > > +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> > > +       hantro_reg_write(vpu, hevc_filter_over_slices,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> > > +
> > > +       /*
> > > +        * Write POC count diff from current pic. For frame decoding only compute
> > > +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> > > +        */
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> > > +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> > > +
> > > +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> > > +       }
> > > +
> > > +       if (i < ARRAY_SIZE(cur_poc)) {
> > > +               /*
> > > +                * After the references, fill one entry pointing to itself,
> > > +                * i.e. difference is zero.
> > > +                */
> > > +               hantro_reg_write(vpu, cur_poc[i], 0);
> > > +               i++;
> > > +       }
> > > +
> > > +       /* Fill the rest with the current picture */
> > > +       for (; i < ARRAY_SIZE(cur_poc); i++)
> > > +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> > > +
> > > +       set_ref_pic_list(ctx);
> > > +
> > > +       /* We will only keep the reference pictures that are still in use */
> > > +       ctx->hevc_dec.ref_bufs_used = 0;
> > > +
> > > +       /* Set up addresses of DPB buffers */
> > > +       dpb_longterm_e = 0;
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> > > +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> > > +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> > > +               if (!luma_addr)
> > > +                       return -ENOMEM;
> > > +
> > > +               chroma_addr = luma_addr + cr_offset;
> > > +               mv_addr = luma_addr + mv_offset;
> > > +
> > > +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> > > +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> > > +
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> > > +       }
> > > +
> > > +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> > > +       if (!luma_addr)
> > > +               return -ENOMEM;
> > > +
> > > +       chroma_addr = luma_addr + cr_offset;
> > > +       mv_addr = luma_addr + mv_offset;
> > > +
> > > +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> > > +
> > > +       hantro_hevc_ref_remove_unused(ctx);
> > > +
> > > +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void set_buffers(struct hantro_ctx *ctx)
> > > +{
> > > +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       dma_addr_t src_dma, dst_dma;
> > > +       u32 src_len, src_buf_len;
> > > +
> > > +       src_buf = hantro_get_src_buf(ctx);
> > > +       dst_buf = hantro_get_dst_buf(ctx);
> > > +
> > > +       /* Source (stream) buffer. */
> > > +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> > > +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> > > +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> > > +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> > > +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> > > +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> > > +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> > > +
> > > +       /* Destination (decoded frame) buffer. */
> > > +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> > > +}
> > > +
> > > +void hantro_g2_check_idle(struct hantro_dev *vpu)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < 3; i++) {
> > > +               u32 status;
> > > +
> > > +               /* Make sure the VPU is idle */
> > > +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> > > +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> > > +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> > How about we clean this pr_warn: use either v4l2_warn or dev_warn and make
> > the warning "device still running, aborting" (I personally dislike the abort
> > metaphor, but guess it's OK here).
> 
> Ok
> 
> > 
> > > +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> > > +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int ret;
> > > +
> > > +       hantro_g2_check_idle(vpu);
> > > +
> > > +       /* Prepare HEVC decoder context. */
> > > +       ret = hantro_hevc_dec_prepare_run(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       /* Configure hardware registers. */
> > > +       set_params(ctx);
> > > +
> > > +       /* set reference pictures */
> > > +       ret = set_ref(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       set_buffers(ctx);
> > > +       prepare_tile_info_buffer(ctx);
> > > +
> > > +       hantro_end_prepare_run(ctx);
> > > +
> > > +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> > > +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> > > +
> > > +       /* Don't disable output */
> > > +       hantro_reg_write(vpu, hevc_out_dis, 0);
> > > +
> > > +       /* Don't compress buffers */
> > > +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> > > +
> > > +       /* use NV12 as output format */
> > > +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> > > +
> > > +       /* Bus width and max burst */
> > > +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> > > +       hantro_reg_write(vpu, hevc_max_burst, 16);
> > > +
> > > +       /* Swap */
> > > +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> > > +
> > > +       /* Start decoding! */
> > > +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> > > +
> > > +       return 0;
> > > +}
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > new file mode 100644
> > > index 000000000000..a361c9ba911d
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > @@ -0,0 +1,198 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * Copyright (c) 2021, Collabora
> > > + *
> > > + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > + */
> > > +
> > > +#ifndef HANTRO_G2_REGS_H_
> > > +#define HANTRO_G2_REGS_H_
> > > +
> > > +#include "hantro.h"
> > > +
> > > +#define G2_SWREG(nr)   ((nr) * 4)
> > > +
> > > +#define HEVC_DEC_REG(name, base, shift, mask) \
> > > +       static const struct hantro_reg _hevc_##name[] = { \
> > > +               { G2_SWREG(base), (shift), (mask) } \
> > > +       }; \
> > > +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
> > > +
> > > +#define HEVC_REG_VERSION               G2_SWREG(0)
> > > +
> > > +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> > > +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> > > +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> > > +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> > > +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> > > +
> > > +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> > > +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> > > +
> > > +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> > > +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> > > +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> > > +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> > > +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> > > +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> > > +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> > > +
> > > +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> > > +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> > > +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> > > +
> > > +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> > > +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> > > +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> > > +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> > > +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> > > +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> > > +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> > > +
> > > +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> > > +
> > > +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> > > +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> > > +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> > > +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> > > +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> > > +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> > > +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> > > +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> > > +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> > > +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> > > +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> > > +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> > > +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> > > +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> > > +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> > > +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> > > +
> > > +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> > > +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> > > +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> > > +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> > > +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> > > +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> > > +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> > > +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> > > +
> > > +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> > > +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> > > +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> > > +
> > > +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> > > +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> > > +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> > > +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> > > +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> > > +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> > > +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> > > +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> > > +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> > > +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> > > +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> > > +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> > > +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> > > +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> > > +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> > > +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> > > +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> > > +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> > > +
> > > +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
> > > +
> > > +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> > > +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> > > +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> > > +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> > > +
> > > +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> > > +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> > > +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
> > > +
> > > +#define HEVC_REG_CONFIG                                G2_SWREG(58)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> > > +
> > > +#define HEVC_ADDR_DST          (G2_SWREG(65))
> > > +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> > > +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> > > +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> > > +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> > > +#define HEVC_ADDR_STR          (G2_SWREG(169))
> > > +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> > > +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> > > +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> > > +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> > > +#define HEVC_TILE_SAO          (G2_SWREG(181))
> > > +#define HEVC_TILE_BSD          (G2_SWREG(183))
> > > +
> > > +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> > > +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
> > > +
> > > +#endif
> > > diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> > > new file mode 100644
> > > index 000000000000..8e319a837ff3
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_hevc.c
> > > @@ -0,0 +1,321 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include <linux/types.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +
> > > +#include "hantro.h"
> > > +#include "hantro_hw.h"
> > > +
> > > +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> > > +/*
> > > + * BSD control data of current picture at tile border
> > > + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> > > + */
> > > +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> > > +/* tile border coefficients of filter */
> > > +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> > > +
> > > +#define MAX_TILE_COLS 20
> > > +#define MAX_TILE_ROWS 22
> > > +
> > > +#define UNUSED_REF     -1
> > > +
> > > +#define G2_ALIGN               16
> > > +#define MC_WORD_SIZE           32
> > > +
> > > +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> > > +
> > > +       return sps->pic_width_in_luma_samples *
> > > +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> > > +}
> > > +
> > > +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +
> > > +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> > > +}
> > > +
> > > +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> > > +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> > > +       size_t mv_size;
> > > +
> > > +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> > > +                 (1 << (2 * (8 - 4))) * 16) + 32;
> > > +
> > > +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
> > > +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> > > +
> > > +       return mv_size;
> > > +}
> > > +
> > > +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +
> > > +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> > > +}
> > > +
> > > +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int i;
> > > +
> > > +       /* Just tag buffer as unused, do not free them */
> > This comment seems wrong.
> 
> You are right, I will remove it.
> 
> > 
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs[i].cpu) {
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > Is this memset clearing the buffer required? If we're getting artifacts
> > from previous decodes, then that would be more of a bug somewhere.
> 
> The buffer is cleared after it is allocated or reused, so I can remove this one.
> 
> > 
> > > +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> > > +                                         hevc_dec->ref_bufs[i].cpu,
> > > +                                         hevc_dec->ref_bufs[i].dma);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> > > +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> > > +}
> > > +
> > > +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> > > +                                  int poc)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       /* Find the reference buffer among the already known ones */
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       return hevc_dec->ref_bufs[i].dma;
> > > +               }
> > > +       }
> > > +
> > > +       /* Allocate a new reference buffer */
> > > +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> > > +                       if (!hevc_dec->ref_bufs[i].cpu) {
> > > +                               struct hantro_dev *vpu = ctx->dev;
> > > +
> > > +                               hevc_dec->ref_bufs[i].cpu =
> > > +                                       dma_alloc_coherent(vpu->dev,
> > > +                                                          hantro_hevc_ref_size(ctx),
> > > +                                                          &hevc_dec->ref_bufs[i].dma,
> > > +                                                          GFP_KERNEL);
> > Is there any reason why we need to allocate reference buffers and MV contiguously?
> 
> It is done like that in the i.MX reference code, and it makes managing the
> reference frames and MVs simpler.
> 
> > 
> > > +                               if (!hevc_dec->ref_bufs[i].cpu)
> > > +                                       return 0;
> > > +
> > > +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> > > +                       }
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > 
> > I believe the coherent allocation is to be able to clear each reference, but is this
> > really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
> > 
> > Also, if that's the case, then allocating the MV buffer separately will allow
> > to not allocate the reference buffers coherently (note that we use NO_MAPPING
> > in the vb2_queue, so the vb2_buffers shouldn't be coherent).
> 
> Those sound like good possible optimizations, but I'm not at that stage yet.
> I would rather keep it in this fairly functional state and improve it later.
> I think the patches are already large and complex enough as they are.
> 

Fair enough. I think it's great to have a first working
version :)

Could you add a comment for this, especially at the
memsets and the dma_alloc_coherent call (or optionally
at the header of this .c file), in case someone
wants to revisit this topic?

Thanks a lot!
Ezequiel 
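
For anyone revisiting this later, the layout under discussion can be
sketched in plain userspace C. This is only an illustration, assuming the
offsets computed by the quoted hantro_hevc.c helpers; struct sps_dims,
ALIGN_UP and clear_mv_region() are hypothetical names that do not exist in
the driver. It shows where the chroma plane and the MV area sit inside one
contiguous reference buffer, and how only the MV area could be cleared:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constants copied from the quoted hantro_hevc.c. */
#define G2_ALIGN	16
#define MC_WORD_SIZE	32

/* Userspace stand-in for the kernel ALIGN() macro (power-of-two only). */
#define ALIGN_UP(x, a)	(((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical reduced view of the SPS fields the layout depends on. */
struct sps_dims {
	uint32_t width;			/* pic_width_in_luma_samples */
	uint32_t height;		/* pic_height_in_luma_samples */
	uint8_t bit_depth_luma_minus8;
};

/* Offset of the chroma plane, i.e. the size of the luma plane. */
static size_t chroma_offset(const struct sps_dims *sps)
{
	size_t bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;

	return (size_t)sps->width * sps->height * bytes_per_pixel;
}

/* Offset of the motion vectors, placed after luma plus 4:2:0 chroma. */
static size_t mv_offset(const struct sps_dims *sps)
{
	return ALIGN_UP(chroma_offset(sps) * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Size of the MV area, using the same rounding as the quoted patch. */
static size_t mv_size(const struct sps_dims *sps)
{
	size_t w = ((size_t)sps->width + (1u << 8) - 1) >> 8;
	size_t h = ((size_t)sps->height + (1u << 8) - 1) >> 8;

	return w * h * (1u << (2 * (8 - 4))) * 16 + 32;
}

/* Total size of one contiguous reference buffer. */
static size_t ref_size(const struct sps_dims *sps)
{
	return mv_offset(sps) + mv_size(sps);
}

/* Clear only the MV area and leave the luma/chroma planes untouched. */
static void clear_mv_region(uint8_t *buf, const struct sps_dims *sps)
{
	memset(buf + mv_offset(sps), 0, mv_size(sps));
}
```

With these formulas an 8-bit 1920x1080 stream needs 3274304 bytes per
reference buffer, of which the last 163872 bytes are the MV area. Whether
clearing only that area is sufficient for the hardware is the open question
above; the sketch does not answer it.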




* Re: [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder
@ 2021-03-16 20:35         ` Ezequiel Garcia
  0 siblings, 0 replies; 66+ messages in thread
From: Ezequiel Garcia @ 2021-03-16 20:35 UTC (permalink / raw)
  To: Benjamin Gaignard, p.zabel, mchehab, robh+dt, shawnguo, s.hauer,
	kernel, festevam, linux-imx, gregkh, mripard, paul.kocialkowski,
	wens, jernej.skrabec, peng.fan, hverkuil-cisco, dan.carpenter
  Cc: linux-media, linux-rockchip, devicetree, linux-arm-kernel,
	linux-kernel, kernel

On Tue, 2021-03-16 at 21:19 +0100, Benjamin Gaignard wrote:
> 
> Le 16/03/2021 à 19:46, Ezequiel Garcia a écrit :
> > Hi Benjamin,
> > 
> > The series is looking really good. Some comments below.
> > 
> > On Wed, 2021-03-03 at 12:39 +0100, Benjamin Gaignard wrote:
> > > Implement all the logic to get G2 hardware decoding HEVC frames.
> > > It supports HEVC streams up to level 5.1.
> > > It doesn't yet support 10-bit formats or the scaling feature.
> > > 
> > > Add a HANTRO HEVC dedicated control to skip some bits at the beginning
> > > of the slice header. That is very specific to this hardware so it can't
> > > go into the uapi structures. Computing the needed value is complex and
> > > requires information from the stream that only userland knows, so let it
> > > provide the correct value to the driver.
> > > 
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > version 4:
> > > - fix Ezequiel comments
> > > - use dedicated control as an integer
> > > - change hantro_g2_hevc_dec_run prototype to return errors
> > > 
> > > version 2:
> > > - squash multiple commits in this one.
> > > - fix the comments done by Ezequiel about dma_alloc_coherent usage
> > > - fix Dan's comments about control copy, reverse the test logic
> > > in tile_buffer_reallocate, rework some goto and return cases.
> > > 
> > >   drivers/staging/media/hantro/Makefile         |   2 +
> > >   drivers/staging/media/hantro/hantro.h         |  18 +
> > >   drivers/staging/media/hantro/hantro_drv.c     |  53 ++
> > >   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
> > >   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
> > >   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
> > >   drivers/staging/media/hantro/hantro_hw.h      |  49 ++
> > >   7 files changed, 1228 insertions(+)
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
> > >   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> > > 
> > > diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
> > > index 743ce08eb184..0357f1772267 100644
> > > --- a/drivers/staging/media/hantro/Makefile
> > > +++ b/drivers/staging/media/hantro/Makefile
> > > @@ -9,12 +9,14 @@ hantro-vpu-y += \
> > >                  hantro_h1_jpeg_enc.o \
> > >                  hantro_g1_h264_dec.o \
> > >                  hantro_g1_mpeg2_dec.o \
> > > +               hantro_g2_hevc_dec.o \
> > >                  hantro_g1_vp8_dec.o \
> > >                  rk3399_vpu_hw_jpeg_enc.o \
> > >                  rk3399_vpu_hw_mpeg2_dec.o \
> > >                  rk3399_vpu_hw_vp8_dec.o \
> > >                  hantro_jpeg.o \
> > >                  hantro_h264.o \
> > > +               hantro_hevc.o \
> > >                  hantro_mpeg2.o \
> > >                  hantro_vp8.o
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> > > index 05876e426419..a9b80b2c9124 100644
> > > --- a/drivers/staging/media/hantro/hantro.h
> > > +++ b/drivers/staging/media/hantro/hantro.h
> > > @@ -225,6 +225,7 @@ struct hantro_dev {
> > >    * @jpeg_enc:          JPEG-encoding context.
> > >    * @mpeg2_dec:         MPEG-2-decoding context.
> > >    * @vp8_dec:           VP8-decoding context.
> > > + * @hevc_dec:          HEVC-decoding context.
> > >    */
> > >   struct hantro_ctx {
> > >          struct hantro_dev *dev;
> > > @@ -251,6 +252,7 @@ struct hantro_ctx {
> > >                  struct hantro_jpeg_enc_hw_ctx jpeg_enc;
> > >                  struct hantro_mpeg2_dec_hw_ctx mpeg2_dec;
> > >                  struct hantro_vp8_dec_hw_ctx vp8_dec;
> > > +               struct hantro_hevc_dec_hw_ctx hevc_dec;
> > >          };
> > >   };
> > >   
> > > @@ -428,6 +430,22 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > >          return vb2_dma_contig_plane_dma_addr(vb, 0);
> > >   }
> > >   
> > > +static inline size_t
> > > +hantro_get_dec_buf_size(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].size;
> > > +       return vb2_plane_size(vb, 0);
> > > +}
> > > +
> > > +static inline void *
> > > +hantro_get_dec_buf(struct hantro_ctx *ctx, struct vb2_buffer *vb)
> > > +{
> > > +       if (hantro_needs_postproc(ctx, ctx->vpu_dst_fmt))
> > > +               return ctx->postproc.dec_q[vb->index].cpu;
> > > +       return vb2_plane_vaddr(vb, 0);
> > > +}
> > > +
> > Seems hantro_get_dec_buf_size and hantro_get_dec_buf are not used?
> 
> You are right, I will remove them.
> 
> > 
> > >   void hantro_postproc_disable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_enable(struct hantro_ctx *ctx);
> > >   void hantro_postproc_free(struct hantro_ctx *ctx);
> > > diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> > > index e3e6df28f470..bc90a52f4d3d 100644
> > > --- a/drivers/staging/media/hantro/hantro_drv.c
> > > +++ b/drivers/staging/media/hantro/hantro_drv.c
> > > @@ -30,6 +30,13 @@
> > >   
> > >   #define DRIVER_NAME "hantro-vpu"
> > >   
> > > +/*
> > > + * V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP -
> > > + * the number of data (in bits) to skip in the
> > > + * slice segment header syntax after 'slice type' token
> > > + */
> > I think we need to document this better, so applications can
> > correctly use the control. From i.MX reference code, it seems
> > this needs to be used as follows:
> > 
> > If non-IDR, the bits to be skipped go from syntax element "pic_output_flag"
> > to before syntax element "slice_temporal_mvp_enabled_flag".
> > 
> > If IDR, the skipped bits are just "pic_output_flag"
> > (separate_colour_plane_flag is not supported).
> > 
> > And it seems this needs to be passed parsing only the first slice,
> > given this syntax remains invariant across all the slices.
> 
> Ok I will add your description in the next version.
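
The rule described above could be sketched on the application side roughly
as follows. This is a minimal illustration only, assuming a userspace
parser that records bit positions while reading the first slice segment
header; struct slice_hdr_marks and hevc_slice_header_skip_bits() are
hypothetical names, not part of any existing API, and the thread does not
say how emulation-prevention bytes should be counted:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bookkeeping kept by a userspace slice-header parser.
 * Bit positions are counted from the start of the slice segment header.
 */
struct slice_hdr_marks {
	bool is_idr;			/* NAL unit type is an IDR type */
	bool output_flag_present;	/* PPS output_flag_present_flag */
	uint32_t bit_after_slice_type;	/* first bit after slice_type */
	uint32_t bit_before_tmvp_flag;	/* where slice_temporal_mvp_enabled_flag would start */
};

/*
 * Number of bits to pass to V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
 * following the description in this thread:
 * - IDR: only pic_output_flag, which is one bit when the PPS enables it;
 * - non-IDR: everything from pic_output_flag up to, but not including,
 *   slice_temporal_mvp_enabled_flag.
 * separate_colour_plane_flag is assumed to be 0 (not supported).
 */
static uint32_t hevc_slice_header_skip_bits(const struct slice_hdr_marks *m)
{
	if (m->is_idr)
		return m->output_flag_present ? 1 : 0;

	return m->bit_before_tmvp_flag - m->bit_after_slice_type;
}
```

Since this part of the syntax is said to be invariant across the slices of
a picture, an application would presumably compute the value once, from the
first slice, and set the control before queuing the request.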
> 
> > 
> > > +#define V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP (V4L2_CID_USER_HANTRO_BASE + 0)
> > > +
> > >   int hantro_debug;
> > >   module_param_named(debug, hantro_debug, int, 0644);
> > >   MODULE_PARM_DESC(debug,
> > > @@ -281,6 +288,26 @@ static int hantro_jpeg_s_ctrl(struct v4l2_ctrl *ctrl)
> > >          return 0;
> > >   }
> > >   
> > > +static int hantro_hevc_s_ctrl(struct v4l2_ctrl *ctrl)
> > > +{
> > > +       struct hantro_ctx *ctx;
> > > +
> > > +       ctx = container_of(ctrl->handler,
> > > +                          struct hantro_ctx, ctrl_handler);
> > > +
> > > +       vpu_debug(1, "s_ctrl: id = %d, val = %d\n", ctrl->id, ctrl->val);
> > > +
> > > +       switch (ctrl->id) {
> > > +       case V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP:
> > > +               ctx->hevc_dec.ctrls.hevc_hdr_skip_length = ctrl->val;
> > > +               break;
> > > +       default:
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > >   static const struct v4l2_ctrl_ops hantro_ctrl_ops = {
> > >          .try_ctrl = hantro_try_ctrl,
> > >   };
> > > @@ -289,6 +316,10 @@ static const struct v4l2_ctrl_ops hantro_jpeg_ctrl_ops = {
> > >          .s_ctrl = hantro_jpeg_s_ctrl,
> > >   };
> > >   
> > > +static const struct v4l2_ctrl_ops hantro_hevc_ctrl_ops = {
> > > +       .s_ctrl = hantro_hevc_s_ctrl,
> > > +};
> > > +
> > >   static const struct hantro_ctrl controls[] = {
> > >          {
> > >                  .codec = HANTRO_JPEG_ENCODER,
> > > @@ -409,6 +440,28 @@ static const struct hantro_ctrl controls[] = {
> > >                  .cfg = {
> > >                          .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
> > >                  },
> > > +       }, {
> > > +               .codec = HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,
> > > +                       .name = "Hantro HEVC slice header skip bytes",
> > > +                       .type = V4L2_CTRL_TYPE_INTEGER,
> > > +                       .min = 0,
> > > +                       .def = 0,
> > > +                       .max = 0x7fffffff,
> > > +                       .step = 1,
> > > +                       .ops = &hantro_hevc_ctrl_ops,
> > > +               },
> > > +       }, {
> > > +               .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> > > +                        HANTRO_VP8_DECODER | HANTRO_H264_DECODER |
> > > +                        HANTRO_HEVC_DECODER,
> > > +               .cfg = {
> > > +                       .id = V4L2_CID_USER_CLASS,
> > This shouldn't be here. Is this V4L2_CID_USER_CLASS required by
> > v4l2-compliance or by the spec?
> 
> It is required by v4l2-compliance.
> 

Unless Hans says otherwise, I'd say drop this V4L2_CID_USER_CLASS control,
and we can figure out what's wrong with v4l2-compliance later.

> > 
> > > +                       .name = "HANTRO controls",
> > > +                       .type = V4L2_CTRL_TYPE_CTRL_CLASS,
> > > +                       .flags = V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY,
> > > +               },
> > >          },
> > >   };
> > >   
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > new file mode 100644
> > > index 000000000000..5d75b36bc40c
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > > @@ -0,0 +1,587 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include "hantro_hw.h"
> > > +#include "hantro_g2_regs.h"
> > > +
> > > +#define HEVC_DEC_MODE  0xC
> > > +
> > > +#define BUS_WIDTH_32           0
> > > +#define BUS_WIDTH_64           1
> > > +#define BUS_WIDTH_128          2
> > > +#define BUS_WIDTH_256          3
> > > +
> > > +static inline void hantro_write_addr(struct hantro_dev *vpu,
> > > +                                    unsigned long offset,
> > > +                                    dma_addr_t addr)
> > > +{
> > > +       vdpu_write(vpu, addr & 0xffffffff, offset);
> > > +}
> > > +
> > > +static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       u16 *p = (u16 *)((u8 *)ctx->hevc_dec.tile_sizes.cpu);
> > > +       unsigned int num_tile_rows = pps->num_tile_rows_minus1 + 1;
> > > +       unsigned int num_tile_cols = pps->num_tile_columns_minus1 + 1;
> > > +       unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
> > > +       unsigned int max_log2_ctb_size, ctb_size;
> > > +       bool tiles_enabled, uniform_spacing;
> > > +       u32 no_chroma = 0;
> > > +
> > > +       tiles_enabled = !!(pps->flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED);
> > > +       uniform_spacing = !!(pps->flags & V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tile_e, tiles_enabled);
> > > +
> > > +       max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
> > > +                           sps->log2_diff_max_min_luma_coding_block_size;
> > > +       pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
> > > +                           (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
> > > +       pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
> > > +                            >> max_log2_ctb_size;
> > > +       ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       vpu_debug(1, "Preparing tile sizes buffer for %dx%d CTBs (CTB size %d)\n",
> > > +                 pic_width_in_ctbs, pic_height_in_ctbs, ctb_size);
> > > +
> > > +       if (tiles_enabled) {
> > > +               unsigned int i, j, h;
> > > +
> > > +               vpu_debug(1, "Tiles enabled! %dx%d\n", num_tile_cols, num_tile_rows);
> > > +
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, num_tile_rows);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, num_tile_cols);
> > > +
> > > +               /* write width + height for each tile in pic */
> > > +               if (!uniform_spacing) {
> > > +                       u32 tmp_w = 0, tmp_h = 0;
> > > +
> > > +                       for (i = 0; i < num_tile_rows; i++) {
> > > +                               if (i == num_tile_rows - 1)
> > > +                                       h = pic_height_in_ctbs - tmp_h;
> > > +                               else
> > > +                                       h = pps->row_height_minus1[i] + 1;
> > > +                               tmp_h += h;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
> > > +                                       tmp_w += pps->column_width_minus1[j] + 1;
> > > +                                       *p++ = pps->column_width_minus1[j + 1];
> > > +                                       *p++ = h;
> > > +                                       if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                               }
> > > +                               /* last column */
> > > +                               *p++ = pic_width_in_ctbs - tmp_w;
> > > +                               *p++ = h;
> > > +                       }
> > > +               } else { /* uniform spacing */
> > > +                       u32 tmp, prev_h, prev_w;
> > > +
> > > +                       for (i = 0, prev_h = 0; i < num_tile_rows; i++) {
> > > +                               tmp = (i + 1) * pic_height_in_ctbs / num_tile_rows;
> > > +                               h = tmp - prev_h;
> > > +                               prev_h = tmp;
> > > +                               if (i == 0 && h == 1 && ctb_size == 16)
> > > +                                       no_chroma = 1;
> > > +                               for (j = 0, prev_w = 0; j < num_tile_cols; j++) {
> > > +                                       tmp = (j + 1) * pic_width_in_ctbs / num_tile_cols;
> > > +                                       *p++ = tmp - prev_w;
> > > +                                       *p++ = h;
> > > +                                       if (j == 0 &&
> > > +                                           (pps->column_width_minus1[0] + 1) == 1 &&
> > > +                                           ctb_size == 16)
> > > +                                               no_chroma = 1;
> > > +                                       prev_w = tmp;
> > > +                               }
> > > +                       }
> > > +               }
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> > > +               hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> > > +
> > > +               /* There's one tile, with dimensions equal to pic size. */
> > > +               p[0] = pic_width_in_ctbs;
> > > +               p[1] = pic_height_in_ctbs;
> > > +       }
> > > +
> > > +       if (no_chroma)
> > > +               vpu_debug(1, "%s: no chroma!\n", __func__);
> > > +}
> > > +
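
For anyone following the tile arithmetic above, here is a minimal
standalone sketch of the CTB rounding and the uniform-spacing branch.
It is an illustration only, not the driver code: size_in_ctbs() and
fill_uniform_tiles() are hypothetical names, and the buffer layout
(width/height pairs of u16, row-major) is assumed from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Picture size in CTBs, rounding up for a partial CTB at the edge. */
static unsigned int size_in_ctbs(unsigned int luma_samples,
				 unsigned int log2_ctb_size)
{
	return (luma_samples + (1u << log2_ctb_size) - 1) >> log2_ctb_size;
}

/*
 * Hypothetical helper: write (width, height) pairs in CTBs, row-major,
 * for uniformly spaced tiles. Returns the number of u16 entries written.
 */
static size_t fill_uniform_tiles(uint16_t *p, unsigned int w_ctbs,
				 unsigned int h_ctbs, unsigned int cols,
				 unsigned int rows)
{
	uint16_t *start = p;
	unsigned int i, j, prev_h = 0;

	assert(cols > 0 && rows > 0);

	for (i = 0; i < rows; i++) {
		/* Bottom boundary of row i, same integer division as the patch. */
		unsigned int bottom = (i + 1) * h_ctbs / rows;
		unsigned int h = bottom - prev_h;
		unsigned int prev_w = 0;

		prev_h = bottom;
		for (j = 0; j < cols; j++) {
			/* Right boundary of column j. */
			unsigned int right = (j + 1) * w_ctbs / cols;

			*p++ = right - prev_w;
			*p++ = h;
			prev_w = right;
		}
	}

	return p - start;
}
```

Because each boundary is computed from the picture size rather than
accumulated, the tile widths and heights always sum to the picture size
in CTBs, even when it does not divide evenly.
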
> > > +static void set_params(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       u32 min_log2_cb_size, max_log2_ctb_size, min_cb_size, max_ctb_size;
> > > +       u32 pic_width_in_min_cbs, pic_height_in_min_cbs;
> > > +       u32 pic_width_aligned, pic_height_aligned;
> > > +       u32 partial_ctb_x, partial_ctb_y;
> > > +
> > > +       hantro_reg_write(vpu, hevc_bit_depth_y_minus8, sps->bit_depth_luma_minus8);
> > > +       hantro_reg_write(vpu, hevc_bit_depth_c_minus8, sps->bit_depth_chroma_minus8);
> > > +
> > > +       hantro_reg_write(vpu, hevc_output_8_bits, 0);
> > > +
> > > +       hantro_reg_write(vpu, hevc_hdr_skip_length, ctrls->hevc_hdr_skip_length);
> > > +
> > > +       min_log2_cb_size = sps->log2_min_luma_coding_block_size_minus3 + 3;
> > > +       max_log2_ctb_size = min_log2_cb_size + sps->log2_diff_max_min_luma_coding_block_size;
> > > +
> > > +       hantro_reg_write(vpu, hevc_min_cb_size, min_log2_cb_size);
> > > +       hantro_reg_write(vpu, hevc_max_cb_size, max_log2_ctb_size);
> > > +
> > > +       min_cb_size = 1 << min_log2_cb_size;
> > > +       max_ctb_size = 1 << max_log2_ctb_size;
> > > +
> > > +       pic_width_in_min_cbs = sps->pic_width_in_luma_samples / min_cb_size;
> > > +       pic_height_in_min_cbs = sps->pic_height_in_luma_samples / min_cb_size;
> > > +       pic_width_aligned = ALIGN(sps->pic_width_in_luma_samples, max_ctb_size);
> > > +       pic_height_aligned = ALIGN(sps->pic_height_in_luma_samples, max_ctb_size);
> > > +
> > > +       partial_ctb_x = !!(sps->pic_width_in_luma_samples != pic_width_aligned);
> > > +       partial_ctb_y = !!(sps->pic_height_in_luma_samples != pic_height_aligned);
> > > +
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_x, partial_ctb_x);
> > > +       hantro_reg_write(vpu, hevc_partial_ctb_y, partial_ctb_y);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_in_cbs, pic_width_in_min_cbs);
> > > +       hantro_reg_write(vpu, hevc_pic_height_in_cbs, pic_height_in_min_cbs);
> > > +
> > > +       hantro_reg_write(vpu, hevc_pic_width_4x4,
> > > +                        (pic_width_in_min_cbs * min_cb_size) / 4);
> > > +       hantro_reg_write(vpu, hevc_pic_height_4x4,
> > > +                        (pic_height_in_min_cbs * min_cb_size) / 4);
> > > +
> > > +       hantro_reg_write(vpu, hevc_max_inter_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_inter);
> > > +       hantro_reg_write(vpu, hevc_max_intra_hierdepth,
> > > +                        sps->max_transform_hierarchy_depth_intra);
> > > +       hantro_reg_write(vpu, hevc_min_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_max_trb_size,
> > > +                        sps->log2_min_luma_transform_block_size_minus2 + 2 +
> > > +                        sps->log2_diff_max_min_luma_transform_block_size);
> > > +
> > > +       hantro_reg_write(vpu, hevc_tempor_mvp_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) &&
> > > +                        !(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IDR_PIC));
> > > +       hantro_reg_write(vpu, hevc_strong_smooth_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_asym_pred_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_sao_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET));
> > > +       hantro_reg_write(vpu, hevc_sign_data_hide,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED));
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 1);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, pps->diff_cu_qp_delta_depth);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cu_qpd_e, 0);
> > > +               hantro_reg_write(vpu, hevc_max_cu_qpd_depth, 0);
> > > +       }
> > > +
> > > +       if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, pps->pps_cb_qp_offset);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, pps->pps_cr_qp_offset);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_cb_qp_offset, 0);
> > > +               hantro_reg_write(vpu, hevc_cr_qp_offset, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_filt_offset_beta, pps->pps_beta_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_filt_offset_tc, pps->pps_tc_offset_div2);
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_slice_hdr_ext_bits, pps->num_extra_slice_header_bits);
> > > +       hantro_reg_write(vpu, hevc_slice_chqp_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_weight_bipr_idc,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED));
> > > +       hantro_reg_write(vpu, hevc_transq_bypass,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_list_mod_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_entropy_sync_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_idr_pic_e,
> > > +                        !!(decode_params->flags & V4L2_HEVC_DECODE_PARAM_FLAG_IRAP_PIC));
> > > +       hantro_reg_write(vpu, hevc_parallel_merge,
> > > +                        pps->log2_parallel_merge_level_minus2 + 2);
> > > +       hantro_reg_write(vpu, hevc_pcm_filt_d,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED));
> > > +       hantro_reg_write(vpu, hevc_pcm_e,
> > > +                        !!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED));
> > > +       if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED) {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size,
> > > +                                sps->log2_diff_max_min_pcm_luma_coding_block_size +
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size,
> > > +                                sps->log2_min_pcm_luma_coding_block_size_minus3 + 3);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y,
> > > +                                sps->pcm_sample_bit_depth_luma_minus1 + 1);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c,
> > > +                                sps->pcm_sample_bit_depth_chroma_minus1 + 1);
> > > +       } else {
> > > +               hantro_reg_write(vpu, hevc_max_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_min_pcm_size, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_y, 0);
> > > +               hantro_reg_write(vpu, hevc_bit_depth_pcm_c, 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_start_code_e, 1);
> > > +       hantro_reg_write(vpu, hevc_init_qp, pps->init_qp_minus26 + 26);
> > > +       hantro_reg_write(vpu, hevc_weight_pred_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED));
> > > +       hantro_reg_write(vpu, hevc_cabac_init_present,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_const_intra_e,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED));
> > > +       hantro_reg_write(vpu, hevc_transform_skip,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_out_filtering_dis,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER));
> > > +       hantro_reg_write(vpu, hevc_filt_ctrl_pres,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT));
> > > +       hantro_reg_write(vpu, hevc_dependent_slice,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT));
> > > +       hantro_reg_write(vpu, hevc_filter_override,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_refidx0_active,
> > > +                        pps->num_ref_idx_l0_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_refidx1_active,
> > > +                        pps->num_ref_idx_l1_default_active_minus1 + 1);
> > > +       hantro_reg_write(vpu, hevc_apf_threshold, 8);
> > > +}
> > > +
> > > +static int find_ref_pic_index(const struct v4l2_hevc_dpb_entry *dpb, int pic_order_cnt)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               if (dpb[i].pic_order_cnt[0] == pic_order_cnt)
> > > +                       return i;
> > > +       }
> > > +
> > > +       return 0x0;
> > > +}
> > > +
> > > +static void set_ref_pic_list(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       u32 list0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       u32 list1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX] = {0};
> > > +       const struct hantro_reg *ref_pic_regs0[] = {
> > > +               hevc_rlist_f0,
> > > +               hevc_rlist_f1,
> > > +               hevc_rlist_f2,
> > > +               hevc_rlist_f3,
> > > +               hevc_rlist_f4,
> > > +               hevc_rlist_f5,
> > > +               hevc_rlist_f6,
> > > +               hevc_rlist_f7,
> > > +               hevc_rlist_f8,
> > > +               hevc_rlist_f9,
> > > +               hevc_rlist_f10,
> > > +               hevc_rlist_f11,
> > > +               hevc_rlist_f12,
> > > +               hevc_rlist_f13,
> > > +               hevc_rlist_f14,
> > > +               hevc_rlist_f15,
> > > +       };
> > > +       const struct hantro_reg *ref_pic_regs1[] = {
> > > +               hevc_rlist_b0,
> > > +               hevc_rlist_b1,
> > > +               hevc_rlist_b2,
> > > +               hevc_rlist_b3,
> > > +               hevc_rlist_b4,
> > > +               hevc_rlist_b5,
> > > +               hevc_rlist_b6,
> > > +               hevc_rlist_b7,
> > > +               hevc_rlist_b8,
> > > +               hevc_rlist_b9,
> > > +               hevc_rlist_b10,
> > > +               hevc_rlist_b11,
> > > +               hevc_rlist_b12,
> > > +               hevc_rlist_b13,
> > > +               hevc_rlist_b14,
> > > +               hevc_rlist_b15,
> > > +       };
> > > +       unsigned int i, j;
> > > +
> > > +       /* List 0 contains: short term before, short term after and long term */
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list0); i++)
> > > +               list0[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       /* Fill the list, copying over and over */
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list0))
> > > +               list0[j++] = list0[i++];
> > > +
> > > +       j = 0;
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_after && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_after[i]);
> > > +       for (i = 0; i < decode_params->num_poc_st_curr_before && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_st_curr_before[i]);
> > > +       for (i = 0; i < decode_params->num_poc_lt_curr && j < ARRAY_SIZE(list1); i++)
> > > +               list1[j++] = find_ref_pic_index(dpb, decode_params->poc_lt_curr[i]);
> > > +
> > > +       i = 0;
> > > +       while (j < ARRAY_SIZE(list1))
> > > +               list1[j++] = list1[i++];
> > > +
> > > +       for (i = 0; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_reg_write(vpu, ref_pic_regs0[i], list0[i]);
> > > +               hantro_reg_write(vpu, ref_pic_regs1[i], list1[i]);
> > > +       }
> > > +}
> > > +
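
The list construction above (concatenate three POC groups, then pad by
repeating the list from its start) can be sketched in isolation as
follows. This is only an illustration under assumptions: LIST_LEN,
append() and build_ref_list() are hypothetical names, and the inputs
are already-resolved DPB indices rather than POC values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_LEN 16 /* stands in for V4L2_HEVC_DPB_ENTRIES_NUM_MAX */

/* Append up to n entries from src without writing past LIST_LEN. */
static size_t append(uint32_t *list, size_t fill,
		     const uint32_t *src, size_t n)
{
	size_t i;

	for (i = 0; i < n && fill < LIST_LEN; i++)
		list[fill++] = src[i];

	return fill;
}

/*
 * Build one reference list: first group, second group, long-term group,
 * then repeat the entries from the start until the list is full.
 */
static void build_ref_list(uint32_t list[LIST_LEN],
			   const uint32_t *first, size_t n_first,
			   const uint32_t *second, size_t n_second,
			   const uint32_t *lt, size_t n_lt)
{
	size_t fill = 0, i;

	/* Zero the list so an empty input pads with index 0, as in the patch. */
	for (i = 0; i < LIST_LEN; i++)
		list[i] = 0;

	fill = append(list, fill, first, n_first);
	fill = append(list, fill, second, n_second);
	fill = append(list, fill, lt, n_lt);

	i = 0;
	while (fill < LIST_LEN)
		list[fill++] = list[i++];
}
```

List 0 would pass the "before" group first and list 1 the "after" group
first; the padding step is identical for both.
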
> > > +static int set_ref(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
> > > +       const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
> > > +       const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
> > > +       dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
> > > +       u32 max_ref_frames;
> > > +       u16 dpb_longterm_e;
> > > +
> > > +       const struct hantro_reg *cur_poc[] = {
> > > +               hevc_cur_poc_00,
> > > +               hevc_cur_poc_01,
> > > +               hevc_cur_poc_02,
> > > +               hevc_cur_poc_03,
> > > +               hevc_cur_poc_04,
> > > +               hevc_cur_poc_05,
> > > +               hevc_cur_poc_06,
> > > +               hevc_cur_poc_07,
> > > +               hevc_cur_poc_08,
> > > +               hevc_cur_poc_09,
> > > +               hevc_cur_poc_10,
> > > +               hevc_cur_poc_11,
> > > +               hevc_cur_poc_12,
> > > +               hevc_cur_poc_13,
> > > +               hevc_cur_poc_14,
> > > +               hevc_cur_poc_15,
> > > +       };
> > > +       unsigned int i;
> > > +
> > > +       max_ref_frames = decode_params->num_poc_lt_curr +
> > > +               decode_params->num_poc_st_curr_before +
> > > +               decode_params->num_poc_st_curr_after;
> > > +       /*
> > > +        * Set max_ref_frames to non-zero to avoid HW hang when decoding
> > > +        * badly marked I-frames.
> > > +        */
> > > +       max_ref_frames = max_ref_frames ? max_ref_frames : 1;
> > > +       hantro_reg_write(vpu, hevc_num_ref_frames, max_ref_frames);
> > > +       hantro_reg_write(vpu, hevc_filter_over_slices,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED));
> > > +       hantro_reg_write(vpu, hevc_filter_over_tiles,
> > > +                        !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
> > > +
> > > +       /*
> > > +        * Write POC count diff from current pic. For frame decoding only compute
> > > +        * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
> > > +        */
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
> > > +               char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
> > > +
> > > +               hantro_reg_write(vpu, cur_poc[i], poc_diff);
> > > +       }
> > > +
> > > +       if (i < ARRAY_SIZE(cur_poc)) {
> > > +               /*
> > > +                * After the references, fill one entry pointing to itself,
> > > +                * i.e. difference is zero.
> > > +                */
> > > +               hantro_reg_write(vpu, cur_poc[i], 0);
> > > +               i++;
> > > +       }
> > > +
> > > +       /* Fill the rest with the current picture */
> > > +       for (; i < ARRAY_SIZE(cur_poc); i++)
> > > +               hantro_reg_write(vpu, cur_poc[i], decode_params->pic_order_cnt_val);
> > > +
> > > +       set_ref_pic_list(ctx);
> > > +
> > > +       /* Only the reference pictures that are still in use are kept */
> > > +       ctx->hevc_dec.ref_bufs_used = 0;
> > > +
> > > +       /* Set up addresses of DPB buffers */
> > > +       dpb_longterm_e = 0;
> > > +       for (i = 0; i < decode_params->num_active_dpb_entries &&
> > > +            i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
> > > +               luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
> > > +               if (!luma_addr)
> > > +                       return -ENOMEM;
> > > +
> > > +               chroma_addr = luma_addr + cr_offset;
> > > +               mv_addr = luma_addr + mv_offset;
> > > +
> > > +               if (dpb[i].rps == V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR)
> > > +                       dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
> > > +
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), mv_addr);
> > > +       }
> > > +
> > > +       luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val);
> > > +       if (!luma_addr)
> > > +               return -ENOMEM;
> > > +
> > > +       chroma_addr = luma_addr + cr_offset;
> > > +       mv_addr = luma_addr + mv_offset;
> > > +
> > > +       hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_REG_DMV_REF(i++), mv_addr);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST, luma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_CHR, chroma_addr);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_DST_MV, mv_addr);
> > > +
> > > +       hantro_hevc_ref_remove_unused(ctx);
> > > +
> > > +       for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
> > > +               hantro_write_addr(vpu, HEVC_REG_ADDR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_CHR_REF(i), 0);
> > > +               hantro_write_addr(vpu, HEVC_REG_DMV_REF(i), 0);
> > > +       }
> > > +
> > > +       hantro_reg_write(vpu, hevc_refer_lterm_e, dpb_longterm_e);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void set_buffers(struct hantro_ctx *ctx)
> > > +{
> > > +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       dma_addr_t src_dma, dst_dma;
> > > +       u32 src_len, src_buf_len;
> > > +
> > > +       src_buf = hantro_get_src_buf(ctx);
> > > +       dst_buf = hantro_get_dst_buf(ctx);
> > > +
> > > +       /* Source (stream) buffer. */
> > > +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> > > +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> > > +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> > > +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> > > +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> > > +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> > > +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> > > +
> > > +       /* Destination (decoded frame) buffer. */
> > > +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> > > +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> > > +}
> > > +
> > > +void hantro_g2_check_idle(struct hantro_dev *vpu)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < 3; i++) {
> > > +               u32 status;
> > > +
> > > +               /* Make sure the VPU is idle */
> > > +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> > > +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> > > +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> > How about we clean up this pr_warn: use either v4l2_warn or dev_warn, and
> > change the message to "device still running, aborting" (I personally dislike
> > the abort metaphor, but I guess it's OK here).
> 
> Ok
> 
> > 
> > > +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> > > +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int ret;
> > > +
> > > +       hantro_g2_check_idle(vpu);
> > > +
> > > +       /* Prepare HEVC decoder context. */
> > > +       ret = hantro_hevc_dec_prepare_run(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       /* Configure hardware registers. */
> > > +       set_params(ctx);
> > > +
> > > +       /* set reference pictures */
> > > +       ret = set_ref(ctx);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       set_buffers(ctx);
> > > +       prepare_tile_info_buffer(ctx);
> > > +
> > > +       hantro_end_prepare_run(ctx);
> > > +
> > > +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> > > +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> > > +
> > > +       /* Don't disable output */
> > > +       hantro_reg_write(vpu, hevc_out_dis, 0);
> > > +
> > > +       /* Don't compress buffers */
> > > +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> > > +
> > > +       /* use NV12 as output format */
> > > +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
> > > +
> > > +       /* Bus width and max burst */
> > > +       hantro_reg_write(vpu, hevc_buswidth, BUS_WIDTH_128);
> > > +       hantro_reg_write(vpu, hevc_max_burst, 16);
> > > +
> > > +       /* Swap */
> > > +       hantro_reg_write(vpu, hevc_strm_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_dirmv_swap, 0xf);
> > > +       hantro_reg_write(vpu, hevc_compress_swap, 0xf);
> > > +
> > > +       /* Start decoding! */
> > > +       vdpu_write(vpu, HEVC_REG_INTERRUPT_DEC_E, HEVC_REG_INTERRUPT);
> > > +
> > > +       return 0;
> > > +}
> > > diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > new file mode 100644
> > > index 000000000000..a361c9ba911d
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_g2_regs.h
> > > @@ -0,0 +1,198 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * Copyright (c) 2021, Collabora
> > > + *
> > > + * Author: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > + */
> > > +
> > > +#ifndef HANTRO_G2_REGS_H_
> > > +#define HANTRO_G2_REGS_H_
> > > +
> > > +#include "hantro.h"
> > > +
> > > +#define G2_SWREG(nr)   ((nr) * 4)
> > > +
> > > +#define HEVC_DEC_REG(name, base, shift, mask) \
> > > +       static const struct hantro_reg _hevc_##name[] = { \
> > > +               { G2_SWREG(base), (shift), (mask) } \
> > > +       }; \
> > > +       static const struct hantro_reg __maybe_unused *hevc_##name = &_hevc_##name[0];
> > > +
> > > +#define HEVC_REG_VERSION               G2_SWREG(0)
> > > +
> > > +#define HEVC_REG_INTERRUPT             G2_SWREG(1)
> > > +#define HEVC_REG_INTERRUPT_DEC_RDY_INT BIT(12)
> > > +#define HEVC_REG_INTERRUPT_DEC_ABORT_E BIT(5)
> > > +#define HEVC_REG_INTERRUPT_DEC_IRQ_DIS BIT(4)
> > > +#define HEVC_REG_INTERRUPT_DEC_E       BIT(0)
> > > +
> > > +HEVC_DEC_REG(strm_swap,                2, 28,  0xf)
> > > +HEVC_DEC_REG(dirmv_swap,       2, 20,  0xf)
> > > +
> > > +HEVC_DEC_REG(mode,               3, 27, 0x1f)
> > > +HEVC_DEC_REG(compress_swap,      3, 20, 0xf)
> > > +HEVC_DEC_REG(ref_compress_bypass, 3, 17, 0x1)
> > > +HEVC_DEC_REG(out_rs_e,           3, 16, 0x1)
> > > +HEVC_DEC_REG(out_dis,            3, 15, 0x1)
> > > +HEVC_DEC_REG(out_filtering_dis,   3, 14, 0x1)
> > > +HEVC_DEC_REG(write_mvs_e,        3, 12, 0x1)
> > > +
> > > +HEVC_DEC_REG(pic_width_in_cbs, 4, 19,  0x1ff)
> > > +HEVC_DEC_REG(pic_height_in_cbs,        4, 6,   0x1ff)
> > > +HEVC_DEC_REG(num_ref_frames,   4, 0,   0x1f)
> > > +
> > > +HEVC_DEC_REG(scaling_list_e,   5, 24,  0x1)
> > > +HEVC_DEC_REG(cb_qp_offset,     5, 19,  0x1f)
> > > +HEVC_DEC_REG(cr_qp_offset,     5, 14,  0x1f)
> > > +HEVC_DEC_REG(sign_data_hide,   5, 12,  0x1)
> > > +HEVC_DEC_REG(tempor_mvp_e,     5, 11,  0x1)
> > > +HEVC_DEC_REG(max_cu_qpd_depth, 5, 5,   0x3f)
> > > +HEVC_DEC_REG(cu_qpd_e,         5, 4,   0x1)
> > > +
> > > +HEVC_DEC_REG(stream_len,       6, 0,   0xffffffff)
> > > +
> > > +HEVC_DEC_REG(cabac_init_present, 7, 31, 0x1)
> > > +HEVC_DEC_REG(weight_pred_e,     7, 28, 0x1)
> > > +HEVC_DEC_REG(weight_bipr_idc,   7, 26, 0x3)
> > > +HEVC_DEC_REG(filter_over_slices, 7, 25, 0x1)
> > > +HEVC_DEC_REG(filter_over_tiles,  7, 24, 0x1)
> > > +HEVC_DEC_REG(asym_pred_e,       7, 23, 0x1)
> > > +HEVC_DEC_REG(sao_e,             7, 22, 0x1)
> > > +HEVC_DEC_REG(pcm_filt_d,        7, 21, 0x1)
> > > +HEVC_DEC_REG(slice_chqp_present, 7, 20, 0x1)
> > > +HEVC_DEC_REG(dependent_slice,   7, 19, 0x1)
> > > +HEVC_DEC_REG(filter_override,   7, 18, 0x1)
> > > +HEVC_DEC_REG(strong_smooth_e,   7, 17, 0x1)
> > > +HEVC_DEC_REG(filt_offset_beta,  7, 12, 0x1f)
> > > +HEVC_DEC_REG(filt_offset_tc,    7, 7,  0x1f)
> > > +HEVC_DEC_REG(slice_hdr_ext_e,   7, 6,  0x1)
> > > +HEVC_DEC_REG(slice_hdr_ext_bits, 7, 3, 0x7)
> > > +
> > > +HEVC_DEC_REG(const_intra_e,     8, 31, 0x1)
> > > +HEVC_DEC_REG(filt_ctrl_pres,    8, 30, 0x1)
> > > +HEVC_DEC_REG(idr_pic_e,                 8, 16, 0x1)
> > > +HEVC_DEC_REG(bit_depth_pcm_y,   8, 12, 0xf)
> > > +HEVC_DEC_REG(bit_depth_pcm_c,   8, 8,  0xf)
> > > +HEVC_DEC_REG(bit_depth_y_minus8, 8, 6,  0x3)
> > > +HEVC_DEC_REG(bit_depth_c_minus8, 8, 4,  0x3)
> > > +HEVC_DEC_REG(output_8_bits,     8, 3,  0x1)
> > > +
> > > +HEVC_DEC_REG(refidx1_active,   9, 19,  0x1f)
> > > +HEVC_DEC_REG(refidx0_active,   9, 14,  0x1f)
> > > +HEVC_DEC_REG(hdr_skip_length,  9, 0,   0x3fff)
> > > +
> > > +HEVC_DEC_REG(start_code_e,     10, 31, 0x1)
> > > +HEVC_DEC_REG(init_qp,          10, 24, 0x3f)
> > > +HEVC_DEC_REG(num_tile_cols,    10, 19, 0x1f)
> > > +HEVC_DEC_REG(num_tile_rows,    10, 14, 0x1f)
> > > +HEVC_DEC_REG(tile_e,           10, 1,  0x1)
> > > +HEVC_DEC_REG(entropy_sync_e,   10, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(refer_lterm_e,    12, 16, 0xffff)
> > > +HEVC_DEC_REG(min_cb_size,      12, 13, 0x7)
> > > +HEVC_DEC_REG(max_cb_size,      12, 10, 0x7)
> > > +HEVC_DEC_REG(min_pcm_size,     12, 7,  0x7)
> > > +HEVC_DEC_REG(max_pcm_size,     12, 4,  0x7)
> > > +HEVC_DEC_REG(pcm_e,            12, 3,  0x1)
> > > +HEVC_DEC_REG(transform_skip,   12, 2,  0x1)
> > > +HEVC_DEC_REG(transq_bypass,    12, 1,  0x1)
> > > +HEVC_DEC_REG(list_mod_e,       12, 0,  0x1)
> > > +
> > > +HEVC_DEC_REG(min_trb_size,       13, 13, 0x7)
> > > +HEVC_DEC_REG(max_trb_size,       13, 10, 0x7)
> > > +HEVC_DEC_REG(max_intra_hierdepth, 13, 7,  0x7)
> > > +HEVC_DEC_REG(max_inter_hierdepth, 13, 4,  0x7)
> > > +HEVC_DEC_REG(parallel_merge,     13, 0,  0xf)
> > > +
> > > +HEVC_DEC_REG(rlist_f0,         14, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f1,         14, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f2,         14, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b0,         14, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b1,         14, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b2,         14, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f3,         15, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f4,         15, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f5,         15, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b3,         15, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b4,         15, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b5,         15, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f6,         16, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f7,         16, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f8,         16, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b6,         16, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b7,         16, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b8,         16, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f9,         17, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f10,                17, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f11,                17, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b9,         17, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b10,                17, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b11,                17, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f12,                18, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_f13,                18, 10, 0x1f)
> > > +HEVC_DEC_REG(rlist_f14,                18, 20, 0x1f)
> > > +HEVC_DEC_REG(rlist_b12,                18, 5,  0x1f)
> > > +HEVC_DEC_REG(rlist_b13,                18, 15, 0x1f)
> > > +HEVC_DEC_REG(rlist_b14,                18, 25, 0x1f)
> > > +
> > > +HEVC_DEC_REG(rlist_f15,                19, 0,  0x1f)
> > > +HEVC_DEC_REG(rlist_b15,                19, 5,  0x1f)
> > > +
> > > +HEVC_DEC_REG(partial_ctb_x,    20, 31, 0x1)
> > > +HEVC_DEC_REG(partial_ctb_y,    20, 30, 0x1)
> > > +HEVC_DEC_REG(pic_width_4x4,    20, 16, 0xfff)
> > > +HEVC_DEC_REG(pic_height_4x4,   20, 0,  0xfff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_00,       46, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_01,       46, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_02,       46, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_03,       46, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_04,       47, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_05,       47, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_06,       47, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_07,       47, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_08,       48, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_09,       48, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_10,       48, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_11,       48, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(cur_poc_12,       49, 24, 0xff)
> > > +HEVC_DEC_REG(cur_poc_13,       49, 16, 0xff)
> > > +HEVC_DEC_REG(cur_poc_14,       49, 8,  0xff)
> > > +HEVC_DEC_REG(cur_poc_15,       49, 0,  0xff)
> > > +
> > > +HEVC_DEC_REG(apf_threshold,    55, 0,  0xffff)
> > > +
> > > +HEVC_DEC_REG(clk_gate_e,       58, 16, 0x1)
> > > +HEVC_DEC_REG(buswidth,         58, 8,  0x7)
> > > +HEVC_DEC_REG(max_burst,                58, 0,  0xff)
> > > +
> > > +#define HEVC_REG_CONFIG                                G2_SWREG(58)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_E         BIT(16)
> > > +#define HEVC_REG_CONFIG_DEC_CLK_GATE_IDLE_E    BIT(17)
> > > +
> > > +#define HEVC_ADDR_DST          (G2_SWREG(65))
> > > +#define HEVC_REG_ADDR_REF(i)   (G2_SWREG(67)  + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_CHR      (G2_SWREG(99))
> > > +#define HEVC_REG_CHR_REF(i)    (G2_SWREG(101) + ((i) * 0x8))
> > > +#define HEVC_ADDR_DST_MV       (G2_SWREG(133))
> > > +#define HEVC_REG_DMV_REF(i)    (G2_SWREG(135) + ((i) * 0x8))
> > > +#define HEVC_ADDR_TILE_SIZE    (G2_SWREG(167))
> > > +#define HEVC_ADDR_STR          (G2_SWREG(169))
> > > +#define HEVC_SCALING_LIST      (G2_SWREG(171))
> > > +#define HEVC_RASTER_SCAN       (G2_SWREG(175))
> > > +#define HEVC_RASTER_SCAN_CHR   (G2_SWREG(177))
> > > +#define HEVC_TILE_FILTER       (G2_SWREG(179))
> > > +#define HEVC_TILE_SAO          (G2_SWREG(181))
> > > +#define HEVC_TILE_BSD          (G2_SWREG(183))
> > > +
> > > +HEVC_DEC_REG(strm_buffer_len,  258, 0, 0xffffffff)
> > > +HEVC_DEC_REG(strm_start_offset,        259, 0, 0xffffffff)
> > > +
> > > +#endif
> > > diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
> > > new file mode 100644
> > > index 000000000000..8e319a837ff3
> > > --- /dev/null
> > > +++ b/drivers/staging/media/hantro/hantro_hevc.c
> > > @@ -0,0 +1,321 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Hantro VPU HEVC codec driver
> > > + *
> > > + * Copyright (C) 2020 Safran Passenger Innovations LLC
> > > + */
> > > +
> > > +#include <linux/types.h>
> > > +#include <media/v4l2-mem2mem.h>
> > > +
> > > +#include "hantro.h"
> > > +#include "hantro_hw.h"
> > > +
> > > +#define VERT_FILTER_RAM_SIZE 8 /* bytes per pixel row */
> > > +/*
> > > + * BSD control data of current picture at tile border
> > > + * 128 bits per 4x4 tile = 128/(8*4) bytes per row
> > > + */
> > > +#define BSD_CTRL_RAM_SIZE 4 /* bytes per pixel row */
> > > +/* tile border coefficients of filter */
> > > +#define VERT_SAO_RAM_SIZE 48 /* bytes per pixel */
> > > +
> > > +#define MAX_TILE_COLS 20
> > > +#define MAX_TILE_ROWS 22
> > > +
> > > +#define UNUSED_REF     -1
> > > +
> > > +#define G2_ALIGN               16
> > > +#define MC_WORD_SIZE           32
> > > +
> > > +size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
> > > +
> > > +       return sps->pic_width_in_luma_samples *
> > > +               sps->pic_height_in_luma_samples * bytes_per_pixel;
> > > +}
> > > +
> > > +size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +
> > > +       return ALIGN((cr_offset * 3) / 2, G2_ALIGN) + MC_WORD_SIZE;
> > > +}
> > > +
> > > +static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps)
> > > +{
> > > +       u32 pic_width_in_ctb64 = (sps->pic_width_in_luma_samples + (1 << 8) - 1) >> 8;
> > > +       u32 pic_height_in_ctb64 = (sps->pic_height_in_luma_samples  + (1 << 8) - 1) >> 8;
> > > +       size_t mv_size;
> > > +
> > > +       mv_size = (pic_width_in_ctb64 * pic_height_in_ctb64 *
> > > +                 (1 << (2 * (8 - 4))) * 16) + 32;
> > > +
> > > +       vpu_debug(4, "%dx%d (CTBs) %lu MV bytes\n",
> > > +                 pic_width_in_ctb64, pic_height_in_ctb64, mv_size);
> > > +
> > > +       return mv_size;
> > > +}
> > > +
> > > +static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx)
> > > +{
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +
> > > +       return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps);
> > > +}
> > > +
> > > +static void hantro_hevc_ref_free(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       int i;
> > > +
> > > +       /* Just tag buffer as unused, do not free them */
> > This comment seems wrong.
> 
> You are right, I will remove it.
> 
> > 
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs[i].cpu) {
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > Is this memset clearing the buffer required? If we're getting artifacts
> > from previous decodes, then that would be more of a bug somewhere.
> 
> The buffer is already cleared after it is allocated or reused, so I can remove this one.
> 
> > 
> > > +                       dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size,
> > > +                                         hevc_dec->ref_bufs[i].cpu,
> > > +                                         hevc_dec->ref_bufs[i].dma);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++)
> > > +               hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
> > > +}
> > > +
> > > +dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
> > > +                                  int poc)
> > > +{
> > > +       struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
> > > +       int i;
> > > +
> > > +       /* Find the reference buffer among the already known ones */
> > > +       for (i = 0;  i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == poc) {
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       return hevc_dec->ref_bufs[i].dma;
> > > +               }
> > > +       }
> > > +
> > > +       /* Allocate a new reference buffer */
> > > +       for (i = 0; i < NUM_REF_PICTURES; i++) {
> > > +               if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
> > > +                       if (!hevc_dec->ref_bufs[i].cpu) {
> > > +                               struct hantro_dev *vpu = ctx->dev;
> > > +
> > > +                               hevc_dec->ref_bufs[i].cpu =
> > > +                                       dma_alloc_coherent(vpu->dev,
> > > +                                                          hantro_hevc_ref_size(ctx),
> > > +                                                          &hevc_dec->ref_bufs[i].dma,
> > > +                                                          GFP_KERNEL);
> > Is there any reason why we need to allocate reference buffers and MV contiguously?
> 
> It is done like that in the IMX reference code, and it makes managing the
> reference frames and MVs simpler.
> 
> > 
> > > +                               if (!hevc_dec->ref_bufs[i].cpu)
> > > +                                       return 0;
> > > +
> > > +                               hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx);
> > > +                       }
> > > +                       hevc_dec->ref_bufs_used |= 1 << i;
> > > +                       memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx));
> > 
> > I believe the coherent allocation is to be able to clear each reference, but is this
> > really needed? I recall maybe only the MV buffer needs clearing, maybe you can try that?
> > 
> > Also, if that's the case, then allocating the MV buffer separately will allow
> > to not allocate the reference buffers coherently (note that we use NO_MAPPING
> > in the vb2_queue, so the vb2_buffers shouldn't be coherent).
> 
> Those sound like good possible optimizations, but I'm not at that stage yet.
> I would rather keep the driver in this fairly functional state and improve it later.
> I think the patches are already large and complex enough as they are.
> 

Fair enough. I think it's great to have a first working
version :)

Could you add a comment for this, especially at the
memsets and the dma_alloc_coherent (or optionally
at the header of this .c file), in case someone
wants to revisit this topic?
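
For reference, here is a rough userspace sketch (not driver code) of what
that single coherent allocation covers, as read from the helpers quoted
above. The struct and function names and the ALIGN_UP macro are made up
for the example; only the arithmetic is taken from
hantro_hevc_chroma_offset(), hantro_hevc_motion_vectors_offset() and
hantro_hevc_mv_size().

```c
#include <stddef.h>
#include <stdint.h>

#define G2_ALIGN      16
#define MC_WORD_SIZE  32
/* Hypothetical helper, stands in for the kernel's ALIGN() */
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical stand-in for the SPS fields the patch reads */
struct ref_dims {
	uint32_t width;                 /* pic_width_in_luma_samples */
	uint32_t height;                /* pic_height_in_luma_samples */
	uint8_t  bit_depth_luma_minus8;
};

/* Size of the luma plane, i.e. where chroma starts in the buffer */
size_t ref_chroma_offset(const struct ref_dims *s)
{
	size_t bytes_per_pixel = s->bit_depth_luma_minus8 == 0 ? 1 : 2;

	return (size_t)s->width * s->height * bytes_per_pixel;
}

/* Luma + chroma (3/2 of luma), aligned, plus one MC word: MV data starts here */
size_t ref_mv_offset(const struct ref_dims *s)
{
	return ALIGN_UP(ref_chroma_offset(s) * 3 / 2, G2_ALIGN) + MC_WORD_SIZE;
}

/* Motion vector area, same rounding and constants as in the patch */
size_t ref_mv_size(const struct ref_dims *s)
{
	uint32_t w = (s->width + (1 << 8) - 1) >> 8;
	uint32_t h = (s->height + (1 << 8) - 1) >> 8;

	return (size_t)w * h * (1 << (2 * (8 - 4))) * 16 + 32;
}

/* Total bytes allocated (and cleared by the memset) per reference buffer */
size_t ref_total_size(const struct ref_dims *s)
{
	return ref_mv_offset(s) + ref_mv_size(s);
}
```

With these formulas a 1920x1080 8-bit stream gives an MV offset of 3110432
bytes and an MV area of 163872 bytes, so each reference buffer is 3274304
bytes, all of which the memset clears. If only the MV area needs clearing,
that is about 5% of the buffer.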

Thanks a lot!
Ezequiel 



end of thread, other threads:[~2021-03-16 20:37 UTC | newest]

Thread overview: 66+ messages
2021-03-03 11:39 [PATCH v4 00/11] Add HANTRO G2/HEVC decoder support for IMX8MQ Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 01/11] media: hevc: Add fields and flags for hevc PPS Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 02/11] media: hevc: Add decode params control Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 03/11] media: hantro: change hantro_codec_ops run prototype to return errors Benjamin Gaignard
2021-03-03 21:56   ` Ezequiel Garcia
2021-03-05  9:24     ` Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 04/11] media: hantro: Define HEVC codec profiles and supported features Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 05/11] media: hantro: Add a field to distinguish the hardware versions Benjamin Gaignard
2021-03-03 22:05   ` Ezequiel Garcia
2021-03-05  9:27     ` Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 06/11] media: uapi: Add a control for HANTRO driver Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 07/11] media: hantro: Introduce G2/HEVC decoder Benjamin Gaignard
2021-03-16 18:46   ` Ezequiel Garcia
2021-03-16 20:19     ` Benjamin Gaignard
2021-03-16 20:35       ` Ezequiel Garcia
2021-03-03 11:39 ` [PATCH v4 08/11] media: hantro: handle V4L2_PIX_FMT_HEVC_SLICE control Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 09/11] media: hantro: IMX8M: add variant for G2/HEVC codec Benjamin Gaignard
2021-03-03 22:08   ` Ezequiel Garcia
2021-03-05  9:32     ` Benjamin Gaignard
2021-03-03 11:39 ` [PATCH v4 10/11] dt-bindings: media: nxp,imx8mq-vpu: Update bindings Benjamin Gaignard
2021-03-08 20:08   ` Rob Herring
2021-03-03 11:39 ` [PATCH v4 11/11] arm64: dts: imx8mq: Add node to G2 hardware Benjamin Gaignard
