dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Add pixel formats used in Synatpics SoC
@ 2022-08-08 16:27 Hsia-Jun Li
  2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-08 16:27 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-arm-kernel, tfiga, tzimmermann, sebastian.hesselbarth,
	airlied, linux-kernel, linux-media, sakari.ailus,
	Hsia-Jun(Randy) Li, laurent.pinchart, ribalda, hverkuil-cisco,
	mchehab, jszhang, ezequiel

From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>

Those pixel formats are used in Synaptics's VideoSmart series SoCs,
likes VS640, VS680.  I just disclose the pixel formats used in the video
codecs and display pipeline this time. Actually any device with a MTR
module could support those tiled and compressed pixel formats. The more
detail about MTR module could be found in the first patch of this serial
of mail.

We may not be able to post any drivers here in a short time, the most of
work in this platform is done in the Trusted Execution Environment and
we didn't use the optee framework.

Please notice that, the memory planes used for video codecs would be 5
when the compression is invoked while it would be 4 for display, the
extra planes in the video codecs is for the decoding internally usage,
it can't append the luma or chroma buffer as many other drivers do,
because this buffer could be only accessed by the video codecs itself,
it requests a different memory security attributes. Any other reason is
described in the v4l pixel formats's patch. I don't know whether a
different numbers of memory planes between drm and v4l2 is acceptable.

I only posted the compression fourcc for the v4l2, because it is really
hard to put the uncompression version of pixel formats under the fourcc.
I would be better that we could have something likes format modifers in
drm here.

https://synaptics.com/products/multimedia-solutions

Hsia-Jun(Randy) Li (2):
  drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  [WIP]: media: Add Synaptics compressed tiled format

 drivers/media/v4l2-core/v4l2-common.c |  1 +
 drivers/media/v4l2-core/v4l2-ioctl.c  |  2 ++
 include/uapi/drm/drm_fourcc.h         | 49 +++++++++++++++++++++++++++
 include/uapi/linux/videodev2.h        |  2 ++
 4 files changed, 54 insertions(+)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  2022-08-08 16:27 [PATCH 0/2] Add pixel formats used in Synatpics SoC Hsia-Jun Li
@ 2022-08-08 16:27 ` Hsia-Jun Li
  2022-08-18  6:07   ` Tomasz Figa
  2022-08-18 23:16   ` Laurent Pinchart
  2022-08-08 16:27 ` [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format Hsia-Jun Li
  2022-08-18 23:08 ` [PATCH 0/2] Add pixel formats used in Synatpics SoC Laurent Pinchart
  2 siblings, 2 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-08 16:27 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-arm-kernel, tfiga, tzimmermann, sebastian.hesselbarth,
	airlied, linux-kernel, linux-media, sakari.ailus,
	Hsia-Jun(Randy) Li, laurent.pinchart, ribalda, hverkuil-cisco,
	mchehab, jszhang, ezequiel

From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>

Memory Traffic Reduction(MTR) is a module in Synaptics
VideoSmart platform could process lossless compression image
and cache the tile memory line.

Those modifiers only record the parameters would effort pixel
layout or memory layout. Whether physical memory page mapping
is used is not a part of format.

We would allocate the same size of memory for uncompressed
and compressed luma and chroma data, while the compressed buffer
would request two extra planes holding the metadata for
the decompression.

The reason why we need to allocate the same size of memory for
the compressed frame:
1. The compression ratio is not fixed and differnt platforms could
use a different compression protocol. These protocols are complete
vendor proprietaries, the other device won't be able to use them.
It is not necessary to define the version of them here.

2. Video codec could discard the compression attribute when the
intra block copy applied to this frame. It would waste lots of
time on re-allocation.

I am wondering if it is better to add an addtional plane property to
describe whether the current framebuffer is compressed?
While the compression flag is still a part of format modifier,
because it would have two extra meta data planes in the compression
version.

Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
---
 include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 0206f812c569..b67884e8bc69 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -381,6 +381,7 @@ extern "C" {
 #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
 #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
 #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
+#define DRM_FORMAT_MOD_VENDOR_SYNAPTICS 0x0b
 
 /* add more to the end as needed */
 
@@ -1452,6 +1453,54 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
 #define AMD_FMT_MOD_CLEAR(field) \
 	(~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
 
+/*
+ * Synaptics VideoSmart modifiers
+ *
+ *       Macro
+ * Bits  Param Description
+ * ----  ----- -----------------------------------------------------------------
+ *
+ *  7:0  f     Scan direction description.
+ *
+ *               0 = Invalid
+ *               1 = V4, the scan would always start from vertical for 4 pixel
+ *                   then move back to the start pixel of the next horizontal
+ *                   direction.
+ *               2 = Reserved for future use.
+ *
+ * 15:8  m     The times of pattern repeat in the right angle direction from
+ *             the first scan direction.
+ *
+ * 19:16 p     The padding bits after the whole scan, could be zero.
+ *
+ * 35:20 -     Reserved for future use.  Must be zero.
+ *
+ * 36:36 c     Compression flag.
+ *
+ * 55:37 -     Reserved for future use.  Must be zero.
+ *
+ */
+
+#define DRM_FORMAT_MOD_SYNA_V4_TILED		fourcc_mod_code(SYNAPTICS, 1)
+
+#define DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(f, m, p, c) \
+	fourcc_mod_code(SYNAPTICS, (((f) & 0xff) | \
+				 (((m) & 0xff) << 8) | \
+				 (((p) & 0xf) << 16) | \
+				 (((c) & 0x1) << 36)))
+
+#define DRM_FORMAT_MOD_SYNA_V4H1 \
+	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 0)
+
+#define DRM_FORMAT_MOD_SYNA_V4H3P8 \
+	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 0)
+
+#define DRM_FORMAT_MOD_SYNA_V4H1_COMPRESSED \
+	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 1)
+
+#define DRM_FORMAT_MOD_SYNA_V4H3P8_COMPRESSED \
+	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 1)
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-08 16:27 [PATCH 0/2] Add pixel formats used in Synatpics SoC Hsia-Jun Li
  2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
@ 2022-08-08 16:27 ` Hsia-Jun Li
  2022-08-18  6:06   ` Tomasz Figa
  2022-08-18 23:08 ` [PATCH 0/2] Add pixel formats used in Synatpics SoC Laurent Pinchart
  2 siblings, 1 reply; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-08 16:27 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-arm-kernel, tfiga, tzimmermann, sebastian.hesselbarth,
	airlied, linux-kernel, linux-media, sakari.ailus,
	Hsia-Jun(Randy) Li, laurent.pinchart, ribalda, hverkuil-cisco,
	mchehab, jszhang, ezequiel

From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>

The most of detail has been written in the drm.
Please notice that the tiled formats here request
one more plane for storing the motion vector metadata.
This buffer won't be compressed, so you can't append
it to luma or chroma plane.

Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
---
 drivers/media/v4l2-core/v4l2-common.c | 1 +
 drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
 include/uapi/linux/videodev2.h        | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
index e0fbe6ba4b6c..f645278b3055 100644
--- a/drivers/media/v4l2-core/v4l2-common.c
+++ b/drivers/media/v4l2-core/v4l2-common.c
@@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
 		{ .format = V4L2_PIX_FMT_SGBRG12,	.pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
 		{ .format = V4L2_PIX_FMT_SGRBG12,	.pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
 		{ .format = V4L2_PIX_FMT_SRGGB12,	.pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
+		{ .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
 	};
 	unsigned int i;
 
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index e6fd355a2e92..8f65964aff08 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
 		case V4L2_PIX_FMT_MT21C:	descr = "Mediatek Compressed Format"; break;
 		case V4L2_PIX_FMT_QC08C:	descr = "QCOM Compressed 8-bit Format"; break;
 		case V4L2_PIX_FMT_QC10C:	descr = "QCOM Compressed 10-bit Format"; break;
+		case V4L2_PIX_FMT_NV12M_V4H1C:	descr = "Synaptics Compressed 8-bit tiled Format";break;
+		case V4L2_PIX_FMT_NV12M_10_V4H3P8C:	descr = "Synaptics Compressed 10-bit tiled Format";break;
 		default:
 			if (fmt->description[0])
 				return;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 01e630f2ec78..7e928cb69e7c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -661,6 +661,8 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
 #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
 #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
+#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
+#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
 
 /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
 #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-08 16:27 ` [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format Hsia-Jun Li
@ 2022-08-18  6:06   ` Tomasz Figa
  2022-08-18  6:33     ` Hsia-Jun Li
  0 siblings, 1 reply; 24+ messages in thread
From: Tomasz Figa @ 2022-08-18  6:06 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, ezequiel, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, laurent.pinchart

Hi Randy,

On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>
> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>
> The most of detail has been written in the drm.
> Please notice that the tiled formats here request
> one more plane for storing the motion vector metadata.
> This buffer won't be compressed, so you can't append
> it to luma or chroma plane.

Does the motion vector buffer need to be exposed to userspace? Is the
decoder stateless (requires userspace to specify the reference frames)
or stateful (manages the entire decoding process internally)?

Best regards,
Tomasz

>
> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> ---
>  drivers/media/v4l2-core/v4l2-common.c | 1 +
>  drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>  include/uapi/linux/videodev2.h        | 2 ++
>  3 files changed, 5 insertions(+)
>
> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> index e0fbe6ba4b6c..f645278b3055 100644
> --- a/drivers/media/v4l2-core/v4l2-common.c
> +++ b/drivers/media/v4l2-core/v4l2-common.c
> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>                 { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>                 { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>                 { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>         };
>         unsigned int i;
>
> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> index e6fd355a2e92..8f65964aff08 100644
> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>                 case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>                 case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>                 case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>                 default:
>                         if (fmt->description[0])
>                                 return;
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 01e630f2ec78..7e928cb69e7c 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>  #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>  #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>  #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>
>  /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
>  #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
@ 2022-08-18  6:07   ` Tomasz Figa
  2022-08-18  6:49     ` Hsia-Jun Li
  2022-08-18 23:16   ` Laurent Pinchart
  1 sibling, 1 reply; 24+ messages in thread
From: Tomasz Figa @ 2022-08-18  6:07 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, ezequiel, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, laurent.pinchart

Hi Randy,

On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>
> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>
> Memory Traffic Reduction(MTR) is a module in Synaptics
> VideoSmart platform could process lossless compression image
> and cache the tile memory line.
>
> Those modifiers only record the parameters would effort pixel
> layout or memory layout. Whether physical memory page mapping
> is used is not a part of format.
>
> We would allocate the same size of memory for uncompressed
> and compressed luma and chroma data, while the compressed buffer
> would request two extra planes holding the metadata for
> the decompression.
>
> The reason why we need to allocate the same size of memory for
> the compressed frame:
> 1. The compression ratio is not fixed and differnt platforms could
> use a different compression protocol. These protocols are complete
> vendor proprietaries, the other device won't be able to use them.
> It is not necessary to define the version of them here.
>
> 2. Video codec could discard the compression attribute when the
> intra block copy applied to this frame. It would waste lots of
> time on re-allocation.
>
> I am wondering if it is better to add an addtional plane property to
> describe whether the current framebuffer is compressed?
> While the compression flag is still a part of format modifier,
> because it would have two extra meta data planes in the compression
> version.
>
> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> ---
>  include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index 0206f812c569..b67884e8bc69 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -381,6 +381,7 @@ extern "C" {
>  #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
>  #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
>  #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
> +#define DRM_FORMAT_MOD_VENDOR_SYNAPTICS 0x0b
>
>  /* add more to the end as needed */
>
> @@ -1452,6 +1453,54 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
>  #define AMD_FMT_MOD_CLEAR(field) \
>         (~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
>
> +/*
> + * Synaptics VideoSmart modifiers
> + *
> + *       Macro
> + * Bits  Param Description
> + * ----  ----- -----------------------------------------------------------------
> + *
> + *  7:0  f     Scan direction description.
> + *
> + *               0 = Invalid
> + *               1 = V4, the scan would always start from vertical for 4 pixel
> + *                   then move back to the start pixel of the next horizontal
> + *                   direction.
> + *               2 = Reserved for future use.

I guess 2..255 are all reserved for future use?

> + *
> + * 15:8  m     The times of pattern repeat in the right angle direction from
> + *             the first scan direction.
> + *
> + * 19:16 p     The padding bits after the whole scan, could be zero.

What is the meaning of "scan" and "whole scan" here?

Best regards,
Tomasz

> + *
> + * 35:20 -     Reserved for future use.  Must be zero.
> + *
> + * 36:36 c     Compression flag.
> + *
> + * 55:37 -     Reserved for future use.  Must be zero.
> + *
> + */
> +
> +#define DRM_FORMAT_MOD_SYNA_V4_TILED           fourcc_mod_code(SYNAPTICS, 1)
> +
> +#define DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(f, m, p, c) \
> +       fourcc_mod_code(SYNAPTICS, (((f) & 0xff) | \
> +                                (((m) & 0xff) << 8) | \
> +                                (((p) & 0xf) << 16) | \
> +                                (((c) & 0x1) << 36)))
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H1 \
> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 0)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H3P8 \
> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 0)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H1_COMPRESSED \
> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 1)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H3P8_COMPRESSED \
> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 1)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-18  6:06   ` Tomasz Figa
@ 2022-08-18  6:33     ` Hsia-Jun Li
  2022-08-18 23:13       ` Laurent Pinchart
  2022-08-19 15:26       ` Nicolas Dufresne
  0 siblings, 2 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-18  6:33 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: linux-arm-kernel, ezequiel, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, laurent.pinchart



On 8/18/22 14:06, Tomasz Figa wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Hi Randy,
> 
> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>
>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>
>> The most of detail has been written in the drm.
>> Please notice that the tiled formats here request
>> one more plane for storing the motion vector metadata.
>> This buffer won't be compressed, so you can't append
>> it to luma or chroma plane.
> 
> Does the motion vector buffer need to be exposed to userspace? Is the
> decoder stateless (requires userspace to specify the reference frames)
> or stateful (manages the entire decoding process internally)?
> 
No, users don't need to access them at all. Just they need a different 
dma-heap.

You would only get the stateful version of both encoder and decoder.
> Best regards,
> Tomasz
> 
>>
>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>> ---
>>   drivers/media/v4l2-core/v4l2-common.c | 1 +
>>   drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>   include/uapi/linux/videodev2.h        | 2 ++
>>   3 files changed, 5 insertions(+)
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>> index e0fbe6ba4b6c..f645278b3055 100644
>> --- a/drivers/media/v4l2-core/v4l2-common.c
>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>                  { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>                  { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>                  { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>          };
>>          unsigned int i;
>>
>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>> index e6fd355a2e92..8f65964aff08 100644
>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>                  case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>                  case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>                  case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>                  default:
>>                          if (fmt->description[0])
>>                                  return;
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index 01e630f2ec78..7e928cb69e7c 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>   #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>   #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>   #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>
>>   /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=vmpysqneiHK3UXcq6UOewdMwobFa70zKB3RuOgYT02aFw9fCs6qd7j-U1sYSey79&s=yblzF1GwanMEJFC3yt9nBAQjaaAEJKKlNgj4k64v5eE&e=   */
>>   #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>> --
>> 2.17.1
>>

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  2022-08-18  6:07   ` Tomasz Figa
@ 2022-08-18  6:49     ` Hsia-Jun Li
  0 siblings, 0 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-18  6:49 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: linux-arm-kernel, ezequiel, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, laurent.pinchart



On 8/18/22 14:07, Tomasz Figa wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Hi Randy,
> 
> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>
>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>
>> Memory Traffic Reduction(MTR) is a module in Synaptics
>> VideoSmart platform could process lossless compression image
>> and cache the tile memory line.
>>
>> Those modifiers only record the parameters would effort pixel
>> layout or memory layout. Whether physical memory page mapping
>> is used is not a part of format.
>>
>> We would allocate the same size of memory for uncompressed
>> and compressed luma and chroma data, while the compressed buffer
>> would request two extra planes holding the metadata for
>> the decompression.
>>
>> The reason why we need to allocate the same size of memory for
>> the compressed frame:
>> 1. The compression ratio is not fixed and differnt platforms could
>> use a different compression protocol. These protocols are complete
>> vendor proprietaries, the other device won't be able to use them.
>> It is not necessary to define the version of them here.
>>
>> 2. Video codec could discard the compression attribute when the
>> intra block copy applied to this frame. It would waste lots of
>> time on re-allocation.
>>
>> I am wondering if it is better to add an addtional plane property to
>> describe whether the current framebuffer is compressed?
>> While the compression flag is still a part of format modifier,
>> because it would have two extra meta data planes in the compression
>> version.
>>
>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>> ---
>>   include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++++++++++
>>   1 file changed, 49 insertions(+)
>>
>> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
>> index 0206f812c569..b67884e8bc69 100644
>> --- a/include/uapi/drm/drm_fourcc.h
>> +++ b/include/uapi/drm/drm_fourcc.h
>> @@ -381,6 +381,7 @@ extern "C" {
>>   #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
>>   #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
>>   #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
>> +#define DRM_FORMAT_MOD_VENDOR_SYNAPTICS 0x0b
>>
>>   /* add more to the end as needed */
>>
>> @@ -1452,6 +1453,54 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
>>   #define AMD_FMT_MOD_CLEAR(field) \
>>          (~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
>>
>> +/*
>> + * Synaptics VideoSmart modifiers
>> + *
>> + *       Macro
>> + * Bits  Param Description
>> + * ----  ----- -----------------------------------------------------------------
>> + *
>> + *  7:0  f     Scan direction description.
>> + *
>> + *               0 = Invalid
>> + *               1 = V4, the scan would always start from vertical for 4 pixel
>> + *                   then move back to the start pixel of the next horizontal
>> + *                   direction.
>> + *               2 = Reserved for future use.
> 
> I guess 2..255 are all reserved for future use?
Likes the Intel has y-tile and x-tile.
> 
>> + *
>> + * 15:8  m     The times of pattern repeat in the right angle direction from
>> + *             the first scan direction.
>> + *
>> + * 19:16 p     The padding bits after the whole scan, could be zero.
> 
> What is the meaning of "scan" and "whole scan" here?
For example a NV15 tiled format,
(0, 0) (0, 1) (0, 2)
(1, 0) (1, 1)
(2, 0) (2, 1)
(3, 0)

(0, 0) (1, 0) (2, 0) (3, 0) (0, 1) (1, 1) (2, 1) ... (3, 2) are 120bits, 
then fill it with an extra 8 bits, (0, 3) would be placed that the first 
128bits in the memory.

Besides, I found even with four 64bits modifiers, it is not possible to 
store all the compression options we need there. I need to borrow what 
Intel did, hiding the tile flags somewhere(I would not use the userspace 
gmmlib way, our codecs is based on v4l2, even drm framework may not be a 
good place).

Although the compression options could affect the memory layout but 
userspace really don't need to know that.
> 
> Best regards,
> Tomasz
> 
>> + *
>> + * 35:20 -     Reserved for future use.  Must be zero.
>> + *
>> + * 36:36 c     Compression flag.
>> + *
>> + * 55:37 -     Reserved for future use.  Must be zero.
>> + *
>> + */
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4_TILED           fourcc_mod_code(SYNAPTICS, 1)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(f, m, p, c) \
>> +       fourcc_mod_code(SYNAPTICS, (((f) & 0xff) | \
>> +                                (((m) & 0xff) << 8) | \
>> +                                (((p) & 0xf) << 16) | \
>> +                                (((c) & 0x1) << 36)))
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H1 \
>> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 0)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H3P8 \
>> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 0)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H1_COMPRESSED \
>> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 1)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H3P8_COMPRESSED \
>> +       DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 1)
>> +
>>   #if defined(__cplusplus)
>>   }
>>   #endif
>> --
>> 2.17.1
>>

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] Add pixel formats used in Synatpics SoC
  2022-08-08 16:27 [PATCH 0/2] Add pixel formats used in Synatpics SoC Hsia-Jun Li
  2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
  2022-08-08 16:27 ` [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format Hsia-Jun Li
@ 2022-08-18 23:08 ` Laurent Pinchart
  2022-08-18 23:28   ` Hsia-Jun Li
  2 siblings, 1 reply; 24+ messages in thread
From: Laurent Pinchart @ 2022-08-18 23:08 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, tfiga, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, ezequiel

Hi Hsia-Jun,

On Tue, Aug 09, 2022 at 12:27:48AM +0800, Hsia-Jun Li wrote:
> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> 
> Those pixel formats are used in Synaptics's VideoSmart series SoCs,
> likes VS640, VS680.  I just disclose the pixel formats used in the video
> codecs and display pipeline this time. Actually any device with a MTR
> module could support those tiled and compressed pixel formats. The more
> detail about MTR module could be found in the first patch of this serial
> of mail.
> 
> We may not be able to post any drivers here in a short time, the most of
> work in this platform is done in the Trusted Execution Environment and
> we didn't use the optee framework.

Is that so for the display side too, or only for the video decoder ?

> Please notice that, the memory planes used for video codecs would be 5
> when the compression is invoked while it would be 4 for display, the
> extra planes in the video codecs is for the decoding internally usage,
> it can't append the luma or chroma buffer as many other drivers do,
> because this buffer could be only accessed by the video codecs itself,
> it requests a different memory security attributes. Any other reason is
> described in the v4l pixel formats's patch. I don't know whether a
> different numbers of memory planes between drm and v4l2 is acceptable.

I don't think that's a problem as such, as long as both the V4L2 and DRM
formats make sense on their own.

> I only posted the compression fourcc for the v4l2, because it is really
> hard to put the uncompression version of pixel formats under the fourcc.
> I would be better that we could have something likes format modifers in
> drm here.

Agreed, we need modifiers support in V4L2. This has been discussed
previously ([1]), and a proposal ([2]) has been submitted two years ago,
it needs to be revived.

[1] https://lore.kernel.org/linux-media/20170821155203.GB38943@e107564-lin.cambridge.arm.com/
[2] https://lore.kernel.org/linux-media/20200804192939.2251988-1-helen.koike@collabora.com/

> https://synaptics.com/products/multimedia-solutions
> 
> Hsia-Jun(Randy) Li (2):
>   drm/fourcc: Add Synaptics VideoSmart tiled modifiers
>   [WIP]: media: Add Synaptics compressed tiled format
> 
>  drivers/media/v4l2-core/v4l2-common.c |  1 +
>  drivers/media/v4l2-core/v4l2-ioctl.c  |  2 ++
>  include/uapi/drm/drm_fourcc.h         | 49 +++++++++++++++++++++++++++
>  include/uapi/linux/videodev2.h        |  2 ++
>  4 files changed, 54 insertions(+)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-18  6:33     ` Hsia-Jun Li
@ 2022-08-18 23:13       ` Laurent Pinchart
  2022-08-18 23:51         ` Hsia-Jun Li
  2022-08-19 15:28         ` Nicolas Dufresne
  2022-08-19 15:26       ` Nicolas Dufresne
  1 sibling, 2 replies; 24+ messages in thread
From: Laurent Pinchart @ 2022-08-18 23:13 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, sebastian.hesselbarth, airlied, linux-kernel,
	ribalda, linux-media, Tomasz Figa, dri-devel, tzimmermann,
	sakari.ailus, hverkuil-cisco, mchehab, jszhang, ezequiel

On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> On 8/18/22 14:06, Tomasz Figa wrote:
> > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> >>
> >> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> >>
> >> The most of detail has been written in the drm.

This patch still needs a description of the format, which should go to
Documentation/userspace-api/media/v4l/.

> >> Please notice that the tiled formats here request
> >> one more plane for storing the motion vector metadata.
> >> This buffer won't be compressed, so you can't append
> >> it to luma or chroma plane.
> > 
> > Does the motion vector buffer need to be exposed to userspace? Is the
> > decoder stateless (requires userspace to specify the reference frames)
> > or stateful (manages the entire decoding process internally)?
> 
> No, users don't need to access them at all. Just they need a different 
> dma-heap.
> 
> You would only get the stateful version of both encoder and decoder.

Shouldn't the motion vectors be stored in a separate V4L2 buffer,
submitted through a different queue then ?

> >> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> >> ---
> >>   drivers/media/v4l2-core/v4l2-common.c | 1 +
> >>   drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> >>   include/uapi/linux/videodev2.h        | 2 ++
> >>   3 files changed, 5 insertions(+)
> >>
> >> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> >> index e0fbe6ba4b6c..f645278b3055 100644
> >> --- a/drivers/media/v4l2-core/v4l2-common.c
> >> +++ b/drivers/media/v4l2-core/v4l2-common.c
> >> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> >>                  { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>                  { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>                  { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> >>          };
> >>          unsigned int i;
> >>
> >> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> >> index e6fd355a2e92..8f65964aff08 100644
> >> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> >> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> >> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> >>                  case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> >>                  case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> >>                  case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> >> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> >> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> >>                  default:
> >>                          if (fmt->description[0])
> >>                                  return;
> >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >> index 01e630f2ec78..7e928cb69e7c 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> >>   #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> >>   #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> >>   #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> >> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> >> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> >>
> >>   /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
> >>   #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
  2022-08-18  6:07   ` Tomasz Figa
@ 2022-08-18 23:16   ` Laurent Pinchart
  2022-08-18 23:37     ` Hsia-Jun Li
  1 sibling, 1 reply; 24+ messages in thread
From: Laurent Pinchart @ 2022-08-18 23:16 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, tfiga, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, ezequiel

Hi Hsia-Jun,

Thank you for the patch.

On Tue, Aug 09, 2022 at 12:27:49AM +0800, Hsia-Jun Li wrote:
> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> 
> Memory Traffic Reduction(MTR) is a module in Synaptics
> VideoSmart platform could process lossless compression image
> and cache the tile memory line.
> 
> Those modifiers only record the parameters would effort pixel
> layout or memory layout. Whether physical memory page mapping
> is used is not a part of format.
> 
> We would allocate the same size of memory for uncompressed
> and compressed luma and chroma data, while the compressed buffer
> would request two extra planes holding the metadata for
> the decompression.
> 
> The reason why we need to allocate the same size of memory for
> the compressed frame:
> 1. The compression ratio is not fixed and differnt platforms could
> use a different compression protocol. These protocols are complete
> vendor proprietaries, the other device won't be able to use them.
> It is not necessary to define the version of them here.
> 
> 2. Video codec could discard the compression attribute when the
> intra block copy applied to this frame. It would waste lots of
> time on re-allocation.
> 
> I am wondering if it is better to add an addtional plane property to
> describe whether the current framebuffer is compressed?
> While the compression flag is still a part of format modifier,
> because it would have two extra meta data planes in the compression
> version.

Would it possible to show an example of how these modifiers apply to a
particular format (such as NV12 for instance) ? Otherwise I'm having
trouble understanding how they actually work.

> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> ---
>  include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index 0206f812c569..b67884e8bc69 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -381,6 +381,7 @@ extern "C" {
>  #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
>  #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
>  #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
> +#define DRM_FORMAT_MOD_VENDOR_SYNAPTICS 0x0b
>  
>  /* add more to the end as needed */
>  
> @@ -1452,6 +1453,54 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
>  #define AMD_FMT_MOD_CLEAR(field) \
>  	(~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
>  
> +/*
> + * Synaptics VideoSmart modifiers
> + *
> + *       Macro
> + * Bits  Param Description
> + * ----  ----- -----------------------------------------------------------------
> + *
> + *  7:0  f     Scan direction description.
> + *
> + *               0 = Invalid
> + *               1 = V4, the scan would always start from vertical for 4 pixel
> + *                   then move back to the start pixel of the next horizontal
> + *                   direction.
> + *               2 = Reserved for future use.
> + *
> + * 15:8  m     The times of pattern repeat in the right angle direction from
> + *             the first scan direction.
> + *
> + * 19:16 p     The padding bits after the whole scan, could be zero.
> + *
> + * 35:20 -     Reserved for future use.  Must be zero.
> + *
> + * 36:36 c     Compression flag.
> + *
> + * 55:37 -     Reserved for future use.  Must be zero.
> + *
> + */
> +
> +#define DRM_FORMAT_MOD_SYNA_V4_TILED		fourcc_mod_code(SYNAPTICS, 1)
> +
> +#define DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(f, m, p, c) \
> +	fourcc_mod_code(SYNAPTICS, (((f) & 0xff) | \
> +				 (((m) & 0xff) << 8) | \
> +				 (((p) & 0xf) << 16) | \
> +				 (((c) & 0x1) << 36)))
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H1 \
> +	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 0)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H3P8 \
> +	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 0)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H1_COMPRESSED \
> +	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 1)
> +
> +#define DRM_FORMAT_MOD_SYNA_V4H3P8_COMPRESSED \
> +	DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 1)
> +
>  #if defined(__cplusplus)
>  }
>  #endif

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] Add pixel formats used in Synatpics SoC
  2022-08-18 23:08 ` [PATCH 0/2] Add pixel formats used in Synatpics SoC Laurent Pinchart
@ 2022-08-18 23:28   ` Hsia-Jun Li
  0 siblings, 0 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-18 23:28 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: linux-arm-kernel, tfiga, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, ezequiel



On 8/19/22 07:08, Laurent Pinchart wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Hi Hsia-Jun,
> 
> On Tue, Aug 09, 2022 at 12:27:48AM +0800, Hsia-Jun Li wrote:
>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>
>> Those pixel formats are used in Synaptics's VideoSmart series SoCs,
>> likes VS640, VS680.  I just disclose the pixel formats used in the video
>> codecs and display pipeline this time. Actually any device with a MTR
>> module could support those tiled and compressed pixel formats. The more
>> detail about MTR module could be found in the first patch of this serial
>> of mail.
>>
>> We may not be able to post any drivers here in a short time, the most of
>> work in this platform is done in the Trusted Execution Environment and
>> we didn't use the optee framework.
> 
> Is that so for the display side too, or only for the video decoder ?
These pixel formats are using in both video decoder and display(Not the 
GPU). Besides, ISP and NPU in vs680 support some patterns of them.

Please notice that after I reviewed the compression options of our 
platform, I found using modifies are not enough to store all the 
compression options here. I would post a second version here.

I may use the same way that Intel, I would try to disclose more details 
here, hoping we could find a better way to describe them.
> 
>> Please notice that, the memory planes used for video codecs would be 5
>> when the compression is invoked while it would be 4 for display, the
>> extra planes in the video codecs is for the decoding internally usage,
>> it can't append the luma or chroma buffer as many other drivers do,
>> because this buffer could be only accessed by the video codecs itself,
>> it requests a different memory security attributes. Any other reason is
>> described in the v4l pixel formats's patch. I don't know whether a
>> different numbers of memory planes between drm and v4l2 is acceptable.
> 
> I don't think that's a problem as such, as long as both the V4L2 and DRM
> formats make sense on their own.
> 
>> I only posted the compression fourcc for the v4l2, because it is really
>> hard to put the uncompression version of pixel formats under the fourcc.
>> I would be better that we could have something likes format modifers in
>> drm here.
> 
> Agreed, we need modifiers support in V4L2. This has been discussed
> previously ([1]), and a proposal ([2]) has been submitted two years ago,
> it needs to be revived.
Thank you, I have found those v4l2_ext_pix_format, I would relay my 
comment in the email that posting synaptics v4l2 pixel formats.
> 
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dmedia_20170821155203.GB38943-40e107564-2Dlin.cambridge.arm.com_&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=Ktu-e-R1Mn89Laxioh6RlL6Y2aycZ9NrJTIyONaDdRQvnlv-Nd570KldQ51vmigK&s=_7eMTIYwWUOWkXijcRfotLJlpR7G5yx-ZXuTwh9uZw4&e=
> [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dmedia_20200804192939.2251988-2D1-2Dhelen.koike-40collabora.com_&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=Ktu-e-R1Mn89Laxioh6RlL6Y2aycZ9NrJTIyONaDdRQvnlv-Nd570KldQ51vmigK&s=f1dbc5ciUeIkO6VMtlRuEvXqJad2NsoaDBFyNUsSdpg&e=
> 
>> https://synaptics.com/products/multimedia-solutions
>>
>> Hsia-Jun(Randy) Li (2):
>>    drm/fourcc: Add Synaptics VideoSmart tiled modifiers
>>    [WIP]: media: Add Synaptics compressed tiled format
>>
>>   drivers/media/v4l2-core/v4l2-common.c |  1 +
>>   drivers/media/v4l2-core/v4l2-ioctl.c  |  2 ++
>>   include/uapi/drm/drm_fourcc.h         | 49 +++++++++++++++++++++++++++
>>   include/uapi/linux/videodev2.h        |  2 ++
>>   4 files changed, 54 insertions(+)
> 
> --
> Regards,
> 
> Laurent Pinchart

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers
  2022-08-18 23:16   ` Laurent Pinchart
@ 2022-08-18 23:37     ` Hsia-Jun Li
  0 siblings, 0 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-18 23:37 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: linux-arm-kernel, tfiga, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, ezequiel



On 8/19/22 07:16, Laurent Pinchart wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Hi Hsia-Jun,
> 
> Thank you for the patch.
> 
> On Tue, Aug 09, 2022 at 12:27:49AM +0800, Hsia-Jun Li wrote:
>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>
>> Memory Traffic Reduction(MTR) is a module in Synaptics
>> VideoSmart platform could process lossless compression image
>> and cache the tile memory line.
>>
>> Those modifiers only record the parameters would effort pixel
>> layout or memory layout. Whether physical memory page mapping
>> is used is not a part of format.
>>
>> We would allocate the same size of memory for uncompressed
>> and compressed luma and chroma data, while the compressed buffer
>> would request two extra planes holding the metadata for
>> the decompression.
>>
>> The reason why we need to allocate the same size of memory for
>> the compressed frame:
>> 1. The compression ratio is not fixed and differnt platforms could
>> use a different compression protocol. These protocols are complete
>> vendor proprietaries, the other device won't be able to use them.
>> It is not necessary to define the version of them here.
>>
>> 2. Video codec could discard the compression attribute when the
>> intra block copy applied to this frame. It would waste lots of
>> time on re-allocation.
>>
>> I am wondering if it is better to add an addtional plane property to
>> describe whether the current framebuffer is compressed?
>> While the compression flag is still a part of format modifier,
>> because it would have two extra meta data planes in the compression
>> version.
> 
> Would it possible to show an example of how these modifiers apply to a
> particular format (such as NV12 for instance) ? Otherwise I'm having
> trouble understanding how they actually workThis version didn't contains the big tile information as I was 
considering moving them into compression options. The uncompressed tile 
in big tile pixel formats do exist, but I never see them being used.
Anyway, let me just try to describe the most simple tile here.
For example a NV12 tile format(V4H1)
(0, 0) (0, 1) (0, 2)
(1, 0) (1, 1)
(2, 0) (2, 1)
(3, 0)
(0, 0) (1, 0) (2, 0) (3, 0) is a tile, then (0, 1)..(3, 1) after that.
(4, 0) is after (3, y).

For example a NV15 tiled format(V4H3P8),
(0, 0) (0, 1) (0, 2)
(1, 0) (1, 1)
(2, 0) (2, 1)
(3, 0)

(0, 0) (1, 0) (2, 0) (3, 0) (0, 1) (1, 1) (2, 1) ... (3, 2) are 120bits, 
then fill it with an extra 8 bits, (0, 3) would be placed that the first 
128bits in the memory.
> 
>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>> ---
>>   include/uapi/drm/drm_fourcc.h | 49 +++++++++++++++++++++++++++++++++++
>>   1 file changed, 49 insertions(+)
>>
>> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
>> index 0206f812c569..b67884e8bc69 100644
>> --- a/include/uapi/drm/drm_fourcc.h
>> +++ b/include/uapi/drm/drm_fourcc.h
>> @@ -381,6 +381,7 @@ extern "C" {
>>   #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
>>   #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
>>   #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
>> +#define DRM_FORMAT_MOD_VENDOR_SYNAPTICS 0x0b
>>
>>   /* add more to the end as needed */
>>
>> @@ -1452,6 +1453,54 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
>>   #define AMD_FMT_MOD_CLEAR(field) \
>>        (~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
>>
>> +/*
>> + * Synaptics VideoSmart modifiers
>> + *
>> + *       Macro
>> + * Bits  Param Description
>> + * ----  ----- -----------------------------------------------------------------
>> + *
>> + *  7:0  f     Scan direction description.
>> + *
>> + *               0 = Invalid
>> + *               1 = V4, the scan would always start from vertical for 4 pixel
>> + *                   then move back to the start pixel of the next horizontal
>> + *                   direction.
>> + *               2 = Reserved for future use.
>> + *
>> + * 15:8  m     The times of pattern repeat in the right angle direction from
>> + *             the first scan direction.
>> + *
>> + * 19:16 p     The padding bits after the whole scan, could be zero.
>> + *
>> + * 35:20 -     Reserved for future use.  Must be zero.
>> + *
>> + * 36:36 c     Compression flag.
>> + *
>> + * 55:37 -     Reserved for future use.  Must be zero.
>> + *
>> + */
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4_TILED         fourcc_mod_code(SYNAPTICS, 1)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(f, m, p, c) \
>> +     fourcc_mod_code(SYNAPTICS, (((f) & 0xff) | \
>> +                              (((m) & 0xff) << 8) | \
>> +                              (((p) & 0xf) << 16) | \
>> +                              (((c) & 0x1) << 36)))
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H1 \
>> +     DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 0)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H3P8 \
>> +     DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 0)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H1_COMPRESSED \
>> +     DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 1, 0, 1)
>> +
>> +#define DRM_FORMAT_MOD_SYNA_V4H3P8_COMPRESSED \
>> +     DRM_FORMAT_MOD_SYNA_MTR_LINEAR_2D(1, 3, 8, 1)
>> +
>>   #if defined(__cplusplus)
>>   }
>>   #endif
> 
> --
> Regards,
> 
> Laurent Pinchart

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-18 23:13       ` Laurent Pinchart
@ 2022-08-18 23:51         ` Hsia-Jun Li
  2022-08-19 15:28         ` Nicolas Dufresne
  1 sibling, 0 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-18 23:51 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: linux-arm-kernel, sebastian.hesselbarth, airlied, linux-kernel,
	ribalda, linux-media, Tomasz Figa, dri-devel, tzimmermann,
	sakari.ailus, hverkuil-cisco, mchehab, jszhang, ezequiel



On 8/19/22 07:13, Laurent Pinchart wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>> On 8/18/22 14:06, Tomasz Figa wrote:
>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>>>
>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>>>
>>>> The most of detail has been written in the drm.
> 
> This patch still needs a description of the format, which should go to
> Documentation/userspace-api/media/v4l/.

I just want t tell people we need an extra plane for MVTP and I don't 
have enough space here to place all the pixel formats.

Besides, I was thinking a modifer in v4l2_ext_pix_format is not enough.
Let's take a compression NV12 tile format as an example, we need
1. luma planes
2. chroma planes
3. compression meta data for luma
4. compression meta data for chroma
5. mvtp
and a single data planer version would be
1. luma and chroma data
2. compression meta data
3. mvtp

You see, we actually have 3 kind of data here(not including the 
compression options that I am thinking of storing them somewhere else).
> 
>>>> Please notice that the tiled formats here request
>>>> one more plane for storing the motion vector metadata.
>>>> This buffer won't be compressed, so you can't append
>>>> it to luma or chroma plane.
>>>
>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>> decoder stateless (requires userspace to specify the reference frames)
>>> or stateful (manages the entire decoding process internally)?
>>
>> No, users don't need to access them at all. Just they need a different
>> dma-heap.
>>
>> You would only get the stateful version of both encoder and decoder.
> 
> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> submitted through a different queue then ?
Yes, I like that.
Proposal: A third buffer type for the reconstruction buffers in V4L2 M2M 
encoder
https://www.spinics.net/lists/linux-media/msg214565.html

Although the major usage for the decoder here is producing the non-tile 
buffers, the decoder of us could product the NV12 or the pixel formats 
that GPU likes, but it must happen at the same time a frame is decoded.
Still the reference buffer or we call them the real decoded frame would 
stay in a tiled format. More than one queue would be need here.
> 
>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>>>> ---
>>>>    drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>    drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>    include/uapi/linux/videodev2.h        | 2 ++
>>>>    3 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>                   { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>                   { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>                   { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>           };
>>>>           unsigned int i;
>>>>
>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> index e6fd355a2e92..8f65964aff08 100644
>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>                   case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>                   case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>                   case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>                   default:
>>>>                           if (fmt->description[0])
>>>>                                   return;
>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>> --- a/include/uapi/linux/videodev2.h
>>>> +++ b/include/uapi/linux/videodev2.h
>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>    #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>    #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>    #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>
>>>>    /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=87M5Aa5fG3kdTTjlJrLIgv0E7O10QAj_4RqDlVsCAFdbfJzJ_P0s8wkBqaR0VBUO&s=8AsoiLPt9hQnn4ta51tT-RUXRLoKKYrePdAwtdvxuDo&e=   */
>>>>    #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> 
> --
> Regards,
> 
> Laurent Pinchart

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-18  6:33     ` Hsia-Jun Li
  2022-08-18 23:13       ` Laurent Pinchart
@ 2022-08-19 15:26       ` Nicolas Dufresne
  1 sibling, 0 replies; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-19 15:26 UTC (permalink / raw)
  To: Hsia-Jun Li, Tomasz Figa
  Cc: linux-arm-kernel, ezequiel, sebastian.hesselbarth, airlied,
	linux-kernel, linux-media, sakari.ailus, dri-devel, tzimmermann,
	ribalda, hverkuil-cisco, mchehab, jszhang, laurent.pinchart

Le jeudi 18 août 2022 à 14:33 +0800, Hsia-Jun Li a écrit :
> 
> On 8/18/22 14:06, Tomasz Figa wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > 
> > 
> > Hi Randy,
> > 
> > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> > > 
> > > From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> > > 
> > > The most of detail has been written in the drm.
> > > Please notice that the tiled formats here request
> > > one more plane for storing the motion vector metadata.
> > > This buffer won't be compressed, so you can't append
> > > it to luma or chroma plane.
> > 
> > Does the motion vector buffer need to be exposed to userspace? Is the
> > decoder stateless (requires userspace to specify the reference frames)
> > or stateful (manages the entire decoding process internally)?
> > 
> No, users don't need to access them at all. Just they need a different 
> dma-heap.
> 
> You would only get the stateful version of both encoder and decoder.

Can't you just allocate and manage these internally in the kernel driver without
adding kernel APIs ? This is notably what Mediatek and (downstream) RPi HEVC
driver do, as it allow reducing quite a lot the memory usage. In Hantro, we bind
them due to HW limitation.

Nicolas

> > Best regards,
> > Tomasz
> > 
> > > 
> > > Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> > > ---
> > >   drivers/media/v4l2-core/v4l2-common.c | 1 +
> > >   drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> > >   include/uapi/linux/videodev2.h        | 2 ++
> > >   3 files changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > index e0fbe6ba4b6c..f645278b3055 100644
> > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > >                  { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > >                  { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > >                  { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > >          };
> > >          unsigned int i;
> > > 
> > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > index e6fd355a2e92..8f65964aff08 100644
> > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > >                  case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> > >                  case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> > >                  case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> > > +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> > >                  default:
> > >                          if (fmt->description[0])
> > >                                  return;
> > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > index 01e630f2ec78..7e928cb69e7c 100644
> > > --- a/include/uapi/linux/videodev2.h
> > > +++ b/include/uapi/linux/videodev2.h
> > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > >   #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> > >   #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > >   #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> > > 
> > >   /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIBaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=vmpysqneiHK3UXcq6UOewdMwobFa70zKB3RuOgYT02aFw9fCs6qd7j-U1sYSey79&s=yblzF1GwanMEJFC3yt9nBAQjaaAEJKKlNgj4k64v5eE&e=   */
> > >   #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> > > --
> > > 2.17.1
> > > 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-18 23:13       ` Laurent Pinchart
  2022-08-18 23:51         ` Hsia-Jun Li
@ 2022-08-19 15:28         ` Nicolas Dufresne
  2022-08-19 15:44           ` Hsia-Jun Li
  1 sibling, 1 reply; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-19 15:28 UTC (permalink / raw)
  To: Laurent Pinchart, Hsia-Jun Li
  Cc: linux-arm-kernel, sebastian.hesselbarth, airlied, linux-kernel,
	ribalda, linux-media, Tomasz Figa, dri-devel, tzimmermann,
	sakari.ailus, hverkuil-cisco, mchehab, jszhang, ezequiel

Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > On 8/18/22 14:06, Tomasz Figa wrote:
> > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> > > > 
> > > > From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> > > > 
> > > > The most of detail has been written in the drm.
> 
> This patch still needs a description of the format, which should go to
> Documentation/userspace-api/media/v4l/.
> 
> > > > Please notice that the tiled formats here request
> > > > one more plane for storing the motion vector metadata.
> > > > This buffer won't be compressed, so you can't append
> > > > it to luma or chroma plane.
> > > 
> > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > decoder stateless (requires userspace to specify the reference frames)
> > > or stateful (manages the entire decoding process internally)?
> > 
> > No, users don't need to access them at all. Just they need a different 
> > dma-heap.
> > 
> > You would only get the stateful version of both encoder and decoder.
> 
> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> submitted through a different queue then ?

Imho, I believe these should be invisible to users and pooled separately to
reduce the overhead. The number of reference is usually lower then the number of
allocated display buffers.

> 
> > > > Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> > > > ---
> > > >   drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > >   drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> > > >   include/uapi/linux/videodev2.h        | 2 ++
> > > >   3 files changed, 5 insertions(+)
> > > > 
> > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > >                  { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > >                  { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > >                  { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > >          };
> > > >          unsigned int i;
> > > > 
> > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > index e6fd355a2e92..8f65964aff08 100644
> > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > >                  case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> > > >                  case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> > > >                  case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> > > > +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > >                  default:
> > > >                          if (fmt->description[0])
> > > >                                  return;
> > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > --- a/include/uapi/linux/videodev2.h
> > > > +++ b/include/uapi/linux/videodev2.h
> > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > >   #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> > > >   #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > >   #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> > > > 
> > > >   /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
> > > >   #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-19 15:28         ` Nicolas Dufresne
@ 2022-08-19 15:44           ` Hsia-Jun Li
  2022-08-19 19:17             ` Nicolas Dufresne
  2022-08-23  6:05             ` Tomasz Figa
  0 siblings, 2 replies; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-19 15:44 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel



On 8/19/22 23:28, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>>>>
>>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>>>>
>>>>> The most of detail has been written in the drm.
>>
>> This patch still needs a description of the format, which should go to
>> Documentation/userspace-api/media/v4l/.
>>
>>>>> Please notice that the tiled formats here request
>>>>> one more plane for storing the motion vector metadata.
>>>>> This buffer won't be compressed, so you can't append
>>>>> it to luma or chroma plane.
>>>>
>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>> decoder stateless (requires userspace to specify the reference frames)
>>>> or stateful (manages the entire decoding process internally)?
>>>
>>> No, users don't need to access them at all. Just they need a different
>>> dma-heap.
>>>
>>> You would only get the stateful version of both encoder and decoder.
>>
>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>> submitted through a different queue then ?
> 
> Imho, I believe these should be invisible to users and pooled separately to
> reduce the overhead. The number of reference is usually lower then the number of
> allocated display buffers.
> 
You can't. The motion vector buffer can't share with the luma and chroma 
data planes, nor the data plane for the compression meta data.

You could consider this as a security requirement(the memory region for 
the MV could only be accessed by the decoder) or hardware limitation.

It is also not very easy to manage such a large buffer that would change 
when the resolution changed.
>>
>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>>>>> ---
>>>>>    drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>    drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>>    include/uapi/linux/videodev2.h        | 2 ++
>>>>>    3 files changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>                   { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>                   { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>                   { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>           };
>>>>>           unsigned int i;
>>>>>
>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>                   case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>>                   case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>>                   case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>                   default:
>>>>>                           if (fmt->description[0])
>>>>>                                   return;
>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>> --- a/include/uapi/linux/videodev2.h
>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>    #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>>    #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>    #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>>
>>>>>    /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
>>>>>    #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>>
> 

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-19 15:44           ` Hsia-Jun Li
@ 2022-08-19 19:17             ` Nicolas Dufresne
  2022-08-20  0:10               ` Hsia-Jun Li
  2022-08-23  6:05             ` Tomasz Figa
  1 sibling, 1 reply; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-19 19:17 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel

Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
> 
> On 8/19/22 23:28, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > 
> > 
> > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> > > > > > 
> > > > > > From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> > > > > > 
> > > > > > The most of detail has been written in the drm.
> > > 
> > > This patch still needs a description of the format, which should go to
> > > Documentation/userspace-api/media/v4l/.
> > > 
> > > > > > Please notice that the tiled formats here request
> > > > > > one more plane for storing the motion vector metadata.
> > > > > > This buffer won't be compressed, so you can't append
> > > > > > it to luma or chroma plane.
> > > > > 
> > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > or stateful (manages the entire decoding process internally)?
> > > > 
> > > > No, users don't need to access them at all. Just they need a different
> > > > dma-heap.
> > > > 
> > > > You would only get the stateful version of both encoder and decoder.
> > > 
> > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > submitted through a different queue then ?
> > 
> > Imho, I believe these should be invisible to users and pooled separately to
> > reduce the overhead. The number of reference is usually lower then the number of
> > allocated display buffers.
> > 
> You can't. The motion vector buffer can't share with the luma and chroma 
> data planes, nor the data plane for the compression meta data.
> 
> You could consider this as a security requirement(the memory region for 
> the MV could only be accessed by the decoder) or hardware limitation.
> 
> It is also not very easy to manage such a large buffer that would change 
> when the resolution changed.

Your argument are just aiming toward the fact that you should not let the user
allocate these in the first place. They should not be bound to the v4l2 buffer.
Allocate these in your driver, and leave to your user the pixel buffer (and
compress meta) allocation work.

Other driver handle this just fine, if your v4l2 driver implement the v4l2
resolution change mechanism, is should be very simple to manage.

> > > 
> > > > > > Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> > > > > > ---
> > > > > >    drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > >    drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> > > > > >    include/uapi/linux/videodev2.h        | 2 ++
> > > > > >    3 files changed, 5 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > >                   { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > >                   { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > >                   { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > >           };
> > > > > >           unsigned int i;
> > > > > > 
> > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > >                   case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> > > > > >                   case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> > > > > >                   case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > >                   default:
> > > > > >                           if (fmt->description[0])
> > > > > >                                   return;
> > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > >    #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> > > > > >    #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > >    #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> > > > > > 
> > > > > >    /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
> > > > > >    #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-19 19:17             ` Nicolas Dufresne
@ 2022-08-20  0:10               ` Hsia-Jun Li
  2022-08-22 14:15                 ` Nicolas Dufresne
  0 siblings, 1 reply; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-20  0:10 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel



On 8/20/22 03:17, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
>>
>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>>>>>>
>>>>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>>>>>>
>>>>>>> The most of detail has been written in the drm.
>>>>
>>>> This patch still needs a description of the format, which should go to
>>>> Documentation/userspace-api/media/v4l/.
>>>>
>>>>>>> Please notice that the tiled formats here request
>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>> it to luma or chroma plane.
>>>>>>
>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>
>>>>> No, users don't need to access them at all. Just they need a different
>>>>> dma-heap.
>>>>>
>>>>> You would only get the stateful version of both encoder and decoder.
>>>>
>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>> submitted through a different queue then ?
>>>
>>> Imho, I believe these should be invisible to users and pooled separately to
>>> reduce the overhead. The number of reference is usually lower then the number of
>>> allocated display buffers.
>>>
>> You can't. The motion vector buffer can't share with the luma and chroma
>> data planes, nor the data plane for the compression meta data.
>>
>> You could consider this as a security requirement(the memory region for
>> the MV could only be accessed by the decoder) or hardware limitation.
>>
>> It is also not very easy to manage such a large buffer that would change
>> when the resolution changed.
> 
> Your argument are just aiming toward the fact that you should not let the user
> allocate these in the first place. They should not be bound to the v4l2 buffer.
> Allocate these in your driver, and leave to your user the pixel buffer (and
> compress meta) allocation work.
> 
What I want to say is that userspace could allocate buffers then make 
the v4l2 decoder import these buffers, but each planes should come from 
the right DMA-heaps. Usually the userspace would know better the memory 
occupation, it would bring some flexibility here.

Currently, they are another thing bothers me, I need to allocate a small 
piece of memory(less than 128KiB) as the compression metadata buffers as 
I mentioned here. And these pieces of memory should be located in a 
small region, or the performance could be badly hurt, besides, we don't 
support IOMMU for this kind of data.

Any idea about assign a small piece of memory from a pre-allocated 
memory or select region(I don't think I could reserve them in a 
DMA-heap) for a plane in the MMAP type buffer ?

Besides, I am not very satisfied with the dynamic resolution change 
steps if I understand it correct. Buffers reallocation should happen 
when we receive the event not until the drain is done. A resolution 
rising is very common when you are playing a network stream, it would be 
better that the decoder decided how many buffers it need for the 
previous sequence while the userspace could reallocate the reset of 
buffers in the CAPTURE queue.
> Other driver handle this just fine, if your v4l2 driver implement the v4l2
> resolution change mechanism, is should be very simple to manage.
> 
>>>>
>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>>>>>>> ---
>>>>>>>     drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>>     drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>>>>     include/uapi/linux/videodev2.h        | 2 ++
>>>>>>>     3 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>>                    { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>>            };
>>>>>>>            unsigned int i;
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>>                    case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>>                    default:
>>>>>>>                            if (fmt->description[0])
>>>>>>>                                    return;
>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>>     #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>
>>>>>>>     /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
>>>>>>>     #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>>>>
>>>
>>
> 

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-20  0:10               ` Hsia-Jun Li
@ 2022-08-22 14:15                 ` Nicolas Dufresne
  2022-08-23  7:40                   ` Hsia-Jun Li
  0 siblings, 1 reply; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-22 14:15 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel

Le samedi 20 août 2022 à 08:10 +0800, Hsia-Jun Li a écrit :
> 
> On 8/20/22 03:17, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > 
> > 
> > Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
> > > 
> > > On 8/19/22 23:28, Nicolas Dufresne wrote:
> > > > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > > > 
> > > > 
> > > > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> > > > > > > > 
> > > > > > > > From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> > > > > > > > 
> > > > > > > > The most of detail has been written in the drm.
> > > > > 
> > > > > This patch still needs a description of the format, which should go to
> > > > > Documentation/userspace-api/media/v4l/.
> > > > > 
> > > > > > > > Please notice that the tiled formats here request
> > > > > > > > one more plane for storing the motion vector metadata.
> > > > > > > > This buffer won't be compressed, so you can't append
> > > > > > > > it to luma or chroma plane.
> > > > > > > 
> > > > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > > > or stateful (manages the entire decoding process internally)?
> > > > > > 
> > > > > > No, users don't need to access them at all. Just they need a different
> > > > > > dma-heap.
> > > > > > 
> > > > > > You would only get the stateful version of both encoder and decoder.
> > > > > 
> > > > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > > > submitted through a different queue then ?
> > > > 
> > > > Imho, I believe these should be invisible to users and pooled separately to
> > > > reduce the overhead. The number of reference is usually lower then the number of
> > > > allocated display buffers.
> > > > 
> > > You can't. The motion vector buffer can't share with the luma and chroma
> > > data planes, nor the data plane for the compression meta data.
> > > 
> > > You could consider this as a security requirement(the memory region for
> > > the MV could only be accessed by the decoder) or hardware limitation.
> > > 
> > > It is also not very easy to manage such a large buffer that would change
> > > when the resolution changed.
> > 
> > Your argument are just aiming toward the fact that you should not let the user
> > allocate these in the first place. They should not be bound to the v4l2 buffer.
> > Allocate these in your driver, and leave to your user the pixel buffer (and
> > compress meta) allocation work.
> > 
> What I want to say is that userspace could allocate buffers then make 
> the v4l2 decoder import these buffers, but each planes should come from 
> the right DMA-heaps. Usually the userspace would know better the memory 
> occupation, it would bring some flexibility here.
> 
> Currently, they are another thing bothers me, I need to allocate a small 
> piece of memory(less than 128KiB) as the compression metadata buffers as 
> I mentioned here. And these pieces of memory should be located in a 
> small region, or the performance could be badly hurt, besides, we don't 
> support IOMMU for this kind of data.
> 
> Any idea about assign a small piece of memory from a pre-allocated 
> memory or select region(I don't think I could reserve them in a 
> DMA-heap) for a plane in the MMAP type buffer ?

A V4L2 driver should first implement the V4L2 semantic before adding optional
use case like buffer importation. For this reason, your V4L2 driver should know
all the memory requirements and how to allocate that memory. Later on, your
importing driver will have to validate that the userland did it right at
importation. This is to follow V4L2 semantic and security model. If you move
simply trust the userland (gralloc), you are not doing it right.

> 
> Besides, I am not very satisfied with the dynamic resolution change 
> steps if I understand it correct. Buffers reallocation should happen 
> when we receive the event not until the drain is done. A resolution 
> rising is very common when you are playing a network stream, it would be 
> better that the decoder decided how many buffers it need for the 
> previous sequence while the userspace could reallocate the reset of 
> buffers in the CAPTURE queue.
> > Other driver handle this just fine, if your v4l2 driver implement the v4l2
> > resolution change mechanism, is should be very simple to manage.

This is a limitation of the queue design of V4L2. While streaming the buffers
associated with the queue must currently be large enough to support the selected
format. "large enough" in your case is complex, and validation must be
programmed in your driver.

There was discussion on perhaps extending on CREATE_BUFS feature, that let you
allocate at run-time for a different format/resolution (no drivers currently
allow that). The rules around that aren't specified (and will have to be defined
before a driver starts making use of that). Note that to be usable, a
DELETE_BUF(s) ioctl would probably be needed too.

In current state, If your driver can support it, userland does not strictly need
to re-allocate if the resolution is changed to smaller. In most SVC scenarios,
the largest resolution is known in advance, so pre-allocation can happen to the
appropriate resolution and queue size. Re-allocation is then rarely triggered at
run time. Unlike your system, IOMMU system are not as affected by allocation
latency and manages to do gapless transition despite this inefficiency.

Note that all this is pure recommendation. What I'm seeing here is a pixel
format documented with Android assumptions rather then mainline, and sent
without the associated implementation. This simply raises some question on the
viability of the whole. This is not a critic but just some verification that
ensure you are following the V4L2 spec.

> > 
> > > > > 
> > > > > > > > Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> > > > > > > > ---
> > > > > > > >     drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > > > >     drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> > > > > > > >     include/uapi/linux/videodev2.h        | 2 ++
> > > > > > > >     3 files changed, 5 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > > > >                    { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > >                    { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > >                    { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > > > >            };
> > > > > > > >            unsigned int i;
> > > > > > > > 
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > > > >                    case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> > > > > > > >                    case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> > > > > > > >                    case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > > > +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > > > +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > > > >                    default:
> > > > > > > >                            if (fmt->description[0])
> > > > > > > >                                    return;
> > > > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > > > >     #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> > > > > > > >     #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > > > >     #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> > > > > > > > 
> > > > > > > >     /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
> > > > > > > >     #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-19 15:44           ` Hsia-Jun Li
  2022-08-19 19:17             ` Nicolas Dufresne
@ 2022-08-23  6:05             ` Tomasz Figa
  2022-08-23  7:03               ` Hsia-Jun Li
  1 sibling, 1 reply; 24+ messages in thread
From: Tomasz Figa @ 2022-08-23  6:05 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Nicolas Dufresne,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel

On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <Randy.Li@synaptics.com> wrote:
>
>
>
> On 8/19/22 23:28, Nicolas Dufresne wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> >> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> >>> On 8/18/22 14:06, Tomasz Figa wrote:
> >>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> >>>>>
> >>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> >>>>>
> >>>>> The most of detail has been written in the drm.
> >>
> >> This patch still needs a description of the format, which should go to
> >> Documentation/userspace-api/media/v4l/.
> >>
> >>>>> Please notice that the tiled formats here request
> >>>>> one more plane for storing the motion vector metadata.
> >>>>> This buffer won't be compressed, so you can't append
> >>>>> it to luma or chroma plane.
> >>>>
> >>>> Does the motion vector buffer need to be exposed to userspace? Is the
> >>>> decoder stateless (requires userspace to specify the reference frames)
> >>>> or stateful (manages the entire decoding process internally)?
> >>>
> >>> No, users don't need to access them at all. Just they need a different
> >>> dma-heap.
> >>>
> >>> You would only get the stateful version of both encoder and decoder.
> >>
> >> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> >> submitted through a different queue then ?
> >
> > Imho, I believe these should be invisible to users and pooled separately to
> > reduce the overhead. The number of reference is usually lower then the number of
> > allocated display buffers.
> >
> You can't. The motion vector buffer can't share with the luma and chroma
> data planes, nor the data plane for the compression meta data.

I believe what Nicolas is suggesting is to just keep the MV buffer
handling completely separate from video buffers. Just keep a map
between frame buffer and MV buffer in the driver and use the right
buffer when triggering a decode.

>
> You could consider this as a security requirement(the memory region for
> the MV could only be accessed by the decoder) or hardware limitation.
>
> It is also not very easy to manage such a large buffer that would change
> when the resolution changed.

How does it differ from managing additional planes of video buffers?

Best regards,
Tomasz

> >>
> >>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> >>>>> ---
> >>>>>    drivers/media/v4l2-core/v4l2-common.c | 1 +
> >>>>>    drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> >>>>>    include/uapi/linux/videodev2.h        | 2 ++
> >>>>>    3 files changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> >>>>> index e0fbe6ba4b6c..f645278b3055 100644
> >>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
> >>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
> >>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> >>>>>                   { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>>                   { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>>                   { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> >>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> >>>>>           };
> >>>>>           unsigned int i;
> >>>>>
> >>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> index e6fd355a2e92..8f65964aff08 100644
> >>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> >>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> >>>>>                   case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> >>>>>                   case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> >>>>>                   case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> >>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> >>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> >>>>>                   default:
> >>>>>                           if (fmt->description[0])
> >>>>>                                   return;
> >>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >>>>> index 01e630f2ec78..7e928cb69e7c 100644
> >>>>> --- a/include/uapi/linux/videodev2.h
> >>>>> +++ b/include/uapi/linux/videodev2.h
> >>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> >>>>>    #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> >>>>>    #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> >>>>>    #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> >>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> >>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> >>>>>
> >>>>>    /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
> >>>>>    #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> >>
> >
>
> --
> Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-23  6:05             ` Tomasz Figa
@ 2022-08-23  7:03               ` Hsia-Jun Li
  2022-08-23 14:02                 ` Nicolas Dufresne
  0 siblings, 1 reply; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-23  7:03 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Nicolas Dufresne,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel



On 8/23/22 14:05, Tomasz Figa wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <Randy.Li@synaptics.com> wrote:
>>
>>
>>
>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>>>>>>
>>>>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>>>>>>
>>>>>>> The most of detail has been written in the drm.
>>>>
>>>> This patch still needs a description of the format, which should go to
>>>> Documentation/userspace-api/media/v4l/.
>>>>
>>>>>>> Please notice that the tiled formats here request
>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>> it to luma or chroma plane.
>>>>>>
>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>
>>>>> No, users don't need to access them at all. Just they need a different
>>>>> dma-heap.
>>>>>
>>>>> You would only get the stateful version of both encoder and decoder.
>>>>
>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>> submitted through a different queue then ?
>>>
>>> Imho, I believe these should be invisible to users and pooled separately to
>>> reduce the overhead. The number of reference is usually lower then the number of
>>> allocated display buffers.
>>>
>> You can't. The motion vector buffer can't share with the luma and chroma
>> data planes, nor the data plane for the compression meta data.
> 
> I believe what Nicolas is suggesting is to just keep the MV buffer
> handling completely separate from video buffers. Just keep a map
> between frame buffer and MV buffer in the driver and use the right
> buffer when triggering a decode.
> 
>>
>> You could consider this as a security requirement(the memory region for
>> the MV could only be accessed by the decoder) or hardware limitation.
>>
>> It is also not very easy to manage such a large buffer that would change
>> when the resolution changed.
> 
> How does it differ from managing additional planes of video buffers?
I should say I am not against his suggestion if I could make a DMA-heap 
v4l2 allocator merge into kernel in the future. Although I think we need 
two heaps here one for the normal video and one for the secure video, I 
don't have much idea on how to determine whether we are decoding a 
secure or non-secure video here (The design here is that the kernel 
didn't know, only hardware and TEE care about that).

Just one place that I think it would be more simple for me to manage the 
buffer here. When the decoder goes to the drain stage, then the MV 
buffer goes when the data buffer goes and create when the data buffer 
creates.
I know that is not a lot of work to doing the mapping between them. I 
just need to convince the other accepting that do not allocator the MV 
buffer outside.
> 
> Best regards,
> Tomasz
> 
>>>>
>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>>>>>>> ---
>>>>>>>     drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>>     drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>>>>     include/uapi/linux/videodev2.h        | 2 ++
>>>>>>>     3 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>>                    { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>>            };
>>>>>>>            unsigned int i;
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>>                    case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>>                    default:
>>>>>>>                            if (fmt->description[0])
>>>>>>>                                    return;
>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>>     #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>
>>>>>>>     /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
>>>>>>>     #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>>>>
>>>
>>
>> --
>> Hsia-Jun(Randy) Li

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-22 14:15                 ` Nicolas Dufresne
@ 2022-08-23  7:40                   ` Hsia-Jun Li
  2022-08-23 13:44                     ` Nicolas Dufresne
  0 siblings, 1 reply; 24+ messages in thread
From: Hsia-Jun Li @ 2022-08-23  7:40 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel



On 8/22/22 22:15, Nicolas Dufresne wrote:
> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> Le samedi 20 août 2022 à 08:10 +0800, Hsia-Jun Li a écrit :
>>
>> On 8/20/22 03:17, Nicolas Dufresne wrote:
>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>
>>>
>>> Le vendredi 19 août 2022 à 23:44 +0800, Hsia-Jun Li a écrit :
>>>>
>>>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>>>> CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>>>
>>>>>
>>>>> Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
>>>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
>>>>>>>>>
>>>>>>>>> From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
>>>>>>>>>
>>>>>>>>> The most of detail has been written in the drm.
>>>>>>
>>>>>> This patch still needs a description of the format, which should go to
>>>>>> Documentation/userspace-api/media/v4l/.
>>>>>>
>>>>>>>>> Please notice that the tiled formats here request
>>>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>>>> This buffer won't be compressed, so you can't append
>>>>>>>>> it to luma or chroma plane.
>>>>>>>>
>>>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>>>
>>>>>>> No, users don't need to access them at all. Just they need a different
>>>>>>> dma-heap.
>>>>>>>
>>>>>>> You would only get the stateful version of both encoder and decoder.
>>>>>>
>>>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>>>> submitted through a different queue then ?
>>>>>
>>>>> Imho, I believe these should be invisible to users and pooled separately to
>>>>> reduce the overhead. The number of reference is usually lower then the number of
>>>>> allocated display buffers.
>>>>>
>>>> You can't. The motion vector buffer can't share with the luma and chroma
>>>> data planes, nor the data plane for the compression meta data.
>>>>
>>>> You could consider this as a security requirement(the memory region for
>>>> the MV could only be accessed by the decoder) or hardware limitation.
>>>>
>>>> It is also not very easy to manage such a large buffer that would change
>>>> when the resolution changed.
>>>
>>> Your argument are just aiming toward the fact that you should not let the user
>>> allocate these in the first place. They should not be bound to the v4l2 buffer.
>>> Allocate these in your driver, and leave to your user the pixel buffer (and
>>> compress meta) allocation work.
>>>
>> What I want to say is that userspace could allocate buffers then make
>> the v4l2 decoder import these buffers, but each planes should come from
>> the right DMA-heaps. Usually the userspace would know better the memory
>> occupation, it would bring some flexibility here.
>>
>> Currently, they are another thing bothers me, I need to allocate a small
>> piece of memory(less than 128KiB) as the compression metadata buffers as
>> I mentioned here. And these pieces of memory should be located in a
>> small region, or the performance could be badly hurt, besides, we don't
>> support IOMMU for this kind of data.
>>
>> Any idea about assign a small piece of memory from a pre-allocated
>> memory or select region(I don't think I could reserve them in a
>> DMA-heap) for a plane in the MMAP type buffer ?
> 
> A V4L2 driver should first implement the V4L2 semantic before adding optional
> use case like buffer importation. For this reason, your V4L2 driver should know
> all the memory requirements and how to allocate that memory. 
Yes, that is what I intend to. Or I just smuggle those things somewhere.
Later on, your
> importing driver will have to validate that the userland did it right at
> importation. This is to follow V4L2 semantic and security model. If you move
> simply trust the userland (gralloc), you are not doing it right.
> 
Yes, that is what I try to describe in the other thread
https://lore.kernel.org/linux-media/B4B3306F-C3B4-4594-BDF9-4BBC59C628C9@soulik.info/
I don't have the problem that let the userspace decided where and how to 
allocate the memory, but we need a new protocol here to let the 
userspace do it right.
>>
>> Besides, I am not very satisfied with the dynamic resolution change
>> steps if I understand it correct. Buffers reallocation should happen
>> when we receive the event not until the drain is done. A resolution
>> rising is very common when you are playing a network stream, it would be
>> better that the decoder decided how many buffers it need for the
>> previous sequence while the userspace could reallocate the reset of
>> buffers in the CAPTURE queue.
>>> Other driver handle this just fine, if your v4l2 driver implement the v4l2
>>> resolution change mechanism, is should be very simple to manage.
> 
> This is a limitation of the queue design of V4L2. While streaming the buffers
> associated with the queue must currently be large enough to support the selected
> format. "large enough" in your case is complex, and validation must be
> programmed in your driver.
> 
> There was discussion on perhaps extending on CREATE_BUFS feature, that let you
> allocate at run-time for a different format/resolution (no drivers currently
> allow that). The rules around that aren't specified (and will have to be defined
> before a driver starts making use of that). Note that to be usable, a
> DELETE_BUF(s) ioctl would probably be needed too.
> 
> In current state, If your driver can support it, userland does not strictly need
> to re-allocate if the resolution is changed to smaller. In most SVC scenarios,
> the largest resolution is known in advance, so pre-allocation can happen to the
When you play a video from Youtube, you may notice that starting 
resolution is low, then after it received more data knowning the 
bandwidth is enough, it would switch to a higher resolution. I don't 
think it would inform the codecs2 or OMX there is a higher target 
resolution.

Besides, for the case of SVC in a conference system, the remote(gatway) 
would not tell you there is a higer resolution or frame rate because you 
can't receive it in negotiate stage, it could be permanently(device 
capability) or just bandwidth problem. Whether we know there is a higher 
requirement video depends on the transport protocols used here.

The basic idea of SVC is that the low layer didn't depends on the upper 
layer, we can't tell how the bitstream usually.
> appropriate resolution and queue size. Re-allocation is then rarely triggered at
> run time. Unlike your system, IOMMU system are not as affected by allocation
> latency and manages to do gapless transition despite this inefficiency.
> 
> Note that all this is pure recommendation. What I'm seeing here is a pixel
> format documented with Android assumptions rather then mainline, and sent
> without the associated implementation. This simply raises some question on the
Because this implementation is very complex as you could see now, I 
didn't see the exist implementation here could decode DRM video or has 
the security restriction here.

And you see even before this decoder driver is done, we have had enough 
problems, even just the definition of pixel formats and data exchange 
mechanism.
> viability of the whole. This is not a critic but just some verification that
> ensure you are following the V4L2 spec.
I really want to those recommendations here, I want to make everything 
right at the first place. Or we would make a driver we would know it 
won't follow the v4l2 spec.
> 
>>>
>>>>>>
>>>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
>>>>>>>>> ---
>>>>>>>>>      drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>>>>      drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>>>>>>      include/uapi/linux/videodev2.h        | 2 ++
>>>>>>>>>      3 files changed, 5 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>>>>                     { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>>                     { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>>                     { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>>>>             };
>>>>>>>>>             unsigned int i;
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>>>>                     case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>>>>>>                     case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>>>>                     case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
>>>>>>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
>>>>>>>>>                     default:
>>>>>>>>>                             if (fmt->description[0])
>>>>>>>>>                                     return;
>>>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>>>>      #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>>>>>>      #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>>>>      #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>>>
>>>>>>>>>      /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
>>>>>>>>>      #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>>>>>>
>>>>>
>>>>
>>>
>>
> 

-- 
Hsia-Jun(Randy) Li

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-23  7:40                   ` Hsia-Jun Li
@ 2022-08-23 13:44                     ` Nicolas Dufresne
  0 siblings, 0 replies; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-23 13:44 UTC (permalink / raw)
  To: Hsia-Jun Li
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, ribalda, linux-media, Tomasz Figa,
	dri-devel, tzimmermann, sakari.ailus, hverkuil-cisco, mchehab,
	jszhang, ezequiel

Le mardi 23 août 2022 à 15:40 +0800, Hsia-Jun Li a écrit :
> > In current state, If your driver can support it, userland does not strictly
> > need
> > to re-allocate if the resolution is changed to smaller. In most SVC
> > scenarios,
> > the largest resolution is known in advance, so pre-allocation can happen to
> > the
> When you play a video from Youtube, you may notice that starting 
> resolution is low, then after it received more data knowning the 
> bandwidth is enough, it would switch to a higher resolution. I don't 
> think it would inform the codecs2 or OMX there is a higher target 
> resolution.
> 
> Besides, for the case of SVC in a conference system, the remote(gatway) 
> would not tell you there is a higer resolution or frame rate because you 
> can't receive it in negotiate stage, it could be permanently(device 
> capability) or just bandwidth problem. Whether we know there is a higher 
> requirement video depends on the transport protocols used here.
> 
> The basic idea of SVC is that the low layer didn't depends on the upper 
> layer, we can't tell how the bitstream usually.

I'm not saying against the fact the for drivers without IOMMU (hitting directly
into the CMA allocator), allocation latency is massive challenge, and a
mechanism to smoothly reallocate (rather then mass-reallocation) is needed in
the long run. This is what I'm referring to when saying that folks have
considered extending CREATE_BUFS() with a DELETE_BUFS() ioctl.

Note that there is tones of software trickery you can use to mitigate this. The
most simple one is to use CREATE_BUFS() instead of REQBUFS(). Instead of
reallocating all the buffers you need in one go, you would allocate them one by
one. This will distribute allocation latency. For stateful CODEC, most OMX
focused firmware needs to be modified for that, since they stick with the old
OMX spec which did not allow run-time allocation.

Another trick is to use a second codec session. Both stateful/stateless CODEC
have support for concurrent decoding. On the MSE requirement, is that the stream
transition happens only on keyframe boundary. Meaning, there is no need to reuse
the same session, you can create a new decoder in parallel, and that before the
drain is complete (after the event, before the last buffer). This will compress
the "setup" latency, to the cost of some extra memory usage. Specially in the
MSE case, this is nearly always possible since browsers do require support for
more then 1 concurrent decode. This method also works with OMX style CODEC
without any modification.

regards,
Nicolas




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format
  2022-08-23  7:03               ` Hsia-Jun Li
@ 2022-08-23 14:02                 ` Nicolas Dufresne
  0 siblings, 0 replies; 24+ messages in thread
From: Nicolas Dufresne @ 2022-08-23 14:02 UTC (permalink / raw)
  To: Hsia-Jun Li, Tomasz Figa
  Cc: linux-arm-kernel, Laurent Pinchart, sebastian.hesselbarth,
	airlied, linux-kernel, linux-media, sakari.ailus, dri-devel,
	tzimmermann, ribalda, hverkuil-cisco, mchehab, jszhang, ezequiel

Le mardi 23 août 2022 à 15:03 +0800, Hsia-Jun Li a écrit :
> 
> On 8/23/22 14:05, Tomasz Figa wrote:
> > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > 
> > 
> > On Sat, Aug 20, 2022 at 12:44 AM Hsia-Jun Li <Randy.Li@synaptics.com> wrote:
> > > 
> > > 
> > > 
> > > On 8/19/22 23:28, Nicolas Dufresne wrote:
> > > > CAUTION: Email originated externally, do not click links or open attachments unless you recognize the sender and know the content is safe.
> > > > 
> > > > 
> > > > Le vendredi 19 août 2022 à 02:13 +0300, Laurent Pinchart a écrit :
> > > > > On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
> > > > > > On 8/18/22 14:06, Tomasz Figa wrote:
> > > > > > > On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li@synaptics.com> wrote:
> > > > > > > > 
> > > > > > > > From: "Hsia-Jun(Randy) Li" <randy.li@synaptics.com>
> > > > > > > > 
> > > > > > > > The most of detail has been written in the drm.
> > > > > 
> > > > > This patch still needs a description of the format, which should go to
> > > > > Documentation/userspace-api/media/v4l/.
> > > > > 
> > > > > > > > Please notice that the tiled formats here request
> > > > > > > > one more plane for storing the motion vector metadata.
> > > > > > > > This buffer won't be compressed, so you can't append
> > > > > > > > it to luma or chroma plane.
> > > > > > > 
> > > > > > > Does the motion vector buffer need to be exposed to userspace? Is the
> > > > > > > decoder stateless (requires userspace to specify the reference frames)
> > > > > > > or stateful (manages the entire decoding process internally)?
> > > > > > 
> > > > > > No, users don't need to access them at all. Just they need a different
> > > > > > dma-heap.
> > > > > > 
> > > > > > You would only get the stateful version of both encoder and decoder.
> > > > > 
> > > > > Shouldn't the motion vectors be stored in a separate V4L2 buffer,
> > > > > submitted through a different queue then ?
> > > > 
> > > > Imho, I believe these should be invisible to users and pooled separately to
> > > > reduce the overhead. The number of reference is usually lower then the number of
> > > > allocated display buffers.
> > > > 
> > > You can't. The motion vector buffer can't share with the luma and chroma
> > > data planes, nor the data plane for the compression meta data.
> > 
> > I believe what Nicolas is suggesting is to just keep the MV buffer
> > handling completely separate from video buffers. Just keep a map
> > between frame buffer and MV buffer in the driver and use the right
> > buffer when triggering a decode.
> > 
> > > 
> > > You could consider this as a security requirement(the memory region for
> > > the MV could only be accessed by the decoder) or hardware limitation.
> > > 
> > > It is also not very easy to manage such a large buffer that would change
> > > when the resolution changed.
> > 
> > How does it differ from managing additional planes of video buffers?
> I should say I am not against his suggestion if I could make a DMA-heap 
> v4l2 allocator merge into kernel in the future. Although I think we need 
> two heaps here one for the normal video and one for the secure video, I 
> don't have much idea on how to determine whether we are decoding a 
> secure or non-secure video here (The design here is that the kernel 
> didn't know, only hardware and TEE care about that).

Its always nice when "the design" get discussed upstream, so we can raise any
known issues and improve it. Here, not knowing if we are handling secure or non-
secure memory in kernel driver would indeed require external allocation for
everything, and V4L2 does not currently work like this. There is a few use cases
(not all of them might apply to your driver, but they exists).

1. Secondary buffers

When a CODEC is combined with a post-processor, the driver is then responsible
for reference frame allocation. In both known secure memory approach (NXP secure
bit and secondary mmu), the driver must know, as it won't be allowed produce any
non-secure buffer while secure (and vis-versa). It would be very difficult to
make secondary buffers externally allocated, since the fact secondary buffers
are used is no known by userspace. You slightly mention about adding a new queue
type, this seems like an option, though one will have to figure-out how to make
this work in a backward compatible manner.

2. Internally managed feedback buffers

Existing case of feedback buffers is VP9 decoders. I initially thought that
would only be a challenge for stateless decoders, but it turns out that Amlogic
stateful drivers also needs to take care. In VP9, the bitstream is further
compressed using probability obtained through decoding. Those probability can be
further tuned with updates placed in the compressed header. In Amlogic and
existing VP9 stateless decoder, the merging of the feedback and compressed
header updates is done using the CPU, hence that feedback buffer cannot be
secure. With lets say NXP secure domain HW, this is impossible. The OPT-TEE
needs to be involved to abstract the programming of the HW and copy back the
secure buffers to non-secure, making sure it is not being tricked into
delivering a copy of the wrong data. For the MMU approach, no copy is needed,
but to be sure the memory being mapped into the Linux Kernel MMU is the right
one, some level of abstraction of the CODEC is needed.

In short, you need a mix of secure and non-secure memory. This is a huge
challenge that isn't well covered by any secure memory design at the moment, its
not even clear if the HW can work. Remember that these feedback buffers are not
exposed to userspace, hence cannot be allocated there. Recent discussion shows
that NXP might be just giving up on their stateless codec so they can solve this
with a full codec abstraction (stateful codec).

Feedback buffers also exist in stateless encoders, but we don't have yet
existing drivers for that. Encoders also have to deal with secure memory,
notably when encoding from HDCP enabled HDMI receivers. Though this task is
quite likely limited to dedicated system, which can be considered secure as a
whole, time will define this.

> 
> Just one place that I think it would be more simple for me to manage the 
> buffer here. When the decoder goes to the drain stage, then the MV 
> buffer goes when the data buffer goes and create when the data buffer 
> creates.
> I know that is not a lot of work to doing the mapping between them. I 
> just need to convince the other accepting that do not allocator the MV 
> buffer outside.

Its also a big memory saver if you manage to convince them.

> > 
> > Best regards,
> > Tomasz
> > 
> > > > > 
> > > > > > > > Signed-off-by: Hsia-Jun(Randy) Li <randy.li@synaptics.com>
> > > > > > > > ---
> > > > > > > >     drivers/media/v4l2-core/v4l2-common.c | 1 +
> > > > > > > >     drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
> > > > > > > >     include/uapi/linux/videodev2.h        | 2 ++
> > > > > > > >     3 files changed, 5 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > index e0fbe6ba4b6c..f645278b3055 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-common.c
> > > > > > > > @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
> > > > > > > >                    { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > >                    { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > >                    { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
> > > > > > > > +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
> > > > > > > >            };
> > > > > > > >            unsigned int i;
> > > > > > > > 
> > > > > > > > diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > index e6fd355a2e92..8f65964aff08 100644
> > > > > > > > --- a/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
> > > > > > > > @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
> > > > > > > >                    case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
> > > > > > > >                    case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
> > > > > > > >                    case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
> > > > > > > > +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format";break;
> > > > > > > > +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format";break;
> > > > > > > >                    default:
> > > > > > > >                            if (fmt->description[0])
> > > > > > > >                                    return;
> > > > > > > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > > > > > > index 01e630f2ec78..7e928cb69e7c 100644
> > > > > > > > --- a/include/uapi/linux/videodev2.h
> > > > > > > > +++ b/include/uapi/linux/videodev2.h
> > > > > > > > @@ -661,6 +661,8 @@ struct v4l2_pix_format {
> > > > > > > >     #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
> > > > > > > >     #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
> > > > > > > >     #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
> > > > > > > > +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
> > > > > > > > 
> > > > > > > >     /* Bayer formats - see https://urldefense.proofpoint.com/v2/url?u=http-3A__www.siliconimaging.com_RGB-2520Bayer.htm&d=DwIFaQ&c=7dfBJ8cXbWjhc0BhImu8wVIoUFmBzj1s88r8EGyM0UY&r=P4xb2_7biqBxD4LGGPrSV6j-jf3C3xlR7PXU-mLTeZE&m=lkQiuhx0yMAYHGcW-0WaHlF3e2etMHsu-FoNIBdZILGH6FPigwSAmel2vAdcVLkp&s=JKsBzpb_3u9xv52MaMuT4U3T1pPqcObYkpHDBxvcx_4&e=   */
> > > > > > > >     #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
> > > > > 
> > > > 
> > > 
> > > --
> > > Hsia-Jun(Randy) Li
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-08-24 19:16 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-08 16:27 [PATCH 0/2] Add pixel formats used in Synatpics SoC Hsia-Jun Li
2022-08-08 16:27 ` [PATCH 1/2] drm/fourcc: Add Synaptics VideoSmart tiled modifiers Hsia-Jun Li
2022-08-18  6:07   ` Tomasz Figa
2022-08-18  6:49     ` Hsia-Jun Li
2022-08-18 23:16   ` Laurent Pinchart
2022-08-18 23:37     ` Hsia-Jun Li
2022-08-08 16:27 ` [PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format Hsia-Jun Li
2022-08-18  6:06   ` Tomasz Figa
2022-08-18  6:33     ` Hsia-Jun Li
2022-08-18 23:13       ` Laurent Pinchart
2022-08-18 23:51         ` Hsia-Jun Li
2022-08-19 15:28         ` Nicolas Dufresne
2022-08-19 15:44           ` Hsia-Jun Li
2022-08-19 19:17             ` Nicolas Dufresne
2022-08-20  0:10               ` Hsia-Jun Li
2022-08-22 14:15                 ` Nicolas Dufresne
2022-08-23  7:40                   ` Hsia-Jun Li
2022-08-23 13:44                     ` Nicolas Dufresne
2022-08-23  6:05             ` Tomasz Figa
2022-08-23  7:03               ` Hsia-Jun Li
2022-08-23 14:02                 ` Nicolas Dufresne
2022-08-19 15:26       ` Nicolas Dufresne
2022-08-18 23:08 ` [PATCH 0/2] Add pixel formats used in Synatpics SoC Laurent Pinchart
2022-08-18 23:28   ` Hsia-Jun Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).