linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] media: uapi: h264: First batch of adjusments
@ 2019-07-03 12:28 Boris Brezillon
  2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-03 12:28 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding, Boris Brezillon

Hello,

This is a first batch of adjustments to the stateless H264 decoder
uAPI. The first one is about adding support for per-frame decoding,
which is the only mode supported on some codecs (the hantro G1 block
supports per-slice decoding but not in an way that would allow
efficient multiplexing of several decoding contexts).

The second modification drops the P0/B0/B1 ref lists from the
decode_params control. These lists are no longer needed now that we know
we can build them kernel side based on the DPB.

There are few more changes in the pipe, but I'd like to sync with Paul,
Jonas, Jernej and Nicolas before modifying:
* Enforce order of the scaling list (looks like the rockchip and cedrus
  have different expectations)
* Pass top/bottom field info as flags in the DPB entry: the field
  attached to the capture buffer is not accurate as capture bufs might
  contain both top/bottom (meaning they are actually interlaced), but a
  coded frame might contain only one of these fields. Note
  that there's also a problem with the output -> capture field flag
  propagation we have in copy_metadata() because a coded slice might
  contain only data for top or bottom, but the capture frame might
  contain both. Doing capture->field = output->field means we're lying
  about what's inside the capture buffer (not sure we have
  implementation checking that)
* s/dpb/refs/: looks like we're abusing the term DPB which is supposed
  to be implementation specific. What's provided by the bitstream is a
  list of references that will be used to decode a frame
* ... (add your own)

Feel free to comment on these changes and/or propose alternatives.

Regards,

Boris

Changes in v3:
* s/per-{slice,frame}/{slice,frame}-based/ decoding
* Add Paul's R-b on patch 1 and 2

Changes in v2:
* Allow decoding multiple slices in per-slice decoding mode
* Minor doc improvements/fixes

Boris Brezillon (3):
  media: uapi: h264: Clarify our expectations regarding NAL header
    format
  media: uapi: h264: Add the concept of decoding mode
  media: uapi: h264: Get rid of the p0/b0/b1 ref-lists

 .../media/uapi/v4l/ext-ctrls-codec.rst        | 57 +++++++++++++++----
 drivers/media/v4l2-core/v4l2-ctrls.c          |  9 +++
 include/media/h264-ctrls.h                    | 13 +++++
 3 files changed, 69 insertions(+), 10 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-03 12:28 [PATCH v3 0/3] media: uapi: h264: First batch of adjusments Boris Brezillon
@ 2019-07-03 12:28 ` Boris Brezillon
  2019-07-05 16:40   ` Ezequiel Garcia
  2019-07-25 19:26   ` Paul Kocialkowski
  2019-07-03 12:28 ` [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode Boris Brezillon
  2019-07-03 12:28 ` [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Boris Brezillon
  2 siblings, 2 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-03 12:28 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding, Boris Brezillon

Looks like some stateless decoders expect slices to be prefixed with
ANNEX B start codes (they most likely do some kind of bitstream parsing
and/or need that to delimit slices when doing per frame decoding).
Since skipping those start codes for dummy stateless decoders (those
expecting all params to be passed through controls) should be pretty
easy, let's mandate that all slices be prepended with ANNEX B start
codes.

If we ever need to support AVC headers, we can add a new menu control
to select the type of NAL header to use.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
Changes in v3:
* Add Paul's R-b

Changes in v2:
* None
---
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index 7a1947f5be96..3ae1367806cf 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
     documentation, refer to the above specification, unless there is
     an explicit comment stating otherwise.
+    All slices should be prepended with an ANNEX B start code.
 
     .. note::
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode
  2019-07-03 12:28 [PATCH v3 0/3] media: uapi: h264: First batch of adjusments Boris Brezillon
  2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
@ 2019-07-03 12:28 ` Boris Brezillon
  2019-07-22 15:29   ` Hans Verkuil
  2019-07-25 19:20   ` Paul Kocialkowski
  2019-07-03 12:28 ` [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Boris Brezillon
  2 siblings, 2 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-03 12:28 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding, Boris Brezillon

Some stateless decoders don't support per-slice decoding (or at least
not in a way that would make them efficient or easy to use).
Let's expose a menu to control and expose the supported decoding modes.
Drivers are allowed to support only one decoding but they can support
both too.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
Changes in v3:
* s/per-{slice,frame} decoding/{slice,frame}-based decoding/
* Add Paul's R-b

Changes in v2:
* Allow decoding multiple slices in per-slice decoding mode
* Minor doc improvement/fixes
---
 .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
 include/media/h264-ctrls.h                    | 13 +++++
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index 3ae1367806cf..47ba2d057a92 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -1748,6 +1748,14 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - __u32
       - ``size``
       -
+    * - __u32
+      - ``start_byte_offset``
+      - Where the slice payload starts in the output buffer. Useful when
+        operating in frame-based decoding mode and decoding multi-slice
+        content. In this case, the output buffer will contain more than one
+        slice and some codecs need to know where each slice starts. Note that
+        this offsets points to the beginning of the slice which is supposed to
+        contain an ANNEX B start code
     * - __u32
       - ``header_bit_size``
       -
@@ -1931,7 +1939,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
       -
     * - __u16
       - ``num_slices``
-      - Number of slices needed to decode the current frame
+      - Number of slices needed to decode the current frame/field. When
+        operating in slice-based decoding mode (see
+        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
+        should always be set to one
     * - __u16
       - ``nal_ref_idc``
       - NAL reference ID value coming from the NAL Unit header
@@ -2022,6 +2033,40 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
       - 0x00000004
       - The DPB entry is a long term reference frame
 
+``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
+    Specifies the decoding mode to use. Currently exposes slice-based and
+    frame-based decoding but new modes might be added later on.
+
+    .. note::
+
+       This menu control is not yet part of the public kernel API and
+       it is expected to change.
+
+.. c:type:: v4l2_mpeg_video_h264_decoding_mode
+
+.. cssclass:: longtable
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths:       1 1 2
+
+    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
+      - 0
+      - The decoding is done at the slice granularity.
+        v4l2_ctrl_h264_decode_params->num_slices can be set to anything between
+        1 and then number of slices that remain to fully decode the
+        frame/field.
+        The output buffer should contain
+        v4l2_ctrl_h264_decode_params->num_slices slices.
+    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
+      - 1
+      - The decoding is done at the frame granularity.
+        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
+        slices forming a frame.
+        The output buffer should contain all slices needed to decode the
+        frame/field.
+
 .. _v4l2-mpeg-mpeg2:
 
 ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index 471ff5c91f43..70d994be27e1 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -394,6 +394,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		"Explicit",
 		NULL,
 	};
+	static const char * const h264_decoding_mode[] = {
+		"Slice-based",
+		"Frame-based",
+		NULL,
+	};
 	static const char * const mpeg_mpeg2_level[] = {
 		"Low",
 		"Main",
@@ -625,6 +630,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
 		return h264_fp_arrangement_type;
 	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
 		return h264_fmo_map_type;
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
+		return h264_decoding_mode;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 		return mpeg_mpeg2_level;
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
@@ -844,6 +851,7 @@ const char *v4l2_ctrl_get_name(u32 id)
 	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
 	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
 	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
 	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
@@ -1212,6 +1220,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
 	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
 	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
 	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
+	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
 	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
 	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index e1404d78d6ff..206fd5ada620 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -26,6 +26,7 @@
 #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
 #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
 #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
+#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
 
 /* enum v4l2_ctrl_type type values */
 #define V4L2_CTRL_TYPE_H264_SPS			0x0110
@@ -33,6 +34,12 @@
 #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
 #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
 #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
+#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
+
+enum v4l2_mpeg_video_h264_decoding_mode {
+	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
+	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
+};
 
 #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
 #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
@@ -111,6 +118,8 @@ struct v4l2_h264_pred_weight_table {
 	struct v4l2_h264_weight_factors weight_factors[2];
 };
 
+#define V4L2_H264_MAX_SLICES_PER_FRAME			16
+
 #define V4L2_H264_SLICE_TYPE_P				0
 #define V4L2_H264_SLICE_TYPE_B				1
 #define V4L2_H264_SLICE_TYPE_I				2
@@ -125,6 +134,10 @@ struct v4l2_h264_pred_weight_table {
 struct v4l2_ctrl_h264_slice_params {
 	/* Size in bytes, including header */
 	__u32 size;
+
+	/* Where the slice starts in the output buffer (expressed in bytes). */
+	__u32 start_byte_offset;
+
 	/* Offset in bits to slice_data() from the beginning of this slice. */
 	__u32 header_bit_size;
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-03 12:28 [PATCH v3 0/3] media: uapi: h264: First batch of adjusments Boris Brezillon
  2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
  2019-07-03 12:28 ` [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode Boris Brezillon
@ 2019-07-03 12:28 ` Boris Brezillon
  2019-07-03 17:18   ` Nicolas Dufresne
  2019-07-24  3:39   ` Tomasz Figa
  2 siblings, 2 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-03 12:28 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding, Boris Brezillon

Those lists can be extracted from the dpb, let's simplify userspace
life and build that list kernel-side (generic helpers will be provided
for drivers that need this list).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Changes in v3:
* None

Changes in v2:
* None
---
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index 47ba2d057a92..237c8e8e6bca 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -1946,15 +1946,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - __u16
       - ``nal_ref_idc``
       - NAL reference ID value coming from the NAL Unit header
-    * - __u8
-      - ``ref_pic_list_p0[32]``
-      - Backward reference list used by P-frames in the original bitstream order
-    * - __u8
-      - ``ref_pic_list_b0[32]``
-      - Backward reference list used by B-frames in the original bitstream order
-    * - __u8
-      - ``ref_pic_list_b1[32]``
-      - Forward reference list used by B-frames in the original bitstream order
     * - __s32
       - ``top_field_order_cnt``
       - Picture Order Count for the coded top field
-- 
2.21.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-03 12:28 ` [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Boris Brezillon
@ 2019-07-03 17:18   ` Nicolas Dufresne
  2019-07-05 15:24     ` Ezequiel Garcia
  2019-07-24  3:39   ` Tomasz Figa
  1 sibling, 1 reply; 29+ messages in thread
From: Nicolas Dufresne @ 2019-07-03 17:18 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, linux-media
  Cc: Tomasz Figa, kernel, Paul Kocialkowski, Maxime Ripard,
	Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding

[-- Attachment #1: Type: text/plain, Size: 1614 bytes --]

Le mercredi 03 juillet 2019 à 14:28 +0200, Boris Brezillon a écrit :
> Those lists can be extracted from the dpb, let's simplify userspace
> life and build that list kernel-side (generic helpers will be provided
> for drivers that need this list).
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Those only existed for Rockchip/Hantro anyway.

Reviewed-by: Nicolas Dufresne <nicolas­.dufresne@collabora.com>

> ---
> Changes in v3:
> * None
> 
> Changes in v2:
> * None
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 47ba2d057a92..237c8e8e6bca 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1946,15 +1946,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - __u16
>        - ``nal_ref_idc``
>        - NAL reference ID value coming from the NAL Unit header
> -    * - __u8
> -      - ``ref_pic_list_p0[32]``
> -      - Backward reference list used by P-frames in the original bitstream order
> -    * - __u8
> -      - ``ref_pic_list_b0[32]``
> -      - Backward reference list used by B-frames in the original bitstream order
> -    * - __u8
> -      - ``ref_pic_list_b1[32]``
> -      - Forward reference list used by B-frames in the original bitstream order
>      * - __s32
>        - ``top_field_order_cnt``
>        - Picture Order Count for the coded top field

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-03 17:18   ` Nicolas Dufresne
@ 2019-07-05 15:24     ` Ezequiel Garcia
  0 siblings, 0 replies; 29+ messages in thread
From: Ezequiel Garcia @ 2019-07-05 15:24 UTC (permalink / raw)
  To: Nicolas Dufresne, Boris Brezillon, Mauro Carvalho Chehab,
	Hans Verkuil, Laurent Pinchart, Sakari Ailus, linux-media
  Cc: Tomasz Figa, kernel, Paul Kocialkowski, Maxime Ripard,
	Jonas Karlman, Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Wed, 2019-07-03 at 13:18 -0400, Nicolas Dufresne wrote:
> Le mercredi 03 juillet 2019 à 14:28 +0200, Boris Brezillon a écrit :
> > Those lists can be extracted from the dpb, let's simplify userspace
> > life and build that list kernel-side (generic helpers will be provided
> > for drivers that need this list).
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> 
> Those only existed for Rockchip/Hantro anyway.
> 
> Reviewed-by: Nicolas Dufresne <nicolas­.dufresne@collabora.com>
> 

Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>

Thanks,
Eze


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
@ 2019-07-05 16:40   ` Ezequiel Garcia
  2019-07-05 17:16     ` Boris Brezillon
  2019-07-25 19:26   ` Paul Kocialkowski
  1 sibling, 1 reply; 29+ messages in thread
From: Ezequiel Garcia @ 2019-07-05 16:40 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Jonas Karlman, Jernej Skrabec, Alexandre Courbot,
	Thierry Reding

Hi Boris, Paul,

On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:
> Looks like some stateless decoders expect slices to be prefixed with
> ANNEX B start codes (they most likely do some kind of bitstream parsing
> and/or need that to delimit slices when doing per frame decoding).
> Since skipping those start codes for dummy stateless decoders (those
> expecting all params to be passed through controls) should be pretty
> easy, let's mandate that all slices be prepended with ANNEX B start
> codes.
> 
> If we ever need to support AVC headers, we can add a new menu control
> to select the type of NAL header to use.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
> Changes in v3:
> * Add Paul's R-b
> 
> Changes in v2:
> * None
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 7a1947f5be96..3ae1367806cf 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
>      documentation, refer to the above specification, unless there is
>      an explicit comment stating otherwise.
> +    All slices should be prepended with an ANNEX B start code.
>  

Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
is specified to _not_ contain the ANNEX B start code.

As you know, this is used in the cedrus driver, which is not
expecting the start code.

What's the plan regarding that?

Thanks,
Ezequiel


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-05 16:40   ` Ezequiel Garcia
@ 2019-07-05 17:16     ` Boris Brezillon
  2019-07-25  6:42       ` Boris Brezillon
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-05 17:16 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Maxime Ripard, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding

On Fri, 05 Jul 2019 13:40:03 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:

> Hi Boris, Paul,
> 
> On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:
> > Looks like some stateless decoders expect slices to be prefixed with
> > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > and/or need that to delimit slices when doing per frame decoding).
> > Since skipping those start codes for dummy stateless decoders (those
> > expecting all params to be passed through controls) should be pretty
> > easy, let's mandate that all slices be prepended with ANNEX B start
> > codes.
> > 
> > If we ever need to support AVC headers, we can add a new menu control
> > to select the type of NAL header to use.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> > Changes in v3:
> > * Add Paul's R-b
> > 
> > Changes in v2:
> > * None
> > ---
> >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > index 7a1947f5be96..3ae1367806cf 100644
> > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> >      documentation, refer to the above specification, unless there is
> >      an explicit comment stating otherwise.
> > +    All slices should be prepended with an ANNEX B start code.
> >    
> 
> Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> is specified to _not_ contain the ANNEX B start code.

Yep, we should provably rename the format.

> 
> As you know, this is used in the cedrus driver, which is not
> expecting the start code.
> 
> What's the plan regarding that?

I had a few patches modifying the cedrus driver accordingly, but I
dropped them in v2. I guess we can fix the driver once we've settled on
the uAPI changes.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode
  2019-07-03 12:28 ` [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode Boris Brezillon
@ 2019-07-22 15:29   ` Hans Verkuil
  2019-07-22 17:54     ` Boris Brezillon
  2019-07-25 19:20   ` Paul Kocialkowski
  1 sibling, 1 reply; 29+ messages in thread
From: Hans Verkuil @ 2019-07-22 15:29 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, linux-media
  Cc: Tomasz Figa, Nicolas Dufresne, kernel, Paul Kocialkowski,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding

On 7/3/19 2:28 PM, Boris Brezillon wrote:
> Some stateless decoders don't support per-slice decoding (or at least
> not in a way that would make them efficient or easy to use).
> Let's expose a menu to control and expose the supported decoding modes.
> Drivers are allowed to support only one decoding but they can support
> both too.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
> Changes in v3:
> * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
> * Add Paul's R-b
> 
> Changes in v2:
> * Allow decoding multiple slices in per-slice decoding mode
> * Minor doc improvement/fixes
> ---
>  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
>  include/media/h264-ctrls.h                    | 13 +++++
>  3 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 3ae1367806cf..47ba2d057a92 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1748,6 +1748,14 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - __u32
>        - ``size``
>        -
> +    * - __u32
> +      - ``start_byte_offset``
> +      - Where the slice payload starts in the output buffer. Useful when
> +        operating in frame-based decoding mode and decoding multi-slice
> +        content. In this case, the output buffer will contain more than one
> +        slice and some codecs need to know where each slice starts. Note that
> +        this offsets points to the beginning of the slice which is supposed to

offsets -> offset

> +        contain an ANNEX B start code

Add . at the end of the sentence.

I think this is a bit awkward. How about:

"Note that the slice at this offset shall start with an ANNEX B start code."

I'm assuming it has to actually start with an ANNEX B code? Or should it
just 'contain' an ANNEX B code?

When in sliced-based decoding mode, what should be used here? I assume that in
that case start_byte_offset would be 0, and that the slice shall still begin
with an ANNEX B start code?

>      * - __u32
>        - ``header_bit_size``
>        -
> @@ -1931,7 +1939,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        -
>      * - __u16
>        - ``num_slices``
> -      - Number of slices needed to decode the current frame
> +      - Number of slices needed to decode the current frame/field. When
> +        operating in slice-based decoding mode (see
> +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
> +        should always be set to one

Add . at the end of the sentence.

>      * - __u16
>        - ``nal_ref_idc``
>        - NAL reference ID value coming from the NAL Unit header
> @@ -2022,6 +2033,40 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>        - 0x00000004
>        - The DPB entry is a long term reference frame
>  
> +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
> +    Specifies the decoding mode to use. Currently exposes slice-based and
> +    frame-based decoding but new modes might be added later on.
> +
> +    .. note::
> +
> +       This menu control is not yet part of the public kernel API and
> +       it is expected to change.
> +
> +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
> +
> +.. cssclass:: longtable
> +
> +.. flat-table::
> +    :header-rows:  0
> +    :stub-columns: 0
> +    :widths:       1 1 2
> +
> +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
> +      - 0
> +      - The decoding is done at the slice granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices can be set to anything between
> +        1 and then number of slices that remain to fully decode the

then -> the

> +        frame/field.
> +        The output buffer should contain
> +        v4l2_ctrl_h264_decode_params->num_slices slices.
> +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
> +      - 1
> +      - The decoding is done at the frame granularity.
> +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
> +        slices forming a frame.
> +        The output buffer should contain all slices needed to decode the
> +        frame/field.
> +
>  .. _v4l2-mpeg-mpeg2:
>  
>  ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> index 471ff5c91f43..70d994be27e1 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> @@ -394,6 +394,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		"Explicit",
>  		NULL,
>  	};
> +	static const char * const h264_decoding_mode[] = {
> +		"Slice-based",
> +		"Frame-based",

based -> Based

> +		NULL,
> +	};
>  	static const char * const mpeg_mpeg2_level[] = {
>  		"Low",
>  		"Main",
> @@ -625,6 +630,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>  		return h264_fp_arrangement_type;
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
>  		return h264_fmo_map_type;
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
> +		return h264_decoding_mode;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  		return mpeg_mpeg2_level;
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
> @@ -844,6 +851,7 @@ const char *v4l2_ctrl_get_name(u32 id)
>  	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
>  	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
>  	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
> @@ -1212,6 +1220,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
>  	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
>  	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
>  	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index e1404d78d6ff..206fd5ada620 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -26,6 +26,7 @@
>  #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
>  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
>  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> +#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
>  
>  /* enum v4l2_ctrl_type type values */
>  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> @@ -33,6 +34,12 @@
>  #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
>  #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
>  #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
> +#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
> +
> +enum v4l2_mpeg_video_h264_decoding_mode {
> +	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
> +	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> +};
>  
>  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
>  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
> @@ -111,6 +118,8 @@ struct v4l2_h264_pred_weight_table {
>  	struct v4l2_h264_weight_factors weight_factors[2];
>  };
>  
> +#define V4L2_H264_MAX_SLICES_PER_FRAME			16

Are there arrays in these compound control structs where this define can be used?
Is this define standards-based or a restriction of V4L2?

Regards,

	Hans

> +
>  #define V4L2_H264_SLICE_TYPE_P				0
>  #define V4L2_H264_SLICE_TYPE_B				1
>  #define V4L2_H264_SLICE_TYPE_I				2
> @@ -125,6 +134,10 @@ struct v4l2_h264_pred_weight_table {
>  struct v4l2_ctrl_h264_slice_params {
>  	/* Size in bytes, including header */
>  	__u32 size;
> +
> +	/* Where the slice starts in the output buffer (expressed in bytes). */
> +	__u32 start_byte_offset;
> +
>  	/* Offset in bits to slice_data() from the beginning of this slice. */
>  	__u32 header_bit_size;
>  
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode
  2019-07-22 15:29   ` Hans Verkuil
@ 2019-07-22 17:54     ` Boris Brezillon
  2019-07-22 19:00       ` Hans Verkuil
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-22 17:54 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Maxime Ripard, Ezequiel Garcia, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

Hi Hans,

On Mon, 22 Jul 2019 17:29:21 +0200
Hans Verkuil <hverkuil@xs4all.nl> wrote:

> On 7/3/19 2:28 PM, Boris Brezillon wrote:
> > Some stateless decoders don't support per-slice decoding (or at least
> > not in a way that would make them efficient or easy to use).
> > Let's expose a menu to control and expose the supported decoding modes.
> > Drivers are allowed to support only one decoding but they can support
> > both too.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> > Changes in v3:
> > * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
> > * Add Paul's R-b
> > 
> > Changes in v2:
> > * Allow decoding multiple slices in per-slice decoding mode
> > * Minor doc improvement/fixes
> > ---
> >  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
> >  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
> >  include/media/h264-ctrls.h                    | 13 +++++
> >  3 files changed, 68 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > index 3ae1367806cf..47ba2d057a92 100644
> > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > @@ -1748,6 +1748,14 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >      * - __u32
> >        - ``size``
> >        -
> > +    * - __u32
> > +      - ``start_byte_offset``
> > +      - Where the slice payload starts in the output buffer. Useful when
> > +        operating in frame-based decoding mode and decoding multi-slice
> > +        content. In this case, the output buffer will contain more than one
> > +        slice and some codecs need to know where each slice starts. Note that
> > +        this offsets points to the beginning of the slice which is supposed to  
> 
> offsets -> offset
> 
> > +        contain an ANNEX B start code  
> 
> Add . at the end of the sentence.
> 
> I think this is a bit awkward. How about:
> 
> "Note that the slice at this offset shall start with an ANNEX B start code."

Definitely better.

> 
> I'm assuming it has to actually start with an ANNEX B code? Or should it
> just 'contain' an ANNEX B code?

It has to start with an ANNEX B code.

> 
> When in sliced-based decoding mode, what should be used here? I assume that in
> that case start_byte_offset would be 0, and that the slice shall still begin
> with an ANNEX B start code?

The first slice should have start_byte_offset set to 0 and should start
with an ANNEX B start code, but even in slice-based decoding mode, the
driver can be passed several slices in the same buffer, in which case,
the second slice will have start_byte_offset > 0.


> 
> >      * - __u32
> >        - ``header_bit_size``
> >        -
> > @@ -1931,7 +1939,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >        -
> >      * - __u16
> >        - ``num_slices``
> > -      - Number of slices needed to decode the current frame
> > +      - Number of slices needed to decode the current frame/field. When
> > +        operating in slice-based decoding mode (see
> > +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
> > +        should always be set to one  
> 
> Add . at the end of the sentence.
> 
> >      * - __u16
> >        - ``nal_ref_idc``
> >        - NAL reference ID value coming from the NAL Unit header
> > @@ -2022,6 +2033,40 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >        - 0x00000004
> >        - The DPB entry is a long term reference frame
> >  
> > +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
> > +    Specifies the decoding mode to use. Currently exposes slice-based and
> > +    frame-based decoding but new modes might be added later on.
> > +
> > +    .. note::
> > +
> > +       This menu control is not yet part of the public kernel API and
> > +       it is expected to change.
> > +
> > +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
> > +
> > +.. cssclass:: longtable
> > +
> > +.. flat-table::
> > +    :header-rows:  0
> > +    :stub-columns: 0
> > +    :widths:       1 1 2
> > +
> > +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
> > +      - 0
> > +      - The decoding is done at the slice granularity.
> > +        v4l2_ctrl_h264_decode_params->num_slices can be set to anything between
> > +        1 and then number of slices that remain to fully decode the  
> 
> then -> the
> 
> > +        frame/field.
> > +        The output buffer should contain
> > +        v4l2_ctrl_h264_decode_params->num_slices slices.
> > +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
> > +      - 1
> > +      - The decoding is done at the frame granularity.
> > +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
> > +        slices forming a frame.
> > +        The output buffer should contain all slices needed to decode the
> > +        frame/field.
> > +
> >  .. _v4l2-mpeg-mpeg2:
> >  
> >  ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
> > diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> > index 471ff5c91f43..70d994be27e1 100644
> > --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> > +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> > @@ -394,6 +394,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
> >  		"Explicit",
> >  		NULL,
> >  	};
> > +	static const char * const h264_decoding_mode[] = {
> > +		"Slice-based",
> > +		"Frame-based",  
> 
> based -> Based
> 
> > +		NULL,
> > +	};
> >  	static const char * const mpeg_mpeg2_level[] = {
> >  		"Low",
> >  		"Main",
> > @@ -625,6 +630,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
> >  		return h264_fp_arrangement_type;
> >  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
> >  		return h264_fmo_map_type;
> > +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
> > +		return h264_decoding_mode;
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
> >  		return mpeg_mpeg2_level;
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
> > @@ -844,6 +851,7 @@ const char *v4l2_ctrl_get_name(u32 id)
> >  	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
> >  	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
> >  	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
> > +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
> >  	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
> > @@ -1212,6 +1220,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
> >  	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
> >  	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
> >  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
> > +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
> >  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
> >  	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
> > diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> > index e1404d78d6ff..206fd5ada620 100644
> > --- a/include/media/h264-ctrls.h
> > +++ b/include/media/h264-ctrls.h
> > @@ -26,6 +26,7 @@
> >  #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
> >  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
> >  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
> > +#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
> >  
> >  /* enum v4l2_ctrl_type type values */
> >  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
> > @@ -33,6 +34,12 @@
> >  #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
> >  #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
> >  #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
> > +#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
> > +
> > +enum v4l2_mpeg_video_h264_decoding_mode {
> > +	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
> > +	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
> > +};
> >  
> >  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
> >  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
> > @@ -111,6 +118,8 @@ struct v4l2_h264_pred_weight_table {
> >  	struct v4l2_h264_weight_factors weight_factors[2];
> >  };
> >  
> > +#define V4L2_H264_MAX_SLICES_PER_FRAME			16  
> 
> Are there arrays in these compound control structs where this define can be used?

No, slices_params is a separate control, but I initialize
slices_params_ctrl_cfg.dims[0] to this value.

> Is this define standards-based or a restriction of V4L2?

It's defined by the standard.

Will fix the other typos you reported.

Thanks for the review.

Boris

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode
  2019-07-22 17:54     ` Boris Brezillon
@ 2019-07-22 19:00       ` Hans Verkuil
  0 siblings, 0 replies; 29+ messages in thread
From: Hans Verkuil @ 2019-07-22 19:00 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Tomasz Figa, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Maxime Ripard, Ezequiel Garcia, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On 7/22/19 7:54 PM, Boris Brezillon wrote:
> Hi Hans,
> 
> On Mon, 22 Jul 2019 17:29:21 +0200
> Hans Verkuil <hverkuil@xs4all.nl> wrote:
> 
>> On 7/3/19 2:28 PM, Boris Brezillon wrote:
>>> Some stateless decoders don't support per-slice decoding (or at least
>>> not in a way that would make them efficient or easy to use).
>>> Let's expose a menu to control and expose the supported decoding modes.
>>> Drivers are allowed to support only one decoding but they can support
>>> both too.
>>>
>>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>>> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>>> ---
>>> Changes in v3:
>>> * s/per-{slice,frame} decoding/{slice,frame}-based decoding/
>>> * Add Paul's R-b
>>>
>>> Changes in v2:
>>> * Allow decoding multiple slices in per-slice decoding mode
>>> * Minor doc improvement/fixes
>>> ---
>>>  .../media/uapi/v4l/ext-ctrls-codec.rst        | 47 ++++++++++++++++++-
>>>  drivers/media/v4l2-core/v4l2-ctrls.c          |  9 ++++
>>>  include/media/h264-ctrls.h                    | 13 +++++
>>>  3 files changed, 68 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> index 3ae1367806cf..47ba2d057a92 100644
>>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>> @@ -1748,6 +1748,14 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>      * - __u32
>>>        - ``size``
>>>        -
>>> +    * - __u32
>>> +      - ``start_byte_offset``
>>> +      - Where the slice payload starts in the output buffer. Useful when
>>> +        operating in frame-based decoding mode and decoding multi-slice
>>> +        content. In this case, the output buffer will contain more than one
>>> +        slice and some codecs need to know where each slice starts. Note that
>>> +        this offsets points to the beginning of the slice which is supposed to  
>>
>> offsets -> offset
>>
>>> +        contain an ANNEX B start code  
>>
>> Add . at the end of the sentence.
>>
>> I think this is a bit awkward. How about:
>>
>> "Note that the slice at this offset shall start with an ANNEX B start code."
> 
> Definitely better.
> 
>>
>> I'm assuming it has to actually start with an ANNEX B code? Or should it
>> just 'contain' an ANNEX B code?
> 
> It has to start with an ANNEX B code.
> 
>>
>> When in sliced-based decoding mode, what should be used here? I assume that in
>> that case start_byte_offset would be 0, and that the slice shall still begin
>> with an ANNEX B start code?
> 
> The first slice should have start_byte_offset set to 0 and should start
> with an ANNEX B start code, but even in slice-based decoding mode, the
> driver can be passed several slices in the same buffer, in which case,
> the second slice will have start_byte_offset > 0.
> 
> 
>>
>>>      * - __u32
>>>        - ``header_bit_size``
>>>        -
>>> @@ -1931,7 +1939,10 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>        -
>>>      * - __u16
>>>        - ``num_slices``
>>> -      - Number of slices needed to decode the current frame
>>> +      - Number of slices needed to decode the current frame/field. When
>>> +        operating in slice-based decoding mode (see
>>> +        :c:type:`v4l2_mpeg_video_h264_decoding_mode`), this field
>>> +        should always be set to one  
>>
>> Add . at the end of the sentence.
>>
>>>      * - __u16
>>>        - ``nal_ref_idc``
>>>        - NAL reference ID value coming from the NAL Unit header
>>> @@ -2022,6 +2033,40 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>        - 0x00000004
>>>        - The DPB entry is a long term reference frame
>>>  
>>> +``V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE (enum)``
>>> +    Specifies the decoding mode to use. Currently exposes slice-based and
>>> +    frame-based decoding but new modes might be added later on.
>>> +
>>> +    .. note::
>>> +
>>> +       This menu control is not yet part of the public kernel API and
>>> +       it is expected to change.
>>> +
>>> +.. c:type:: v4l2_mpeg_video_h264_decoding_mode
>>> +
>>> +.. cssclass:: longtable
>>> +
>>> +.. flat-table::
>>> +    :header-rows:  0
>>> +    :stub-columns: 0
>>> +    :widths:       1 1 2
>>> +
>>> +    * - ``V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING``
>>> +      - 0
>>> +      - The decoding is done at the slice granularity.
>>> +        v4l2_ctrl_h264_decode_params->num_slices can be set to anything between
>>> +        1 and then number of slices that remain to fully decode the  
>>
>> then -> the
>>
>>> +        frame/field.
>>> +        The output buffer should contain
>>> +        v4l2_ctrl_h264_decode_params->num_slices slices.
>>> +    * - ``V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING``
>>> +      - 1
>>> +      - The decoding is done at the frame granularity.
>>> +        v4l2_ctrl_h264_decode_params->num_slices should be set to the number of
>>> +        slices forming a frame.
>>> +        The output buffer should contain all slices needed to decode the
>>> +        frame/field.
>>> +
>>>  .. _v4l2-mpeg-mpeg2:
>>>  
>>>  ``V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (struct)``
>>> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
>>> index 471ff5c91f43..70d994be27e1 100644
>>> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
>>> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
>>> @@ -394,6 +394,11 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>>>  		"Explicit",
>>>  		NULL,
>>>  	};
>>> +	static const char * const h264_decoding_mode[] = {
>>> +		"Slice-based",
>>> +		"Frame-based",  
>>
>> based -> Based
>>
>>> +		NULL,
>>> +	};
>>>  	static const char * const mpeg_mpeg2_level[] = {
>>>  		"Low",
>>>  		"Main",
>>> @@ -625,6 +630,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
>>>  		return h264_fp_arrangement_type;
>>>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
>>>  		return h264_fmo_map_type;
>>> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
>>> +		return h264_decoding_mode;
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>>>  		return mpeg_mpeg2_level;
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
>>> @@ -844,6 +851,7 @@ const char *v4l2_ctrl_get_name(u32 id)
>>>  	case V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX:		return "H264 Scaling Matrix";
>>>  	case V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS:		return "H264 Slice Parameters";
>>>  	case V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS:		return "H264 Decode Parameters";
>>> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:		return "H264 Decoding Mode";
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:			return "MPEG2 Level";
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:			return "MPEG2 Profile";
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG4_I_FRAME_QP:		return "MPEG4 I-Frame QP Value";
>>> @@ -1212,6 +1220,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
>>>  	case V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC:
>>>  	case V4L2_CID_MPEG_VIDEO_H264_SEI_FP_ARRANGEMENT_TYPE:
>>>  	case V4L2_CID_MPEG_VIDEO_H264_FMO_MAP_TYPE:
>>> +	case V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE:
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL:
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG2_PROFILE:
>>>  	case V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL:
>>> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
>>> index e1404d78d6ff..206fd5ada620 100644
>>> --- a/include/media/h264-ctrls.h
>>> +++ b/include/media/h264-ctrls.h
>>> @@ -26,6 +26,7 @@
>>>  #define V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX	(V4L2_CID_MPEG_BASE+1002)
>>>  #define V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS	(V4L2_CID_MPEG_BASE+1003)
>>>  #define V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS	(V4L2_CID_MPEG_BASE+1004)
>>> +#define V4L2_CID_MPEG_VIDEO_H264_DECODING_MODE	(V4L2_CID_MPEG_BASE+1005)
>>>  
>>>  /* enum v4l2_ctrl_type type values */
>>>  #define V4L2_CTRL_TYPE_H264_SPS			0x0110
>>> @@ -33,6 +34,12 @@
>>>  #define V4L2_CTRL_TYPE_H264_SCALING_MATRIX	0x0112
>>>  #define V4L2_CTRL_TYPE_H264_SLICE_PARAMS	0x0113
>>>  #define V4L2_CTRL_TYPE_H264_DECODE_PARAMS	0x0114
>>> +#define V4L2_CTRL_TYPE_H264_DECODING_MODE	0x0115
>>> +
>>> +enum v4l2_mpeg_video_h264_decoding_mode {
>>> +	V4L2_MPEG_VIDEO_H264_SLICE_BASED_DECODING,
>>> +	V4L2_MPEG_VIDEO_H264_FRAME_BASED_DECODING,
>>> +};
>>>  
>>>  #define V4L2_H264_SPS_CONSTRAINT_SET0_FLAG			0x01
>>>  #define V4L2_H264_SPS_CONSTRAINT_SET1_FLAG			0x02
>>> @@ -111,6 +118,8 @@ struct v4l2_h264_pred_weight_table {
>>>  	struct v4l2_h264_weight_factors weight_factors[2];
>>>  };
>>>  
>>> +#define V4L2_H264_MAX_SLICES_PER_FRAME			16  
>>
>> Are there arrays in these compound control structs where this define can be used?
> 
> No, slices_params is a separate control, but I initialize
> slices_params_ctrl_cfg.dims[0] to this value.
> 
>> Is this define standards-based or a restriction of V4L2?
> 
> It's defined by the standard.

OK, can you add a comment before the V4L2_H264_MAX_SLICES_PER_FRAME define
mentioning this? Perhaps with a reference to the standard as well.

> 
> Will fix the other typos you reported.
> 
> Thanks for the review.

My pleasure,

	Hans

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-03 12:28 ` [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Boris Brezillon
  2019-07-03 17:18   ` Nicolas Dufresne
@ 2019-07-24  3:39   ` Tomasz Figa
  2019-07-24  5:46     ` Boris Brezillon
  1 sibling, 1 reply; 29+ messages in thread
From: Tomasz Figa @ 2019-07-24  3:39 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, Linux Media Mailing List, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Maxime Ripard, Ezequiel Garcia, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

Hi Boris,

On Wed, Jul 3, 2019 at 9:28 PM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> Those lists can be extracted from the dpb, let's simplify userspace
> life and build that list kernel-side (generic helpers will be provided
> for drivers that need this list).
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> Changes in v3:
> * None
>
> Changes in v2:
> * None
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
>  1 file changed, 9 deletions(-)
>
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 47ba2d057a92..237c8e8e6bca 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1946,15 +1946,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - __u16
>        - ``nal_ref_idc``
>        - NAL reference ID value coming from the NAL Unit header
> -    * - __u8
> -      - ``ref_pic_list_p0[32]``
> -      - Backward reference list used by P-frames in the original bitstream order
> -    * - __u8
> -      - ``ref_pic_list_b0[32]``
> -      - Backward reference list used by B-frames in the original bitstream order
> -    * - __u8
> -      - ``ref_pic_list_b1[32]``
> -      - Forward reference list used by B-frames in the original bitstream order
>      * - __s32
>        - ``top_field_order_cnt``
>        - Picture Order Count for the coded top field

Thanks for the patch.

Don't we also need to remove these fields from the UAPI structs?

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-24  3:39   ` Tomasz Figa
@ 2019-07-24  5:46     ` Boris Brezillon
  2019-07-25 19:38       ` Paul Kocialkowski
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-24  5:46 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, Linux Media Mailing List, Nicolas Dufresne, kernel,
	Paul Kocialkowski, Maxime Ripard, Ezequiel Garcia, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Wed, 24 Jul 2019 12:39:55 +0900
Tomasz Figa <tfiga@chromium.org> wrote:

> Hi Boris,
> 
> On Wed, Jul 3, 2019 at 9:28 PM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> >
> > Those lists can be extracted from the dpb, let's simplify userspace
> > life and build that list kernel-side (generic helpers will be provided
> > for drivers that need this list).
> >
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> > Changes in v3:
> > * None
> >
> > Changes in v2:
> > * None
> > ---
> >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
> >  1 file changed, 9 deletions(-)
> >
> > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > index 47ba2d057a92..237c8e8e6bca 100644
> > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > @@ -1946,15 +1946,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >      * - __u16
> >        - ``nal_ref_idc``
> >        - NAL reference ID value coming from the NAL Unit header
> > -    * - __u8
> > -      - ``ref_pic_list_p0[32]``
> > -      - Backward reference list used by P-frames in the original bitstream order
> > -    * - __u8
> > -      - ``ref_pic_list_b0[32]``
> > -      - Backward reference list used by B-frames in the original bitstream order
> > -    * - __u8
> > -      - ``ref_pic_list_b1[32]``
> > -      - Forward reference list used by B-frames in the original bitstream order
> >      * - __s32
> >        - ``top_field_order_cnt``
> >        - Picture Order Count for the coded top field  
> 
> Thanks for the patch.
> 
> Don't we also need to remove these fields from the UAPI structs?

Oops, indeed. I'll drop them in the next version.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-05 17:16     ` Boris Brezillon
@ 2019-07-25  6:42       ` Boris Brezillon
  2019-07-25 19:36         ` Paul Kocialkowski
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-25  6:42 UTC (permalink / raw)
  To: Ezequiel Garcia, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Paul Kocialkowski
  Cc: Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Fri, 5 Jul 2019 19:16:18 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Fri, 05 Jul 2019 13:40:03 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
> > Hi Boris, Paul,
> > 
> > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:  
> > > Looks like some stateless decoders expect slices to be prefixed with
> > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > and/or need that to delimit slices when doing per frame decoding).
> > > Since skipping those start codes for dummy stateless decoders (those
> > > expecting all params to be passed through controls) should be pretty
> > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > codes.
> > > 
> > > If we ever need to support AVC headers, we can add a new menu control
> > > to select the type of NAL header to use.
> > > 
> > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > ---
> > > Changes in v3:
> > > * Add Paul's R-b
> > > 
> > > Changes in v2:
> > > * None
> > > ---
> > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > index 7a1947f5be96..3ae1367806cf 100644
> > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > >      documentation, refer to the above specification, unless there is
> > >      an explicit comment stating otherwise.
> > > +    All slices should be prepended with an ANNEX B start code.
> > >      
> > 
> > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > is specified to _not_ contain the ANNEX B start code.  
> 
> Yep, we should provably rename the format.

Paul, are you okay with this rename?

s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/

or

s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/

I'd also to discuss some concerns Ezequiel and I have regarding this
change. Some (most?) codec have alignment constraints on the buffer
they pass to the HW. For HW that support Annex B parsing, that's no
problem because the start of the buffer will be aligned on a page (I'm
assuming page alignment should cover 99% of the alignment constraints).
But HW that need to skip the start code will have to pass a non-aligned
buffer (annex B start code is 3 byte long).
Paul looked at the Cedrus driver and thinks it can be handled correctly
thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
but I fear this might be a problem on other HW.

So, I'm asking again, are we sure we want to handle the raw (no start
code) and annex-b cases using the same pixel format? If we do, what's
the plan to address those potential alignment constraints? Should
we provide a way for userspace to define where the start-code ends so it
can align things properly (annex B can be extended with extra 00
bytes at the beginning)? If we do that, that means userspace has to
know about those alignment constraints, or take something big enough.
Another option would be to use a bounce buffer when things are not
aligned properly.

I'd really like to get feedback on those points before sending a v4.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode
  2019-07-03 12:28 ` [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode Boris Brezillon
  2019-07-22 15:29   ` Hans Verkuil
@ 2019-07-25 19:20   ` Paul Kocialkowski
  1 sibling, 0 replies; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-25 19:20 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Tomasz Figa, Nicolas Dufresne, kernel,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding

Hi,

On Wed 03 Jul 19, 14:28, Boris Brezillon wrote:
> @@ -125,6 +134,10 @@ struct v4l2_h264_pred_weight_table {
>  struct v4l2_ctrl_h264_slice_params {
>  	/* Size in bytes, including header */
>  	__u32 size;
> +
> +	/* Where the slice starts in the output buffer (expressed in bytes). */
> +	__u32 start_byte_offset;
> +

As I understood, we would also need to specify the beginning of the slice
relative to the beginning of the start code so we don't have to assume
fixed-size start codes or have to do the parsing in the kernel.

This can certainly be done as a subsequent patch/series, and definitely not
necessarily by you (don't want to add more on your shoulders at this point)!

Cheers,

Paul

>  	/* Offset in bits to slice_data() from the beginning of this slice. */
>  	__u32 header_bit_size;
>  
> -- 
> 2.21.0
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
  2019-07-05 16:40   ` Ezequiel Garcia
@ 2019-07-25 19:26   ` Paul Kocialkowski
  1 sibling, 0 replies; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-25 19:26 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	Sakari Ailus, linux-media, Tomasz Figa, Nicolas Dufresne, kernel,
	Maxime Ripard, Ezequiel Garcia, Jonas Karlman, Jernej Skrabec,
	Alexandre Courbot, Thierry Reding

Hi,

On Wed 03 Jul 19, 14:28, Boris Brezillon wrote:
> Looks like some stateless decoders expect slices to be prefixed with
> ANNEX B start codes (they most likely do some kind of bitstream parsing
> and/or need that to delimit slices when doing per frame decoding).
> Since skipping those start codes for dummy stateless decoders (those
> expecting all params to be passed through controls) should be pretty
> easy, let's mandate that all slices be prepended with ANNEX B start
> codes.
> 
> If we ever need to support AVC headers, we can add a new menu control
> to select the type of NAL header to use.

After thinking a bit about this, I'd rather be in favor of having offsets
in the control structures rather than forcing the start code type or having a
dedicated control for that, as I've mentionned on the other patch.

What do you think?

Cheers,

Paul

> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
> Changes in v3:
> * Add Paul's R-b
> 
> Changes in v2:
> * None
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 7a1947f5be96..3ae1367806cf 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
>      documentation, refer to the above specification, unless there is
>      an explicit comment stating otherwise.
> +    All slices should be prepended with an ANNEX B start code.
>  
>      .. note::
>  
> -- 
> 2.21.0
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-25  6:42       ` Boris Brezillon
@ 2019-07-25 19:36         ` Paul Kocialkowski
  2019-07-26  2:39           ` Ezequiel Garcia
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-25 19:36 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Ezequiel Garcia, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

Hi,

On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:
> On Fri, 5 Jul 2019 19:16:18 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > On Fri, 05 Jul 2019 13:40:03 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > 
> > > Hi Boris, Paul,
> > > 
> > > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:  
> > > > Looks like some stateless decoders expect slices to be prefixed with
> > > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > > and/or need that to delimit slices when doing per frame decoding).
> > > > Since skipping those start codes for dummy stateless decoders (those
> > > > expecting all params to be passed through controls) should be pretty
> > > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > > codes.
> > > > 
> > > > If we ever need to support AVC headers, we can add a new menu control
> > > > to select the type of NAL header to use.
> > > > 
> > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > ---
> > > > Changes in v3:
> > > > * Add Paul's R-b
> > > > 
> > > > Changes in v2:
> > > > * None
> > > > ---
> > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > index 7a1947f5be96..3ae1367806cf 100644
> > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > > >      documentation, refer to the above specification, unless there is
> > > >      an explicit comment stating otherwise.
> > > > +    All slices should be prepended with an ANNEX B start code.
> > > >      
> > > 
> > > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > > is specified to _not_ contain the ANNEX B start code.  
> > 
> > Yep, we should provably rename the format.
> 
> Paul, are you okay with this rename?

Sorry for the very long response time here, I've had a hard time getting back
into the context of all this.

> s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
> 
> or
> 
> s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/

I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
to the beinning and after the start code. That would be more flexible, but one
downside could be decoders that some decoders only take a specific start code.

On the other hand I don't think that having one pixel format for each type of
start code would be very reasonable, so I'd rather see an offset for now and
perhaps a menu control later if needed to specify which types of start codes are
supported.

> I'd also to discuss some concerns Ezequiel and I have regarding this
> change. Some (most?) codec have alignment constraints on the buffer
> they pass to the HW. For HW that support Annex B parsing, that's no
> problem because the start of the buffer will be aligned on a page (I'm
> assuming page alignment should cover 99% of the alignment constraints).
> But HW that need to skip the start code will have to pass a non-aligned
> buffer (annex B start code is 3 byte long).
> Paul looked at the Cedrus driver and thinks it can be handled correctly
> thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
> but I fear this might be a problem on other HW.
> 
> So, I'm asking again, are we sure we want to handle the raw (no start
> code) and annex-b cases using the same pixel format? If we do, what's
> the plan to address those potential alignment constraints? Should
> we provide a way for userspace to define where the start-code ends so it
> can align things properly (annex B can be extended with extra 00
> bytes at the beginning)? If we do that, that means userspace has to
> know about those alignment constraints, or take something big enough.
> Another option would be to use a bounce buffer when things are not
> aligned properly.
> 
> I'd really like to get feedback on those points before sending a v4.

Mhh I don't really know what would be best for handling that. Either way, I
don't see how more pixel formats would really help solve the issue, so I'm still
in favor of one.

Having a control that specifies an alignment constraint for the slice beginning
could work (as long as we make it optional, although userspace should be
required to abide by it when it is present).

I guess it's not such a high price to pay for a unified codec interface :)

Cheers,

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists
  2019-07-24  5:46     ` Boris Brezillon
@ 2019-07-25 19:38       ` Paul Kocialkowski
  0 siblings, 0 replies; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-25 19:38 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Tomasz Figa, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, Sakari Ailus, Linux Media Mailing List,
	Nicolas Dufresne, kernel, Maxime Ripard, Ezequiel Garcia,
	Jonas Karlman, Jernej Skrabec, Alexandre Courbot, Thierry Reding

Hi,

On Wed 24 Jul 19, 07:46, Boris Brezillon wrote:
> On Wed, 24 Jul 2019 12:39:55 +0900
> Tomasz Figa <tfiga@chromium.org> wrote:
> 
> > Hi Boris,
> > 
> > On Wed, Jul 3, 2019 at 9:28 PM Boris Brezillon
> > <boris.brezillon@collabora.com> wrote:
> > >
> > > Those lists can be extracted from the dpb, let's simplify userspace
> > > life and build that list kernel-side (generic helpers will be provided
> > > for drivers that need this list).
> > >
> > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > ---
> > > Changes in v3:
> > > * None
> > >
> > > Changes in v2:
> > > * None
> > > ---
> > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 9 ---------
> > >  1 file changed, 9 deletions(-)
> > >
> > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > index 47ba2d057a92..237c8e8e6bca 100644
> > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > @@ -1946,15 +1946,6 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > >      * - __u16
> > >        - ``nal_ref_idc``
> > >        - NAL reference ID value coming from the NAL Unit header
> > > -    * - __u8
> > > -      - ``ref_pic_list_p0[32]``
> > > -      - Backward reference list used by P-frames in the original bitstream order
> > > -    * - __u8
> > > -      - ``ref_pic_list_b0[32]``
> > > -      - Backward reference list used by B-frames in the original bitstream order
> > > -    * - __u8
> > > -      - ``ref_pic_list_b1[32]``
> > > -      - Forward reference list used by B-frames in the original bitstream order
> > >      * - __s32
> > >        - ``top_field_order_cnt``
> > >        - Picture Order Count for the coded top field  
> > 
> > Thanks for the patch.
> > 
> > Don't we also need to remove these fields from the UAPI structs?
> 
> Oops, indeed. I'll drop them in the next version.

With that change, feel free to add my:
Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>

Cheers and thanks!

Paul

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-25 19:36         ` Paul Kocialkowski
@ 2019-07-26  2:39           ` Ezequiel Garcia
  2019-07-26  6:28             ` Boris Brezillon
  0 siblings, 1 reply; 29+ messages in thread
From: Ezequiel Garcia @ 2019-07-26  2:39 UTC (permalink / raw)
  To: Paul Kocialkowski, Boris Brezillon
  Cc: Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:
> Hi,
> 
> On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:
> > On Fri, 5 Jul 2019 19:16:18 +0200
> > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > 
> > > On Fri, 05 Jul 2019 13:40:03 -0300
> > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > 
> > > > Hi Boris, Paul,
> > > > 
> > > > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:  
> > > > > Looks like some stateless decoders expect slices to be prefixed with
> > > > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > > > and/or need that to delimit slices when doing per frame decoding).
> > > > > Since skipping those start codes for dummy stateless decoders (those
> > > > > expecting all params to be passed through controls) should be pretty
> > > > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > > > codes.
> > > > > 
> > > > > If we ever need to support AVC headers, we can add a new menu control
> > > > > to select the type of NAL header to use.
> > > > > 
> > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > ---
> > > > > Changes in v3:
> > > > > * Add Paul's R-b
> > > > > 
> > > > > Changes in v2:
> > > > > * None
> > > > > ---
> > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > > > >  1 file changed, 1 insertion(+)
> > > > > 
> > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > index 7a1947f5be96..3ae1367806cf 100644
> > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > > > >      documentation, refer to the above specification, unless there is
> > > > >      an explicit comment stating otherwise.
> > > > > +    All slices should be prepended with an ANNEX B start code.
> > > > >      
> > > > 
> > > > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > > > is specified to _not_ contain the ANNEX B start code.  
> > > 
> > > Yep, we should provably rename the format.
> > 
> > Paul, are you okay with this rename?
> 
> Sorry for the very long response time here, I've had a hard time getting back
> into the context of all this.
> 
> > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
> > 
> > or
> > 
> > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/
> 
> I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
> to the beinning and after the start code. That would be more flexible, but one
> downside could be decoders that some decoders only take a specific start code.
> 
> On the other hand I don't think that having one pixel format for each type of
> start code would be very reasonable, so I'd rather see an offset for now and
> perhaps a menu control later if needed to specify which types of start codes are
> supported.
> 

If I am reading the spec correctly, Annex B start code is specified to always
be the 3-byte start code: 0x000001.

The first NAL of a frame may have an additional 0x00, which effectively means
the start code of the first NAL of a frame _can_ be 4-byte 0x00000001,
in addition to the 3-byte 0x000001.

In other words, there aren't multiple Annex B type of start codes, and only
two options for the format of the slice: NAL units with or without a start code.

Therefore, I can't see any point in having this offset.

> > I'd also to discuss some concerns Ezequiel and I have regarding this
> > change. Some (most?) codec have alignment constraints on the buffer
> > they pass to the HW. For HW that support Annex B parsing, that's no
> > problem because the start of the buffer will be aligned on a page (I'm
> > assuming page alignment should cover 99% of the alignment constraints).
> > But HW that need to skip the start code will have to pass a non-aligned
> > buffer (annex B start code is 3 byte long).
> > Paul looked at the Cedrus driver and thinks it can be handled correctly
> > thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
> > but I fear this might be a problem on other HW.
> > 
> > So, I'm asking again, are we sure we want to handle the raw (no start
> > code) and annex-b cases using the same pixel format? If we do, what's
> > the plan to address those potential alignment constraints? Should
> > we provide a way for userspace to define where the start-code ends so it
> > can align things properly (annex B can be extended with extra 00
> > bytes at the beginning)? If we do that, that means userspace has to
> > know about those alignment constraints, or take something big enough.
> > Another option would be to use a bounce buffer when things are not
> > aligned properly.
> > 
> > I'd really like to get feedback on those points before sending a v4.
> 
> Mhh I don't really know what would be best for handling that. Either way, I
> don't see how more pixel formats would really help solve the issue, so I'm still
> in favor of one.
> 
> Having a control that specifies an alignment constraint for the slice beginning
> could work (as long as we make it optional, although userspace should be
> required to abide by it when it is present).
> 
> I guess it's not such a high price to pay for a unified codec interface :)
> 

I don't think the two pixfmts are such a big deal, but at the same time,
it would be much simpler for applications to forget about this entirely.

Note that the goal of providing an unified interface is making applications
simpler, not the other way around. If we kill the no-start-code pixfmt,
but we expose the hw alignment requirement, doesn't sound like an improvement :-)

Maybe we can see how it works with having just V4L2_PIX_FMT_H264_SLICE_ANNEXB,
specified to have a 3-byte start code, as specified by the annex B.

Maybe I can put some code together and test it on the Allwinner board
I have here (thanks Maxime for that one!).

Regards,
Eze


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-26  2:39           ` Ezequiel Garcia
@ 2019-07-26  6:28             ` Boris Brezillon
  2019-07-26  7:30               ` Boris Brezillon
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-26  6:28 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Paul Kocialkowski, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Thu, 25 Jul 2019 23:39:11 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:

> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:
> > Hi,
> > 
> > On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:  
> > > On Fri, 5 Jul 2019 19:16:18 +0200
> > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > >   
> > > > On Fri, 05 Jul 2019 13:40:03 -0300
> > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > >   
> > > > > Hi Boris, Paul,
> > > > > 
> > > > > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:    
> > > > > > Looks like some stateless decoders expect slices to be prefixed with
> > > > > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > > > > and/or need that to delimit slices when doing per frame decoding).
> > > > > > Since skipping those start codes for dummy stateless decoders (those
> > > > > > expecting all params to be passed through controls) should be pretty
> > > > > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > > > > codes.
> > > > > > 
> > > > > > If we ever need to support AVC headers, we can add a new menu control
> > > > > > to select the type of NAL header to use.
> > > > > > 
> > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > > > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > ---
> > > > > > Changes in v3:
> > > > > > * Add Paul's R-b
> > > > > > 
> > > > > > Changes in v2:
> > > > > > * None
> > > > > > ---
> > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > > > > >  1 file changed, 1 insertion(+)
> > > > > > 
> > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > index 7a1947f5be96..3ae1367806cf 100644
> > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > > > > >      documentation, refer to the above specification, unless there is
> > > > > >      an explicit comment stating otherwise.
> > > > > > +    All slices should be prepended with an ANNEX B start code.
> > > > > >        
> > > > > 
> > > > > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > > > > is specified to _not_ contain the ANNEX B start code.    
> > > > 
> > > > Yep, we should provably rename the format.  
> > > 
> > > Paul, are you okay with this rename?  
> > 
> > Sorry for the very long response time here, I've had a hard time getting back
> > into the context of all this.
> >   
> > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
> > > 
> > > or
> > > 
> > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/  
> > 
> > I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
> > to the beinning and after the start code. That would be more flexible, but one
> > downside could be decoders that some decoders only take a specific start code.
> > 
> > On the other hand I don't think that having one pixel format for each type of
> > start code would be very reasonable, so I'd rather see an offset for now and
> > perhaps a menu control later if needed to specify which types of start codes are
> > supported.
> >   
> 
> If I am reading the spec correctly, Annex B start code is specified to always
> be the 3-byte start code: 0x000001.
> 
> The first NAL of a frame may have an additional 0x00, which effectively means
> the start code of the first NAL of a frame _can_ be 4-byte 0x00000001,
> in addition to the 3-byte 0x000001.

That's not my understanding of the Annex B section (quoting the spec
for reference):

"
The byte stream format consists of a sequence of byte stream NAL unit
syntax structures. Each byte stream NAL unit syntax structure contains
one start code prefix followed by one nal_unit( NumBytesInNALunit )
syntax structure. It may (and under some circumstances, it shall) also
contain an additional zero_byte syntax element. It may also contain one
or more additional trailing_zero_8bits syntax elements. When it is the
first byte stream NAL unit in the bitstream, it may also contain one or
more additional leading_zero_8bits syntax elements.
"

To me, it looks like the start code can always be 0x000001 or
0x00000001. The first NAL can have extra leading 0x00 bytes, not the
following ones, *but*, all NALs can have an arbitrary number of
trailing 0x00 bytes. I guess stateless decoders unconditionally apply
the "skip_leading_0_bytes logic" described in chapter B.1.1 of the spec
to deal with these potential trailing 0x00 bytes.
Stateful decoders (those parsing Annex B NAL headers) might skip this
"skip leading 0x00 bytes" step for NAL > 0, but I suspect they just
always skip leading 0x00 bytes because
- it's simple enough
- they anyway need the logic for the first NAL
- that would require extra information about the NAL number which
  stateless decoder usually don't have

> 
> In other words, there aren't multiple Annex B type of start codes, and only
> two options for the format of the slice: NAL units with or without a start code.

There's also the AVC NAL header, and I'm pretty sure you can't use the
"add leading 0x00 bytes" trick to align things as you wish with that
one.

> 
> Therefore, I can't see any point in having this offset.

Assuming the Annex B start codes can contain an arbitrary number of
leading 0x00 bytes, we can align things according to the codec
expectations. But, as I said below, that implies exposing those
alignment constraints.

> 
> > > I'd also to discuss some concerns Ezequiel and I have regarding this
> > > change. Some (most?) codec have alignment constraints on the buffer
> > > they pass to the HW. For HW that support Annex B parsing, that's no
> > > problem because the start of the buffer will be aligned on a page (I'm
> > > assuming page alignment should cover 99% of the alignment constraints).
> > > But HW that need to skip the start code will have to pass a non-aligned
> > > buffer (annex B start code is 3 byte long).
> > > Paul looked at the Cedrus driver and thinks it can be handled correctly
> > > thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
> > > but I fear this might be a problem on other HW.
> > > 
> > > So, I'm asking again, are we sure we want to handle the raw (no start
> > > code) and annex-b cases using the same pixel format? If we do, what's
> > > the plan to address those potential alignment constraints? Should
> > > we provide a way for userspace to define where the start-code ends so it
> > > can align things properly (annex B can be extended with extra 00
> > > bytes at the beginning)? If we do that, that means userspace has to
> > > know about those alignment constraints, or take something big enough.
> > > Another option would be to use a bounce buffer when things are not
> > > aligned properly.
> > > 
> > > I'd really like to get feedback on those points before sending a v4.  
> > 
> > Mhh I don't really know what would be best for handling that. Either way, I
> > don't see how more pixel formats would really help solve the issue, so I'm still
> > in favor of one.
> > 
> > Having a control that specifies an alignment constraint for the slice beginning
> > could work (as long as we make it optional, although userspace should be
> > required to abide by it when it is present).

By making that, you put the burden on both sides of the stack:

- the kernel side will have to deal with the unaligned cases (using a
  bounce buffer)
- userspace apps/libs that want to avoid an extra copy will have to
  check this constraint and align things properly anyway

Plus, the alignment thing won't work for AVC headers, so I think we
should actually have a control to select the NAL header type rather
than expose some alignment constraints (or have one pix fmt per NAL
header type, but you don't seem to like the idea, so I'm trying
to find something else :-)).

And if we go for this option (control to select the NAL header type),
I'm wondering why we're not making that NAL-header type selection
mandatory from the start. We don't have to support all NAL headers at
first (can be Annex B only), but, by making this control selection
non-optional, we'll at least give a decent feedback to userspace
(setting NAL header control fails because the selected NAL header type
is not supported by the HW) instead of returning an error on the
decoding operation (which, depending on how verbose the driver is, can
be quite hard to figure out).

> > 
> > I guess it's not such a high price to pay for a unified codec interface :)

If by unified you mean exposing only one pixel format, then yes, it's
unified. Doesn't make it easier to deal with from the userspace
perspective IMHO.

To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
not exposing the NAL header type is a good option. We've seen that
providing alignment guarantees for HW expecting raw bitstream (without
the start code) might become challenging at some point. So I'd opt for
making this selection explicit. After all, it's just an extra control
to set from userspace, and 2 extra switch-case: one to select the most
appropriate NAL header type, and another one to fill the buffer with
the appropriate header (if there's one).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-26  6:28             ` Boris Brezillon
@ 2019-07-26  7:30               ` Boris Brezillon
  2019-07-26  8:53                 ` Hans Verkuil
  2019-07-27 12:52                 ` Ezequiel Garcia
  0 siblings, 2 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-26  7:30 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Paul Kocialkowski, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Fri, 26 Jul 2019 08:28:28 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Thu, 25 Jul 2019 23:39:11 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
> > On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:  
> > > Hi,
> > > 
> > > On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:    
> > > > On Fri, 5 Jul 2019 19:16:18 +0200
> > > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > >     
> > > > > On Fri, 05 Jul 2019 13:40:03 -0300
> > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > >     
> > > > > > Hi Boris, Paul,
> > > > > > 
> > > > > > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:      
> > > > > > > Looks like some stateless decoders expect slices to be prefixed with
> > > > > > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > > > > > and/or need that to delimit slices when doing per frame decoding).
> > > > > > > Since skipping those start codes for dummy stateless decoders (those
> > > > > > > expecting all params to be passed through controls) should be pretty
> > > > > > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > > > > > codes.
> > > > > > > 
> > > > > > > If we ever need to support AVC headers, we can add a new menu control
> > > > > > > to select the type of NAL header to use.
> > > > > > > 
> > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > > > > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > ---
> > > > > > > Changes in v3:
> > > > > > > * Add Paul's R-b
> > > > > > > 
> > > > > > > Changes in v2:
> > > > > > > * None
> > > > > > > ---
> > > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > > > > > >  1 file changed, 1 insertion(+)
> > > > > > > 
> > > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > index 7a1947f5be96..3ae1367806cf 100644
> > > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > > > > > >      documentation, refer to the above specification, unless there is
> > > > > > >      an explicit comment stating otherwise.
> > > > > > > +    All slices should be prepended with an ANNEX B start code.
> > > > > > >          
> > > > > > 
> > > > > > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > > > > > is specified to _not_ contain the ANNEX B start code.      
> > > > > 
> > > > > Yep, we should provably rename the format.    
> > > > 
> > > > Paul, are you okay with this rename?    
> > > 
> > > Sorry for the very long response time here, I've had a hard time getting back
> > > into the context of all this.
> > >     
> > > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
> > > > 
> > > > or
> > > > 
> > > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/    
> > > 
> > > I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
> > > to the beinning and after the start code. That would be more flexible, but one
> > > downside could be decoders that some decoders only take a specific start code.
> > > 
> > > On the other hand I don't think that having one pixel format for each type of
> > > start code would be very reasonable, so I'd rather see an offset for now and
> > > perhaps a menu control later if needed to specify which types of start codes are
> > > supported.
> > >     
> > 
> > If I am reading the spec correctly, Annex B start code is specified to always
> > be the 3-byte start code: 0x000001.
> > 
> > The first NAL of a frame may have an additional 0x00, which effectively means
> > the start code of the first NAL of a frame _can_ be 4-byte 0x00000001,
> > in addition to the 3-byte 0x000001.  
> 
> That's not my understanding of the Annex B section (quoting the spec
> for reference):
> 
> "
> The byte stream format consists of a sequence of byte stream NAL unit
> syntax structures. Each byte stream NAL unit syntax structure contains
> one start code prefix followed by one nal_unit( NumBytesInNALunit )
> syntax structure. It may (and under some circumstances, it shall) also
> contain an additional zero_byte syntax element. It may also contain one
> or more additional trailing_zero_8bits syntax elements. When it is the
> first byte stream NAL unit in the bitstream, it may also contain one or
> more additional leading_zero_8bits syntax elements.
> "
> 
> To me, it looks like the start code can always be 0x000001 or
> 0x00000001. The first NAL can have extra leading 0x00 bytes, not the
> following ones, *but*, all NALs can have an arbitrary number of
> trailing 0x00 bytes. I guess stateless decoders unconditionally apply
> the "skip_leading_0_bytes logic" described in chapter B.1.1 of the spec
> to deal with these potential trailing 0x00 bytes.
> Stateful decoders (those parsing Annex B NAL headers) might skip this
> "skip leading 0x00 bytes" step for NAL > 0, but I suspect they just
> always skip leading 0x00 bytes because
> - it's simple enough
> - they anyway need the logic for the first NAL
> - that would require extra information about the NAL number which
>   stateless decoder usually don't have
> 
> > 
> > In other words, there aren't multiple Annex B type of start codes, and only
> > two options for the format of the slice: NAL units with or without a start code.  
> 
> There's also the AVC NAL header, and I'm pretty sure you can't use the
> "add leading 0x00 bytes" trick to align things as you wish with that
> one.
> 
> > 
> > Therefore, I can't see any point in having this offset.  
> 
> Assuming the Annex B start codes can contain an arbitrary number of
> leading 0x00 bytes, we can align things according to the codec
> expectations. But, as I said below, that implies exposing those
> alignment constraints.
> 
> >   
> > > > I'd also to discuss some concerns Ezequiel and I have regarding this
> > > > change. Some (most?) codec have alignment constraints on the buffer
> > > > they pass to the HW. For HW that support Annex B parsing, that's no
> > > > problem because the start of the buffer will be aligned on a page (I'm
> > > > assuming page alignment should cover 99% of the alignment constraints).
> > > > But HW that need to skip the start code will have to pass a non-aligned
> > > > buffer (annex B start code is 3 byte long).
> > > > Paul looked at the Cedrus driver and thinks it can be handled correctly
> > > > thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
> > > > but I fear this might be a problem on other HW.
> > > > 
> > > > So, I'm asking again, are we sure we want to handle the raw (no start
> > > > code) and annex-b cases using the same pixel format? If we do, what's
> > > > the plan to address those potential alignment constraints? Should
> > > > we provide a way for userspace to define where the start-code ends so it
> > > > can align things properly (annex B can be extended with extra 00
> > > > bytes at the beginning)? If we do that, that means userspace has to
> > > > know about those alignment constraints, or take something big enough.
> > > > Another option would be to use a bounce buffer when things are not
> > > > aligned properly.
> > > > 
> > > > I'd really like to get feedback on those points before sending a v4.    
> > > 
> > > Mhh I don't really know what would be best for handling that. Either way, I
> > > don't see how more pixel formats would really help solve the issue, so I'm still
> > > in favor of one.
> > > 
> > > Having a control that specifies an alignment constraint for the slice beginning
> > > could work (as long as we make it optional, although userspace should be
> > > required to abide by it when it is present).  
> 
> By making that, you put the burden on both sides of the stack:
> 
> - the kernel side will have to deal with the unaligned cases (using a
>   bounce buffer)
> - userspace apps/libs that want to avoid an extra copy will have to
>   check this constraint and align things properly anyway

I'd like to revise my statement. Ideally, the drivers should take care
of such mis-alignments or unsupported NAL header types by
copying/re-formatting the OUTPUT buffer so that existing apps work
out of the box when the driver is added, which means we'll have to take
care of that kernel-side anyway. Handling selection of the best
encoding-mode/NAL-header-type in userspace is useful if one wants to
improve perfs.

> 
> Plus, the alignment thing won't work for AVC headers, so I think we
> should actually have a control to select the NAL header type rather
> than expose some alignment constraints (or have one pix fmt per NAL
> header type, but you don't seem to like the idea, so I'm trying
> to find something else :-)).
> 
> And if we go for this option (control to select the NAL header type),
> I'm wondering why we're not making that NAL-header type selection
> mandatory from the start. We don't have to support all NAL headers at
> first (can be Annex B only), but, by making this control selection
> non-optional, we'll at least give a decent feedback to userspace
> (setting NAL header control fails because the selected NAL header type
> is not supported by the HW) instead of returning an error on the
> decoding operation (which, depending on how verbose the driver is, can
> be quite hard to figure out).
> 
> > > 
> > > I guess it's not such a high price to pay for a unified codec interface :)  
> 
> If by unified you mean exposing only one pixel format, then yes, it's
> unified. Doesn't make it easier to deal with from the userspace
> perspective IMHO.
> 
> To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
> not exposing the NAL header type is a good option. We've seen that
> providing alignment guarantees for HW expecting raw bitstream (without
> the start code) might become challenging at some point. So I'd opt for
> making this selection explicit. After all, it's just an extra control
> to set from userspace, and 2 extra switch-case: one to select the most
> appropriate NAL header type, and another one to fill the buffer with
> the appropriate header (if there's one).


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-26  7:30               ` Boris Brezillon
@ 2019-07-26  8:53                 ` Hans Verkuil
  2019-07-27  9:27                   ` Paul Kocialkowski
  2019-07-27 12:52                 ` Ezequiel Garcia
  1 sibling, 1 reply; 29+ messages in thread
From: Hans Verkuil @ 2019-07-26  8:53 UTC (permalink / raw)
  To: Boris Brezillon, Ezequiel Garcia
  Cc: Paul Kocialkowski, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On 7/26/19 9:30 AM, Boris Brezillon wrote:
> On Fri, 26 Jul 2019 08:28:28 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
>> On Thu, 25 Jul 2019 23:39:11 -0300
>> Ezequiel Garcia <ezequiel@collabora.com> wrote:
>>
>>> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:  
>>>> Hi,
>>>>
>>>> On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:    
>>>>> On Fri, 5 Jul 2019 19:16:18 +0200
>>>>> Boris Brezillon <boris.brezillon@collabora.com> wrote:
>>>>>     
>>>>>> On Fri, 05 Jul 2019 13:40:03 -0300
>>>>>> Ezequiel Garcia <ezequiel@collabora.com> wrote:
>>>>>>     
>>>>>>> Hi Boris, Paul,
>>>>>>>
>>>>>>> On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:      
>>>>>>>> Looks like some stateless decoders expect slices to be prefixed with
>>>>>>>> ANNEX B start codes (they most likely do some kind of bitstream parsing
>>>>>>>> and/or need that to delimit slices when doing per frame decoding).
>>>>>>>> Since skipping those start codes for dummy stateless decoders (those
>>>>>>>> expecting all params to be passed through controls) should be pretty
>>>>>>>> easy, let's mandate that all slices be prepended with ANNEX B start
>>>>>>>> codes.
>>>>>>>>
>>>>>>>> If we ever need to support AVC headers, we can add a new menu control
>>>>>>>> to select the type of NAL header to use.
>>>>>>>>
>>>>>>>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>>>>>>>> Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>>>>>>>> ---
>>>>>>>> Changes in v3:
>>>>>>>> * Add Paul's R-b
>>>>>>>>
>>>>>>>> Changes in v2:
>>>>>>>> * None
>>>>>>>> ---
>>>>>>>>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
>>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>>
>>>>>>>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>>> index 7a1947f5be96..3ae1367806cf 100644
>>>>>>>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>>>>>> @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>>>>>>      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
>>>>>>>>      documentation, refer to the above specification, unless there is
>>>>>>>>      an explicit comment stating otherwise.
>>>>>>>> +    All slices should be prepended with an ANNEX B start code.
>>>>>>>>          
>>>>>>>
>>>>>>> Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
>>>>>>> is specified to _not_ contain the ANNEX B start code.      
>>>>>>
>>>>>> Yep, we should provably rename the format.    
>>>>>
>>>>> Paul, are you okay with this rename?    
>>>>
>>>> Sorry for the very long response time here, I've had a hard time getting back
>>>> into the context of all this.
>>>>     
>>>>> s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
>>>>>
>>>>> or
>>>>>
>>>>> s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/    
>>>>
>>>> I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
>>>> to the beinning and after the start code. That would be more flexible, but one
>>>> downside could be decoders that some decoders only take a specific start code.
>>>>
>>>> On the other hand I don't think that having one pixel format for each type of
>>>> start code would be very reasonable, so I'd rather see an offset for now and
>>>> perhaps a menu control later if needed to specify which types of start codes are
>>>> supported.
>>>>     
>>>
>>> If I am reading the spec correctly, Annex B start code is specified to always
>>> be the 3-byte start code: 0x000001.
>>>
>>> The first NAL of a frame may have an additional 0x00, which effectively means
>>> the start code of the first NAL of a frame _can_ be 4-byte 0x00000001,
>>> in addition to the 3-byte 0x000001.  
>>
>> That's not my understanding of the Annex B section (quoting the spec
>> for reference):
>>
>> "
>> The byte stream format consists of a sequence of byte stream NAL unit
>> syntax structures. Each byte stream NAL unit syntax structure contains
>> one start code prefix followed by one nal_unit( NumBytesInNALunit )
>> syntax structure. It may (and under some circumstances, it shall) also
>> contain an additional zero_byte syntax element. It may also contain one
>> or more additional trailing_zero_8bits syntax elements. When it is the
>> first byte stream NAL unit in the bitstream, it may also contain one or
>> more additional leading_zero_8bits syntax elements.
>> "
>>
>> To me, it looks like the start code can always be 0x000001 or
>> 0x00000001. The first NAL can have extra leading 0x00 bytes, not the
>> following ones, *but*, all NALs can have an arbitrary number of
>> trailing 0x00 bytes. I guess stateless decoders unconditionally apply
>> the "skip_leading_0_bytes logic" described in chapter B.1.1 of the spec
>> to deal with these potential trailing 0x00 bytes.
>> Stateful decoders (those parsing Annex B NAL headers) might skip this
>> "skip leading 0x00 bytes" step for NAL > 0, but I suspect they just
>> always skip leading 0x00 bytes because
>> - it's simple enough
>> - they anyway need the logic for the first NAL
>> - that would require extra information about the NAL number which
>>   stateless decoder usually don't have
>>
>>>
>>> In other words, there aren't multiple Annex B type of start codes, and only
>>> two options for the format of the slice: NAL units with or without a start code.  
>>
>> There's also the AVC NAL header, and I'm pretty sure you can't use the
>> "add leading 0x00 bytes" trick to align things as you wish with that
>> one.
>>
>>>
>>> Therefore, I can't see any point in having this offset.  
>>
>> Assuming the Annex B start codes can contain an arbitrary number of
>> leading 0x00 bytes, we can align things according to the codec
>> expectations. But, as I said below, that implies exposing those
>> alignment constraints.
>>
>>>   
>>>>> I'd also to discuss some concerns Ezequiel and I have regarding this
>>>>> change. Some (most?) codec have alignment constraints on the buffer
>>>>> they pass to the HW. For HW that support Annex B parsing, that's no
>>>>> problem because the start of the buffer will be aligned on a page (I'm
>>>>> assuming page alignment should cover 99% of the alignment constraints).
>>>>> But HW that need to skip the start code will have to pass a non-aligned
>>>>> buffer (annex B start code is 3 byte long).
>>>>> Paul looked at the Cedrus driver and thinks it can be handled correctly
>>>>> thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
>>>>> but I fear this might be a problem on other HW.
>>>>>
>>>>> So, I'm asking again, are we sure we want to handle the raw (no start
>>>>> code) and annex-b cases using the same pixel format? If we do, what's
>>>>> the plan to address those potential alignment constraints? Should
>>>>> we provide a way for userspace to define where the start-code ends so it
>>>>> can align things properly (annex B can be extended with extra 00
>>>>> bytes at the beginning)? If we do that, that means userspace has to
>>>>> know about those alignment constraints, or take something big enough.
>>>>> Another option would be to use a bounce buffer when things are not
>>>>> aligned properly.
>>>>>
>>>>> I'd really like to get feedback on those points before sending a v4.    
>>>>
>>>> Mhh I don't really know what would be best for handling that. Either way, I
>>>> don't see how more pixel formats would really help solve the issue, so I'm still
>>>> in favor of one.
>>>>
>>>> Having a control that specifies an alignment constraint for the slice beginning
>>>> could work (as long as we make it optional, although userspace should be
>>>> required to abide by it when it is present).  
>>
>> By making that, you put the burden on both sides of the stack:
>>
>> - the kernel side will have to deal with the unaligned cases (using a
>>   bounce buffer)
>> - userspace apps/libs that want to avoid an extra copy will have to
>>   check this constraint and align things properly anyway
> 
> I'd like to revise my statement. Ideally, the drivers should take care
> of such mis-alignments or unsupported NAL header types by
> copying/re-formatting the OUTPUT buffer so that existing apps work
> out of the box when the driver is added, which means we'll have to take
> care of that kernel-side anyway. Handling selection of the best
> encoding-mode/NAL-header-type in userspace is useful if one wants to
> improve perfs.

Just my 5 cents:

You very much want to avoid the situation where drivers have to copy or
reformat the OUTPUT buffer. That's asking for problems, not to mention
that it is no longer zero-copy.

> 
>>
>> Plus, the alignment thing won't work for AVC headers, so I think we
>> should actually have a control to select the NAL header type rather
>> than expose some alignment constraints (or have one pix fmt per NAL
>> header type, but you don't seem to like the idea, so I'm trying
>> to find something else :-)).
>>
>> And if we go for this option (control to select the NAL header type),
>> I'm wondering why we're not making that NAL-header type selection
>> mandatory from the start. We don't have to support all NAL headers at
>> first (can be Annex B only), but, by making this control selection
>> non-optional, we'll at least give a decent feedback to userspace
>> (setting NAL header control fails because the selected NAL header type
>> is not supported by the HW) instead of returning an error on the
>> decoding operation (which, depending on how verbose the driver is, can
>> be quite hard to figure out).

This sounds reasonable.

This control should be mandatory, and it should be referred to from
the H264/5 pixelformat definitions (see also https://patchwork.linuxtv.org/patch/57709/).

Regards,

	Hans

>>
>>>>
>>>> I guess it's not such a high price to pay for a unified codec interface :)  
>>
>> If by unified you mean exposing only one pixel format, then yes, it's
>> unified. Doesn't make it easier to deal with from the userspace
>> perspective IMHO.
>>
>> To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
>> not exposing the NAL header type is a good option. We've seen that
>> providing alignment guarantees for HW expecting raw bitstream (without
>> the start code) might become challenging at some point. So I'd opt for
>> making this selection explicit. After all, it's just an extra control
>> to set from userspace, and 2 extra switch-case: one to select the most
>> appropriate NAL header type, and another one to fill the buffer with
>> the appropriate header (if there's one).
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-26  8:53                 ` Hans Verkuil
@ 2019-07-27  9:27                   ` Paul Kocialkowski
  2019-07-27  9:46                     ` Boris Brezillon
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-27  9:27 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Boris Brezillon, Ezequiel Garcia, Hans Verkuil, Tomasz Figa,
	Nicolas Dufresne, Mauro Carvalho Chehab, Laurent Pinchart,
	Sakari Ailus, linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

Hi,

On Fri 26 Jul 19, 10:53, Hans Verkuil wrote:
> On 7/26/19 9:30 AM, Boris Brezillon wrote:
> > On Fri, 26 Jul 2019 08:28:28 +0200
> > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > 
> >> On Thu, 25 Jul 2019 23:39:11 -0300
> >> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> >>
> >>> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:  
> >>>> Having a control that specifies an alignment constraint for the slice beginning
> >>>> could work (as long as we make it optional, although userspace should be
> >>>> required to abide by it when it is present).  
> >>
> >> By making that, you put the burden on both sides of the stack:
> >>
> >> - the kernel side will have to deal with the unaligned cases (using a
> >>   bounce buffer)
> >> - userspace apps/libs that want to avoid an extra copy will have to
> >>   check this constraint and align things properly anyway
> > 
> > I'd like to revise my statement. Ideally, the drivers should take care
> > of such mis-alignments or unsupported NAL header types by
> > copying/re-formatting the OUTPUT buffer so that existing apps work
> > out of the box when the driver is added, which means we'll have to take
> > care of that kernel-side anyway. Handling selection of the best
> > encoding-mode/NAL-header-type in userspace is useful if one wants to
> > improve perfs.
> 
> Just my 5 cents:
> 
> You very much want to avoid the situation where drivers have to copy or
> reformat the OUTPUT buffer. That's asking for problems, not to mention
> that it is no longer zero-copy.

I definitely agree on that, since such constraints are likely to exist, we are
certainly better off exposing them to userspace.

I understand that it does add some complexity and asks for userspace code to be
more complex, but let's be realistic: this is a complex topic with lots of
hardware-specific details getting in the way. I don't think we can act as if
things were simpler.

My feeling is that we should keep trying to find "as elegant as possible" ways
to expose constraints instead of putting strict and easy definitions for
userspace that end up making drivers perform sub-optimally.

Since the initial cedrus proposal, we have covered more ground to allow the
API to fit the rockchip case, without conflicting with cedrus. We're now facing
new constraints and issue and I really think we should keep trying to integrate
them in the unified API.

> >> Plus, the alignment thing won't work for AVC headers, so I think we
> >> should actually have a control to select the NAL header type rather
> >> than expose some alignment constraints (or have one pix fmt per NAL
> >> header type, but you don't seem to like the idea, so I'm trying
> >> to find something else :-)).
> >>
> >> And if we go for this option (control to select the NAL header type),
> >> I'm wondering why we're not making that NAL-header type selection
> >> mandatory from the start. We don't have to support all NAL headers at
> >> first (can be Annex B only), but, by making this control selection
> >> non-optional, we'll at least give a decent feedback to userspace
> >> (setting NAL header control fails because the selected NAL header type
> >> is not supported by the HW) instead of returning an error on the
> >> decoding operation (which, depending on how verbose the driver is, can
> >> be quite hard to figure out).
> 
> This sounds reasonable.
> 
> This control should be mandatory, and it should be referred to from
> the H264/5 pixelformat definitions (see also https://patchwork.linuxtv.org/patch/57709/).

I am growing confused about one thing: are we talking about selecting
the type of *start code* (which can have a variable number of heading and
trailing zeros depending on the situation) or about the *NAL header type*, which
follows the start code?

I like the idea of drivers providing what types of start codes they can support,
but I don't really see how it helps regarding the alignment constraints and how
it relates to the zero-padding.

Cheers,

Paul

> Regards,
> 
> 	Hans
> 
> >>
> >>>>
> >>>> I guess it's not such a high price to pay for a unified codec interface :)  
> >>
> >> If by unified you mean exposing only one pixel format, then yes, it's
> >> unified. Doesn't make it easier to deal with from the userspace
> >> perspective IMHO.
> >>
> >> To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
> >> not exposing the NAL header type is a good option. We've seen that
> >> providing alignment guarantees for HW expecting raw bitstream (without
> >> the start code) might become challenging at some point. So I'd opt for
> >> making this selection explicit. After all, it's just an extra control
> >> to set from userspace, and 2 extra switch-case: one to select the most
> >> appropriate NAL header type, and another one to fill the buffer with
> >> the appropriate header (if there's one).
> > 
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-27  9:27                   ` Paul Kocialkowski
@ 2019-07-27  9:46                     ` Boris Brezillon
  2019-07-29 13:25                       ` Paul Kocialkowski
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-27  9:46 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: Hans Verkuil, Ezequiel Garcia, Hans Verkuil, Tomasz Figa,
	Nicolas Dufresne, Mauro Carvalho Chehab, Laurent Pinchart,
	Sakari Ailus, linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Sat, 27 Jul 2019 11:27:43 +0200
Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:

> Hi,
> 
> On Fri 26 Jul 19, 10:53, Hans Verkuil wrote:
> > On 7/26/19 9:30 AM, Boris Brezillon wrote:  
> > > On Fri, 26 Jul 2019 08:28:28 +0200
> > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > >   
> > >> On Thu, 25 Jul 2019 23:39:11 -0300
> > >> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > >>  
> > >>> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:    
> > >>>> Having a control that specifies an alignment constraint for the slice beginning
> > >>>> could work (as long as we make it optional, although userspace should be
> > >>>> required to abide by it when it is present).    
> > >>
> > >> By making that, you put the burden on both sides of the stack:
> > >>
> > >> - the kernel side will have to deal with the unaligned cases (using a
> > >>   bounce buffer)
> > >> - userspace apps/libs that want to avoid an extra copy will have to
> > >>   check this constraint and align things properly anyway  
> > > 
> > > I'd like to revise my statement. Ideally, the drivers should take care
> > > of such mis-alignments or unsupported NAL header types by
> > > copying/re-formatting the OUTPUT buffer so that existing apps work
> > > out of the box when the driver is added, which means we'll have to take
> > > care of that kernel-side anyway. Handling selection of the best
> > > encoding-mode/NAL-header-type in userspace is useful if one wants to
> > > improve perfs.  
> > 
> > Just my 5 cents:
> > 
> > You very much want to avoid the situation where drivers have to copy or
> > reformat the OUTPUT buffer. That's asking for problems, not to mention
> > that it is no longer zero-copy.  
> 
> I definitely agree on that, since such constraints are likely to exist, we are
> certainly better off exposing them to userspace.
> 
> I understand that it does add some complexity and asks for userspace code to be
> more complex, but let's be realistic: this is a complex topic with lots of
> hardware-specific details getting in the way. I don't think we can act as if
> things were simpler.
> 
> My feeling is that we should keep trying to find "as elegant as possible" ways
> to expose constraints instead of putting strict and easy definitions for
> userspace that end up making drivers perform sub-optimally.
> 
> Since the initial cedrus proposal, we have covered more ground to allow the
> API to fit the rockchip case, without conflicting with cedrus. We're now facing
> new constraints and issue and I really think we should keep trying to integrate
> them in the unified API.
> 
> > >> Plus, the alignment thing won't work for AVC headers, so I think we
> > >> should actually have a control to select the NAL header type rather
> > >> than expose some alignment constraints (or have one pix fmt per NAL
> > >> header type, but you don't seem to like the idea, so I'm trying
> > >> to find something else :-)).
> > >>
> > >> And if we go for this option (control to select the NAL header type),
> > >> I'm wondering why we're not making that NAL-header type selection
> > >> mandatory from the start. We don't have to support all NAL headers at
> > >> first (can be Annex B only), but, by making this control selection
> > >> non-optional, we'll at least give a decent feedback to userspace
> > >> (setting NAL header control fails because the selected NAL header type
> > >> is not supported by the HW) instead of returning an error on the
> > >> decoding operation (which, depending on how verbose the driver is, can
> > >> be quite hard to figure out).  
> > 
> > This sounds reasonable.
> > 
> > This control should be mandatory, and it should be referred to from
> > the H264/5 pixelformat definitions (see also https://patchwork.linuxtv.org/patch/57709/).  
> 
> I am growing confused about one thing: are we talking about selecting
> the type of *start code* (which can have a variable number of heading and
> trailing zeros depending on the situation) or about the *NAL header type*, which
> follows the start code?

We're talking about start codes, but Nicolas called them nal_header in
this email [1], so I thought it was the appropriate naming.

> 
> I like the idea of drivers providing what types of start codes they can support,
> but I don't really see how it helps regarding the alignment constraints and how
> it relates to the zero-padding.

It does help with alignment constraints because buffers allocated by
the driver are usually matching the HW alignment constraints and by
passing the type of NAL header (or start code if you prefer) we now
guarantee that the raw bitstream (when in NO_NAL_HEADER is selected) is
placed at the beginning of the buffer. Doesn't solve the case of
imported buffers, but that problem is orthogonal I think (it's a
problem we already have right now, and would indeed require some way
to expose HW alignment constraints).

Not sure what the zero padding issue is. If you know the type of start
code, you don't have add extra 0 at the beginning to meet the alignment
constraints. If you're talking about padding bytes added at the end of
the bitstream (there's such a constraint on the rkvdec block), I think
that's something driver specific and should be handled by the driver.

[1]https://www.mail-archive.com/linux-media@vger.kernel.org/msg146836.html

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-26  7:30               ` Boris Brezillon
  2019-07-26  8:53                 ` Hans Verkuil
@ 2019-07-27 12:52                 ` Ezequiel Garcia
  2019-07-27 13:49                   ` Boris Brezillon
  1 sibling, 1 reply; 29+ messages in thread
From: Ezequiel Garcia @ 2019-07-27 12:52 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Paul Kocialkowski, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Fri, 2019-07-26 at 09:30 +0200, Boris Brezillon wrote:
> On Fri, 26 Jul 2019 08:28:28 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > On Thu, 25 Jul 2019 23:39:11 -0300
> > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > 
> > > On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:  
> > > > Hi,
> > > > 
> > > > On Thu 25 Jul 19, 08:42, Boris Brezillon wrote:    
> > > > > On Fri, 5 Jul 2019 19:16:18 +0200
> > > > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > > >     
> > > > > > On Fri, 05 Jul 2019 13:40:03 -0300
> > > > > > Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > > >     
> > > > > > > Hi Boris, Paul,
> > > > > > > 
> > > > > > > On Wed, 2019-07-03 at 14:28 +0200, Boris Brezillon wrote:      
> > > > > > > > Looks like some stateless decoders expect slices to be prefixed with
> > > > > > > > ANNEX B start codes (they most likely do some kind of bitstream parsing
> > > > > > > > and/or need that to delimit slices when doing per frame decoding).
> > > > > > > > Since skipping those start codes for dummy stateless decoders (those
> > > > > > > > expecting all params to be passed through controls) should be pretty
> > > > > > > > easy, let's mandate that all slices be prepended with ANNEX B start
> > > > > > > > codes.
> > > > > > > > 
> > > > > > > > If we ever need to support AVC headers, we can add a new menu control
> > > > > > > > to select the type of NAL header to use.
> > > > > > > > 
> > > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > > > > > > Reviewed-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > > ---
> > > > > > > > Changes in v3:
> > > > > > > > * Add Paul's R-b
> > > > > > > > 
> > > > > > > > Changes in v2:
> > > > > > > > * None
> > > > > > > > ---
> > > > > > > >  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 1 +
> > > > > > > >  1 file changed, 1 insertion(+)
> > > > > > > > 
> > > > > > > > diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > index 7a1947f5be96..3ae1367806cf 100644
> > > > > > > > --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> > > > > > > > @@ -1726,6 +1726,7 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> > > > > > > >      :ref:`h264`, section 7.4.3 "Slice Header Semantics". For further
> > > > > > > >      documentation, refer to the above specification, unless there is
> > > > > > > >      an explicit comment stating otherwise.
> > > > > > > > +    All slices should be prepended with an ANNEX B start code.
> > > > > > > >          
> > > > > > > 
> > > > > > > Currently, the H264 slice V4L2_PIX_FMT_H264_SLICE_RAW,
> > > > > > > is specified to _not_ contain the ANNEX B start code.      
> > > > > > 
> > > > > > Yep, we should provably rename the format.    
> > > > > 
> > > > > Paul, are you okay with this rename?    
> > > > 
> > > > Sorry for the very long response time here, I've had a hard time getting back
> > > > into the context of all this.
> > > >     
> > > > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE/
> > > > > 
> > > > > or
> > > > > 
> > > > > s/V4L2_PIX_FMT_H264_SLICE_RAW/V4L2_PIX_FMT_H264_SLICE_ANNEXB/    
> > > > 
> > > > I'd be in favor of the former (V4L2_PIX_FMT_H264_SLICE) and passing offsets
> > > > to the beinning and after the start code. That would be more flexible, but one
> > > > downside could be decoders that some decoders only take a specific start code.
> > > > 
> > > > On the other hand I don't think that having one pixel format for each type of
> > > > start code would be very reasonable, so I'd rather see an offset for now and
> > > > perhaps a menu control later if needed to specify which types of start codes are
> > > > supported.
> > > >     
> > > 
> > > If I am reading the spec correctly, Annex B start code is specified to always
> > > be the 3-byte start code: 0x000001.
> > > 
> > > The first NAL of a frame may have an additional 0x00, which effectively means
> > > the start code of the first NAL of a frame _can_ be 4-byte 0x00000001,
> > > in addition to the 3-byte 0x000001.  
> > 
> > That's not my understanding of the Annex B section (quoting the spec
> > for reference):
> > 
> > "
> > The byte stream format consists of a sequence of byte stream NAL unit
> > syntax structures. Each byte stream NAL unit syntax structure contains
> > one start code prefix followed by one nal_unit( NumBytesInNALunit )
> > syntax structure. It may (and under some circumstances, it shall) also
> > contain an additional zero_byte syntax element. It may also contain one
> > or more additional trailing_zero_8bits syntax elements. When it is the
> > first byte stream NAL unit in the bitstream, it may also contain one or
> > more additional leading_zero_8bits syntax elements.
> > "
> > 

Right. I wonder what the "may or shall" part is really specifying.

However, note that the table B.1.1 and its comments B.1.2 is might
be interpreted differently. To me, there's a difference between the different
syntax elements (zero-bytes elements vs. the start code prefix element).

This is what it says about the zero_byte syntax element:

"""
zero_byte is a single byte equal to 0x00.
When any of the following conditions are fulfilled, the zero_byte syntax element shall be present.
– the nal_unit_type within the nal_unit( ) is equal to 7 (sequence parameter set) or 8 (picture parameter set)
– the byte stream NAL unit syntax structure contains the first NAL unit of an access unit in decoding order, as
specified by subclause 7.4.1.2.3.
"""

We are not dealing with SPS or PPS here, but we are discussing multislice content,
so IIUC this syntax element would be part of our bitstream pixfmt.

And this is what it says about the start code prefix:

"""
start_code_prefix_one_3bytes is a fixed-value sequence of 3 bytes equal to 0x000001. This syntax element is called a
start code prefix.
"""

These elements are used in such a way that it might seem
you have two start codes options 3-byte or 4-byte, though.

> > To me, it looks like the start code can always be 0x000001 or
> > 0x00000001. The first NAL can have extra leading 0x00 bytes, not the
> > following ones, *but*, all NALs can have an arbitrary number of
> > trailing 0x00 bytes. I guess stateless decoders unconditionally apply
> > the "skip_leading_0_bytes logic" described in chapter B.1.1 of the spec
> > to deal with these potential trailing 0x00 bytes.
> > Stateful decoders (those parsing Annex B NAL headers) might skip this
> > "skip leading 0x00 bytes" step for NAL > 0, but I suspect they just
> > always skip leading 0x00 bytes because
> > - it's simple enough
> > - they anyway need the logic for the first NAL
> > - that would require extra information about the NAL number which
> >   stateless decoder usually don't have
> > 
> > > In other words, there aren't multiple Annex B type of start codes, and only
> > > two options for the format of the slice: NAL units with or without a start code.  
> > 
> > There's also the AVC NAL header, and I'm pretty sure you can't use the
> > "add leading 0x00 bytes" trick to align things as you wish with that
> > one.
> > 
> > > Therefore, I can't see any point in having this offset.  
> > 
> > Assuming the Annex B start codes can contain an arbitrary number of
> > leading 0x00 bytes, we can align things according to the codec
> > expectations. But, as I said below, that implies exposing those
> > alignment constraints.
> > 
> > >   
> > > > > I'd also to discuss some concerns Ezequiel and I have regarding this
> > > > > change. Some (most?) codec have alignment constraints on the buffer
> > > > > they pass to the HW. For HW that support Annex B parsing, that's no
> > > > > problem because the start of the buffer will be aligned on a page (I'm
> > > > > assuming page alignment should cover 99% of the alignment constraints).
> > > > > But HW that need to skip the start code will have to pass a non-aligned
> > > > > buffer (annex B start code is 3 byte long).
> > > > > Paul looked at the Cedrus driver and thinks it can be handled correctly
> > > > > thanks to the VE_H264_VLD_OFFSET field (which encodes an offset in bit),
> > > > > but I fear this might be a problem on other HW.
> > > > > 
> > > > > So, I'm asking again, are we sure we want to handle the raw (no start
> > > > > code) and annex-b cases using the same pixel format? If we do, what's
> > > > > the plan to address those potential alignment constraints? Should
> > > > > we provide a way for userspace to define where the start-code ends so it
> > > > > can align things properly (annex B can be extended with extra 00
> > > > > bytes at the beginning)? If we do that, that means userspace has to
> > > > > know about those alignment constraints, or take something big enough.
> > > > > Another option would be to use a bounce buffer when things are not
> > > > > aligned properly.
> > > > > 
> > > > > I'd really like to get feedback on those points before sending a v4.    
> > > > 
> > > > Mhh I don't really know what would be best for handling that. Either way, I
> > > > don't see how more pixel formats would really help solve the issue, so I'm still
> > > > in favor of one.
> > > > 
> > > > Having a control that specifies an alignment constraint for the slice beginning
> > > > could work (as long as we make it optional, although userspace should be
> > > > required to abide by it when it is present).  
> > 
> > By making that, you put the burden on both sides of the stack:
> > 
> > - the kernel side will have to deal with the unaligned cases (using a
> >   bounce buffer)
> > - userspace apps/libs that want to avoid an extra copy will have to
> >   check this constraint and align things properly anyway
> 
> I'd like to revise my statement. Ideally, the drivers should take care
> of such mis-alignments or unsupported NAL header types by
> copying/re-formatting the OUTPUT buffer so that existing apps work
> out of the box when the driver is added, which means we'll have to take
> care of that kernel-side anyway. Handling selection of the best
> encoding-mode/NAL-header-type in userspace is useful if one wants to
> improve perfs.
> 
> > Plus, the alignment thing won't work for AVC headers, so I think we
> > should actually have a control to select the NAL header type rather
> > than expose some alignment constraints (or have one pix fmt per NAL
> > header type, but you don't seem to like the idea, so I'm trying
> > to find something else :-)).
> > 
> > And if we go for this option (control to select the NAL header type),
> > I'm wondering why we're not making that NAL-header type selection
> > mandatory from the start. We don't have to support all NAL headers at
> > first (can be Annex B only), but, by making this control selection
> > non-optional, we'll at least give a decent feedback to userspace
> > (setting NAL header control fails because the selected NAL header type
> > is not supported by the HW) instead of returning an error on the
> > decoding operation (which, depending on how verbose the driver is, can
> > be quite hard to figure out).
> > 
> > > > I guess it's not such a high price to pay for a unified codec interface :)  
> > 
> > If by unified you mean exposing only one pixel format, then yes, it's
> > unified. Doesn't make it easier to deal with from the userspace
> > perspective IMHO.
> > 
> > To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
> > not exposing the NAL header type is a good option. We've seen that
> > providing alignment guarantees for HW expecting raw bitstream (without
> > the start code) might become challenging at some point. So I'd opt for
> > making this selection explicit. After all, it's just an extra control
> > to set from userspace, and 2 extra switch-case: one to select the most
> > appropriate NAL header type, and another one to fill the buffer with
> > the appropriate header (if there's one).

I must admit I'm confused by what you mean about NAL header type, I thought
we weren't trying to support AVC, and only the Annex B bitstream format.
 
That said, I don't see the interface getting any simpler with a unified
pixfmt, given the menu control to expose frame or slice decoding.

Proper applications need to support both modes anyway, where in the latter
it'll have to parse the bitstream to extract the slices. What's so bad
about just supporting an extra pixel format, where the slices are stripped
of its start codes and zero-byte elements?

And how come this is any more complex than exposing alignment constraints,
so that applications can artificially add leading_zero_8bits or trailing_zero_8bits
elements to comply with a driver dma alignment. To be honest, the more I think
about it, the more this option sounds just horrible :-)

To me, it's far simpler to just expose what the devices support. If a driver
will expect to parse the bitstream, and accepts multi-slice content,
we expose that as a bitstream pixfmt. And if another driver expects
no-start-code slices, then we have another pixfmt.

Regards,
Eze


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-27 12:52                 ` Ezequiel Garcia
@ 2019-07-27 13:49                   ` Boris Brezillon
  2019-07-29 13:33                     ` Paul Kocialkowski
  0 siblings, 1 reply; 29+ messages in thread
From: Boris Brezillon @ 2019-07-27 13:49 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Paul Kocialkowski, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Sat, 27 Jul 2019 09:52:24 -0300
Ezequiel Garcia <ezequiel@collabora.com> wrote:


> > > 
> > > That's not my understanding of the Annex B section (quoting the spec
> > > for reference):
> > > 
> > > "
> > > The byte stream format consists of a sequence of byte stream NAL unit
> > > syntax structures. Each byte stream NAL unit syntax structure contains
> > > one start code prefix followed by one nal_unit( NumBytesInNALunit )
> > > syntax structure. It may (and under some circumstances, it shall) also
> > > contain an additional zero_byte syntax element. It may also contain one
> > > or more additional trailing_zero_8bits syntax elements. When it is the
> > > first byte stream NAL unit in the bitstream, it may also contain one or
> > > more additional leading_zero_8bits syntax elements.
> > > "
> > >   
> 
> Right. I wonder what the "may or shall" part is really specifying.
> 
> However, note that the table B.1.1 and its comments B.1.2 is might
> be interpreted differently. To me, there's a difference between the different
> syntax elements (zero-bytes elements vs. the start code prefix element).
> 
> This is what it says about the zero_byte syntax element:
> 
> """
> zero_byte is a single byte equal to 0x00.
> When any of the following conditions are fulfilled, the zero_byte syntax element shall be present.
> – the nal_unit_type within the nal_unit( ) is equal to 7 (sequence parameter set) or 8 (picture parameter set)
> – the byte stream NAL unit syntax structure contains the first NAL unit of an access unit in decoding order, as
> specified by subclause 7.4.1.2.3.
> """
> 
> We are not dealing with SPS or PPS here, but we are discussing multislice content,
> so IIUC this syntax element would be part of our bitstream pixfmt.
> 
> And this is what it says about the start code prefix:
> 
> """
> start_code_prefix_one_3bytes is a fixed-value sequence of 3 bytes equal to 0x000001. This syntax element is called a
> start code prefix.
> """
> 
> These elements are used in such a way that it might seem
> you have two start codes options 3-byte or 4-byte, though.

This is correct, but I was actually referring to:

"
It may also contain one or more additional trailing_zero_8bits syntax
elements. When it is the first byte stream NAL unit in the
bitstream, it may also contain one or more additional
leading_zero_8bits syntax elements.
"

which would allow userspace to put additional zeros at the beginning
in order to fulfill the HW alignment constraints. I'm not saying this is
a good solution, just saying it can be done.


> > > > > I guess it's not such a high price to pay for a unified codec interface :)    
> > > 
> > > If by unified you mean exposing only one pixel format, then yes, it's
> > > unified. Doesn't make it easier to deal with from the userspace
> > > perspective IMHO.
> > > 
> > > To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
> > > not exposing the NAL header type is a good option. We've seen that
> > > providing alignment guarantees for HW expecting raw bitstream (without
> > > the start code) might become challenging at some point. So I'd opt for
> > > making this selection explicit. After all, it's just an extra control
> > > to set from userspace, and 2 extra switch-case: one to select the most
> > > appropriate NAL header type, and another one to fill the buffer with
> > > the appropriate header (if there's one).  
> 
> I must admit I'm confused by what you mean about NAL header type, I thought
> we weren't trying to support AVC, and only the Annex B bitstream format.

I'm not trying to support AVC headers, but designing something that
allows us to extend the interface and support that case (if we ever need
to) is a good thing IMO.

>  
> That said, I don't see the interface getting any simpler with a unified
> pixfmt, given the menu control to expose frame or slice decoding.

I agree. I think it's pretty much the same complexity anyway
('additional ctrl to set the start-code/header/preamble type' vs
'additional pixfmt'), so it's mostly a matter of taste.

> 
> Proper applications need to support both modes anyway, where in the latter
> it'll have to parse the bitstream to extract the slices.

Hm, the current uAPI forces us to pass slice offsets, which means we
have to parse the bitstream anyway. I think we should keep it like that
because I don't think we can assume the HW is smart enough to detect
slice boundaries.

> What's so bad
> about just supporting an extra pixel format, where the slices are stripped
> of its start codes and zero-byte elements?

I'm not opposed to that solution, but Paul is, so I'm just trying to
find something that makes everyone happy, hence the "NAL header
type" (or "start code type"/"preamble type" if you prefer, though
it's not really a start code for AVC) proposal.

> 
> And how come this is any more complex than exposing alignment constraints,
> so that applications can artificially add leading_zero_8bits or trailing_zero_8bits
> elements to comply with a driver dma alignment. To be honest, the more I think
> about it, the more this option sounds just horrible :-)

Also think this option is more complicated and less future proof
(AFAICT, AVC headers/start-codes can't be extended like Annex B ones).

> 
> To me, it's far simpler to just expose what the devices support. If a driver
> will expect to parse the bitstream, and accepts multi-slice content,
> we expose that as a bitstream pixfmt.

Those 2 problems are orthogonal. You could have HW dealing with
multi-slice content while still requiring things to be passed without
Annex B start codes. The H264 pixfmt is really just about NAL headers:
No NAL headers vs Annex B headers vs AVC headers.

> And if another driver expects
> no-start-code slices, then we have another pixfmt.

Yep, but I don't want to argue endlessly on this, and I'd be fine with
the "NAL header/preamble/start-code/whatever type ctrl" too.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-27  9:46                     ` Boris Brezillon
@ 2019-07-29 13:25                       ` Paul Kocialkowski
  2019-07-29 14:19                         ` Boris Brezillon
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-29 13:25 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Hans Verkuil, Ezequiel Garcia, Hans Verkuil, Tomasz Figa,
	Nicolas Dufresne, Mauro Carvalho Chehab, Laurent Pinchart,
	Sakari Ailus, linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

[-- Attachment #1: Type: text/plain, Size: 6882 bytes --]

Hi,

On Sat 27 Jul 19, 11:46, Boris Brezillon wrote:
> On Sat, 27 Jul 2019 11:27:43 +0200
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> 
> > Hi,
> > 
> > On Fri 26 Jul 19, 10:53, Hans Verkuil wrote:
> > > On 7/26/19 9:30 AM, Boris Brezillon wrote:  
> > > > On Fri, 26 Jul 2019 08:28:28 +0200
> > > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > >   
> > > >> On Thu, 25 Jul 2019 23:39:11 -0300
> > > >> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > >>  
> > > >>> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:    
> > > >>>> Having a control that specifies an alignment constraint for the slice beginning
> > > >>>> could work (as long as we make it optional, although userspace should be
> > > >>>> required to abide by it when it is present).    
> > > >>
> > > >> By making that, you put the burden on both sides of the stack:
> > > >>
> > > >> - the kernel side will have to deal with the unaligned cases (using a
> > > >>   bounce buffer)
> > > >> - userspace apps/libs that want to avoid an extra copy will have to
> > > >>   check this constraint and align things properly anyway  
> > > > 
> > > > I'd like to revise my statement. Ideally, the drivers should take care
> > > > of such mis-alignments or unsupported NAL header types by
> > > > copying/re-formatting the OUTPUT buffer so that existing apps work
> > > > out of the box when the driver is added, which means we'll have to take
> > > > care of that kernel-side anyway. Handling selection of the best
> > > > encoding-mode/NAL-header-type in userspace is useful if one wants to
> > > > improve perfs.  
> > > 
> > > Just my 5 cents:
> > > 
> > > You very much want to avoid the situation where drivers have to copy or
> > > reformat the OUTPUT buffer. That's asking for problems, not to mention
> > > that it is no longer zero-copy.  
> > 
> > I definitely agree on that, since such constraints are likely to exist, we are
> > certainly better off exposing them to userspace.
> > 
> > I understand that it does add some complexity and asks for userspace code to be
> > more complex, but let's be realistic: this is a complex topic with lots of
> > hardware-specific details getting in the way. I don't think we can act as if
> > things were simpler.
> > 
> > My feeling is that we should keep trying to find "as elegant as possible" ways
> > to expose constraints instead of putting strict and easy definitions for
> > userspace that end up making drivers perform sub-optimally.
> > 
> > Since the initial cedrus proposal, we have covered more ground to allow the
> > API to fit the rockchip case, without conflicting with cedrus. We're now facing
> > new constraints and issue and I really think we should keep trying to integrate
> > them in the unified API.
> > 
> > > >> Plus, the alignment thing won't work for AVC headers, so I think we
> > > >> should actually have a control to select the NAL header type rather
> > > >> than expose some alignment constraints (or have one pix fmt per NAL
> > > >> header type, but you don't seem to like the idea, so I'm trying
> > > >> to find something else :-)).
> > > >>
> > > >> And if we go for this option (control to select the NAL header type),
> > > >> I'm wondering why we're not making that NAL-header type selection
> > > >> mandatory from the start. We don't have to support all NAL headers at
> > > >> first (can be Annex B only), but, by making this control selection
> > > >> non-optional, we'll at least give a decent feedback to userspace
> > > >> (setting NAL header control fails because the selected NAL header type
> > > >> is not supported by the HW) instead of returning an error on the
> > > >> decoding operation (which, depending on how verbose the driver is, can
> > > >> be quite hard to figure out).  
> > > 
> > > This sounds reasonable.
> > > 
> > > This control should be mandatory, and it should be referred to from
> > > the H264/5 pixelformat definitions (see also https://patchwork.linuxtv.org/patch/57709/).  
> > 
> > I am growing confused about one thing: are we talking about selecting
> > the type of *start code* (which can have a variable number of heading and
> > trailing zeros depending on the situation) or about the *NAL header type*, which
> > follows the start code?
> 
> We're talking about start codes, but Nicolas called them nal_header in
> this email [1], so I thought it was the appropriate naming.

Okay, the representation I had in mind was:
[zeros][start code][nal unit header][zeros][nal unit data]

but maybe I'm mixing things up on my side.

> > I like the idea of drivers providing what types of start codes they can support,
> > but I don't really see how it helps regarding the alignment constraints and how
> > it relates to the zero-padding.
> 
> It does help with alignment constraints because buffers allocated by
> the driver are usually matching the HW alignment constraints and by
> passing the type of NAL header (or start code if you prefer) we now
> guarantee that the raw bitstream (when in NO_NAL_HEADER is selected) is
> placed at the beginning of the buffer.

I thought the issue at hand was that we needed the nal unit header to start at
an aligned address while still needing a start code of 3 bytes. I feel like I'm
missing something in my understanding of the issue here.

> Doesn't solve the case of
> imported buffers, but that problem is orthogonal I think (it's a
> problem we already have right now, and would indeed require some way
> to expose HW alignment constraints).

If we expose the constraints explicitly, then we can honestly say that it's up
to user-space to abide by them so there should be no particular difference
between imported and allocated buffers. Userspace just has to know what it's
doing. And drivers chould refuse imports that don't follow the reported
constraints (or are outside the pool dedicated to the VPU).

> Not sure what the zero padding issue is. If you know the type of start
> code, you don't have add extra 0 at the beginning to meet the alignment
> constraints. If you're talking about padding bytes added at the end of
> the bitstream (there's such a constraint on the rkvdec block), I think
> that's something driver specific and should be handled by the driver.

My point would rather be that it is (as far as I understood) valid regarding the
H.264 spec to have extra zeros added by userspace (whatever the reason), so even
if the type of start code is reported, it doesn't imply that we know the length
of the heading zeros and start code, so the issue remains.

Cheers,

Paul

> [1]https://www.mail-archive.com/linux-media@vger.kernel.org/msg146836.html

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-27 13:49                   ` Boris Brezillon
@ 2019-07-29 13:33                     ` Paul Kocialkowski
  0 siblings, 0 replies; 29+ messages in thread
From: Paul Kocialkowski @ 2019-07-29 13:33 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Ezequiel Garcia, Hans Verkuil, Tomasz Figa, Nicolas Dufresne,
	Mauro Carvalho Chehab, Laurent Pinchart, Sakari Ailus,
	linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

[-- Attachment #1: Type: text/plain, Size: 7391 bytes --]

Hi,

On Sat 27 Jul 19, 15:49, Boris Brezillon wrote:
> On Sat, 27 Jul 2019 09:52:24 -0300
> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> 
> 
> > > > 
> > > > That's not my understanding of the Annex B section (quoting the spec
> > > > for reference):
> > > > 
> > > > "
> > > > The byte stream format consists of a sequence of byte stream NAL unit
> > > > syntax structures. Each byte stream NAL unit syntax structure contains
> > > > one start code prefix followed by one nal_unit( NumBytesInNALunit )
> > > > syntax structure. It may (and under some circumstances, it shall) also
> > > > contain an additional zero_byte syntax element. It may also contain one
> > > > or more additional trailing_zero_8bits syntax elements. When it is the
> > > > first byte stream NAL unit in the bitstream, it may also contain one or
> > > > more additional leading_zero_8bits syntax elements.
> > > > "
> > > >   
> > 
> > Right. I wonder what the "may or shall" part is really specifying.
> > 
> > However, note that the table B.1.1 and its comments B.1.2 is might
> > be interpreted differently. To me, there's a difference between the different
> > syntax elements (zero-bytes elements vs. the start code prefix element).
> > 
> > This is what it says about the zero_byte syntax element:
> > 
> > """
> > zero_byte is a single byte equal to 0x00.
> > When any of the following conditions are fulfilled, the zero_byte syntax element shall be present.
> > – the nal_unit_type within the nal_unit( ) is equal to 7 (sequence parameter set) or 8 (picture parameter set)
> > – the byte stream NAL unit syntax structure contains the first NAL unit of an access unit in decoding order, as
> > specified by subclause 7.4.1.2.3.
> > """
> > 
> > We are not dealing with SPS or PPS here, but we are discussing multislice content,
> > so IIUC this syntax element would be part of our bitstream pixfmt.
> > 
> > And this is what it says about the start code prefix:
> > 
> > """
> > start_code_prefix_one_3bytes is a fixed-value sequence of 3 bytes equal to 0x000001. This syntax element is called a
> > start code prefix.
> > """
> > 
> > These elements are used in such a way that it might seem
> > you have two start codes options 3-byte or 4-byte, though.
> 
> This is correct, but I was actually referring to:
> 
> "
> It may also contain one or more additional trailing_zero_8bits syntax
> elements. When it is the first byte stream NAL unit in the
> bitstream, it may also contain one or more additional
> leading_zero_8bits syntax elements.
> "
> 
> which would allow userspace to put additional zeros at the beginning
> in order to fulfill the HW alignment constraints. I'm not saying this is
> a good solution, just saying it can be done.
> 
> 
> > > > > > I guess it's not such a high price to pay for a unified codec interface :)    
> > > > 
> > > > If by unified you mean exposing only one pixel format, then yes, it's
> > > > unified. Doesn't make it easier to deal with from the userspace
> > > > perspective IMHO.
> > > > 
> > > > To sum-up, I'm fine keeping one pixel format, but I'm no longer sure
> > > > not exposing the NAL header type is a good option. We've seen that
> > > > providing alignment guarantees for HW expecting raw bitstream (without
> > > > the start code) might become challenging at some point. So I'd opt for
> > > > making this selection explicit. After all, it's just an extra control
> > > > to set from userspace, and 2 extra switch-case: one to select the most
> > > > appropriate NAL header type, and another one to fill the buffer with
> > > > the appropriate header (if there's one).  
> > 
> > I must admit I'm confused by what you mean about NAL header type, I thought
> > we weren't trying to support AVC, and only the Annex B bitstream format.
> 
> I'm not trying to support AVC headers, but designing something that
> allows us to extend the interface and support that case (if we ever need
> to) is a good thing IMO.
> 
> >  
> > That said, I don't see the interface getting any simpler with a unified
> > pixfmt, given the menu control to expose frame or slice decoding.
> 
> I agree. I think it's pretty much the same complexity anyway
> ('additional ctrl to set the start-code/header/preamble type' vs
> 'additional pixfmt'), so it's mostly a matter of taste.

The reason why I am strongly in favor of a single format is that the combinatory
possibilities of all the little things that can be different would eventually
lead us to having one pixel format per hardware implementation and I believe it
makes no sense. I really don't find it to be an elegant or scalable way to
expose the differences between decoder implementations.

So the approach I currently like best is to have as much information as possible
passed to the driver, both as offsets to each relevant part of the data in codec
controls and through specific controls (or maybe pixfmt flags could be relevant
in some cases) that reflect hardware-specific constraints.

Apparently this is also the approach on stateful codecs (e.g. the patch exposing
whether boundaries can be detected by the hardware or not, without adding a new
pixfmt for each case).

> > 
> > Proper applications need to support both modes anyway, where in the latter
> > it'll have to parse the bitstream to extract the slices.
> 
> Hm, the current uAPI forces us to pass slice offsets, which means we
> have to parse the bitstream anyway. I think we should keep it like that
> because I don't think we can assume the HW is smart enough to detect
> slice boundaries.

Agreed.

Cheers,

Paul

> > What's so bad
> > about just supporting an extra pixel format, where the slices are stripped
> > of its start codes and zero-byte elements?
> 
> I'm not opposed to that solution, but Paul is, so I'm just trying to
> find something that makes everyone happy, hence the "NAL header
> type" (or "start code type"/"preamble type" if you prefer, though
> it's not really a start code for AVC) proposal.
> 
> > 
> > And how come this is any more complex than exposing alignment constraints,
> > so that applications can artificially add leading_zero_8bits or trailing_zero_8bits
> > elements to comply with a driver dma alignment. To be honest, the more I think
> > about it, the more this option sounds just horrible :-)
> 
> Also think this option is more complicated and less future proof
> (AFAICT, AVC headers/start-codes can't be extended like Annex B ones).
> 
> > 
> > To me, it's far simpler to just expose what the devices support. If a driver
> > will expect to parse the bitstream, and accepts multi-slice content,
> > we expose that as a bitstream pixfmt.
> 
> Those 2 problems are orthogonal. You could have HW dealing with
> multi-slice content while still requiring things to be passed without
> Annex B start codes. The H264 pixfmt is really just about NAL headers:
> No NAL headers vs Annex B headers vs AVC headers.
> 
> > And if another driver expects
> > no-start-code slices, then we have another pixfmt.
> 
> Yep, but I don't want to argue endlessly on this, and I'd be fine with
> the "NAL header/preamble/start-code/whatever type ctrl" too.

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format
  2019-07-29 13:25                       ` Paul Kocialkowski
@ 2019-07-29 14:19                         ` Boris Brezillon
  0 siblings, 0 replies; 29+ messages in thread
From: Boris Brezillon @ 2019-07-29 14:19 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: Hans Verkuil, Ezequiel Garcia, Hans Verkuil, Tomasz Figa,
	Nicolas Dufresne, Mauro Carvalho Chehab, Laurent Pinchart,
	Sakari Ailus, linux-media, kernel, Maxime Ripard, Jonas Karlman,
	Jernej Skrabec, Alexandre Courbot, Thierry Reding

On Mon, 29 Jul 2019 15:25:21 +0200
Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:

> Hi,
> 
> On Sat 27 Jul 19, 11:46, Boris Brezillon wrote:
> > On Sat, 27 Jul 2019 11:27:43 +0200
> > Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> >   
> > > Hi,
> > > 
> > > On Fri 26 Jul 19, 10:53, Hans Verkuil wrote:  
> > > > On 7/26/19 9:30 AM, Boris Brezillon wrote:    
> > > > > On Fri, 26 Jul 2019 08:28:28 +0200
> > > > > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> > > > >     
> > > > >> On Thu, 25 Jul 2019 23:39:11 -0300
> > > > >> Ezequiel Garcia <ezequiel@collabora.com> wrote:
> > > > >>    
> > > > >>> On Thu, 2019-07-25 at 21:36 +0200, Paul Kocialkowski wrote:      
> > > > >>>> Having a control that specifies an alignment constraint for the slice beginning
> > > > >>>> could work (as long as we make it optional, although userspace should be
> > > > >>>> required to abide by it when it is present).      
> > > > >>
> > > > >> By making that, you put the burden on both sides of the stack:
> > > > >>
> > > > >> - the kernel side will have to deal with the unaligned cases (using a
> > > > >>   bounce buffer)
> > > > >> - userspace apps/libs that want to avoid an extra copy will have to
> > > > >>   check this constraint and align things properly anyway    
> > > > > 
> > > > > I'd like to revise my statement. Ideally, the drivers should take care
> > > > > of such mis-alignments or unsupported NAL header types by
> > > > > copying/re-formatting the OUTPUT buffer so that existing apps work
> > > > > out of the box when the driver is added, which means we'll have to take
> > > > > care of that kernel-side anyway. Handling selection of the best
> > > > > encoding-mode/NAL-header-type in userspace is useful if one wants to
> > > > > improve perfs.    
> > > > 
> > > > Just my 5 cents:
> > > > 
> > > > You very much want to avoid the situation where drivers have to copy or
> > > > reformat the OUTPUT buffer. That's asking for problems, not to mention
> > > > that it is no longer zero-copy.    
> > > 
> > > I definitely agree on that, since such constraints are likely to exist, we are
> > > certainly better off exposing them to userspace.
> > > 
> > > I understand that it does add some complexity and asks for userspace code to be
> > > more complex, but let's be realistic: this is a complex topic with lots of
> > > hardware-specific details getting in the way. I don't think we can act as if
> > > things were simpler.
> > > 
> > > My feeling is that we should keep trying to find "as elegant as possible" ways
> > > to expose constraints instead of putting strict and easy definitions for
> > > userspace that end up making drivers perform sub-optimally.
> > > 
> > > Since the initial cedrus proposal, we have covered more ground to allow the
> > > API to fit the rockchip case, without conflicting with cedrus. We're now facing
> > > new constraints and issue and I really think we should keep trying to integrate
> > > them in the unified API.
> > >   
> > > > >> Plus, the alignment thing won't work for AVC headers, so I think we
> > > > >> should actually have a control to select the NAL header type rather
> > > > >> than expose some alignment constraints (or have one pix fmt per NAL
> > > > >> header type, but you don't seem to like the idea, so I'm trying
> > > > >> to find something else :-)).
> > > > >>
> > > > >> And if we go for this option (control to select the NAL header type),
> > > > >> I'm wondering why we're not making that NAL-header type selection
> > > > >> mandatory from the start. We don't have to support all NAL headers at
> > > > >> first (can be Annex B only), but, by making this control selection
> > > > >> non-optional, we'll at least give a decent feedback to userspace
> > > > >> (setting NAL header control fails because the selected NAL header type
> > > > >> is not supported by the HW) instead of returning an error on the
> > > > >> decoding operation (which, depending on how verbose the driver is, can
> > > > >> be quite hard to figure out).    
> > > > 
> > > > This sounds reasonable.
> > > > 
> > > > This control should be mandatory, and it should be referred to from
> > > > the H264/5 pixelformat definitions (see also https://patchwork.linuxtv.org/patch/57709/).    
> > > 
> > > I am growing confused about one thing: are we talking about selecting
> > > the type of *start code* (which can have a variable number of heading and
> > > trailing zeros depending on the situation) or about the *NAL header type*, which
> > > follows the start code?  
> > 
> > We're talking about start codes, but Nicolas called them nal_header in
> > this email [1], so I thought it was the appropriate naming.  
> 
> Okay, the representation I had in mind was:
> [zeros][start code][nal unit header][zeros][nal unit data]
> 
> but maybe I'm mixing things up on my side.
> 
> > > I like the idea of drivers providing what types of start codes they can support,
> > > but I don't really see how it helps regarding the alignment constraints and how
> > > it relates to the zero-padding.  
> > 
> > It does help with alignment constraints because buffers allocated by
> > the driver are usually matching the HW alignment constraints and by
> > passing the type of NAL header (or start code if you prefer) we now
> > guarantee that the raw bitstream (when in NO_NAL_HEADER is selected) is
> > placed at the beginning of the buffer.  
> 
> I thought the issue at hand was that we needed the nal unit header to start at
> an aligned address

That's exactly the problem I'm trying to solve, and having a solution
that allows us to pass data starting at [nal unit header] when the
HW is not able to parse Annex B start codes solves that.

> while still needing a start code of 3 bytes.

Why do we need a start code if the HW doesn't care about it?

> I feel like I'm
> missing something in my understanding of the issue here.
> 
> > Doesn't solve the case of
> > imported buffers, but that problem is orthogonal I think (it's a
> > problem we already have right now, and would indeed require some way
> > to expose HW alignment constraints).  
> 
> If we expose the constraints explicitly, then we can honestly say that it's up
> to user-space to abide by them so there should be no particular difference
> between imported and allocated buffers. Userspace just has to know what it's
> doing. And drivers chould refuse imports that don't follow the reported
> constraints (or are outside the pool dedicated to the VPU).

Again, this problem is orthogonal to the problem I'm trying to solve.
I'm not saying it shouldn't be addressed at some point, but that's a
completely different topic.

> 
> > Not sure what the zero padding issue is. If you know the type of start
> > code, you don't have add extra 0 at the beginning to meet the alignment
> > constraints. If you're talking about padding bytes added at the end of
> > the bitstream (there's such a constraint on the rkvdec block), I think
> > that's something driver specific and should be handled by the driver.  
> 
> My point would rather be that it is (as far as I understood) valid regarding the
> H.264 spec to have extra zeros added by userspace (whatever the reason),

AFAICT, it's only valid when the bitstream is wrapped with an Annex B
format. Not sure other wrappers are skipping leading/trailing 0-bytes.

> so even
> if the type of start code is reported, it doesn't imply that we know the length
> of the heading zeros and start code, so the issue remains.

Hm, in the NO_START_CODE case we expect userspace to remove those
leading/trailing 0-bytes as well as the 000001 start code pattern.
That means we no longer have to specify an offset, and the payload is
guaranteed to be aligned as the HW expects.



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2019-07-29 14:19 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-03 12:28 [PATCH v3 0/3] media: uapi: h264: First batch of adjusments Boris Brezillon
2019-07-03 12:28 ` [PATCH v3 1/3] media: uapi: h264: Clarify our expectations regarding NAL header format Boris Brezillon
2019-07-05 16:40   ` Ezequiel Garcia
2019-07-05 17:16     ` Boris Brezillon
2019-07-25  6:42       ` Boris Brezillon
2019-07-25 19:36         ` Paul Kocialkowski
2019-07-26  2:39           ` Ezequiel Garcia
2019-07-26  6:28             ` Boris Brezillon
2019-07-26  7:30               ` Boris Brezillon
2019-07-26  8:53                 ` Hans Verkuil
2019-07-27  9:27                   ` Paul Kocialkowski
2019-07-27  9:46                     ` Boris Brezillon
2019-07-29 13:25                       ` Paul Kocialkowski
2019-07-29 14:19                         ` Boris Brezillon
2019-07-27 12:52                 ` Ezequiel Garcia
2019-07-27 13:49                   ` Boris Brezillon
2019-07-29 13:33                     ` Paul Kocialkowski
2019-07-25 19:26   ` Paul Kocialkowski
2019-07-03 12:28 ` [PATCH v3 2/3] media: uapi: h264: Add the concept of decoding mode Boris Brezillon
2019-07-22 15:29   ` Hans Verkuil
2019-07-22 17:54     ` Boris Brezillon
2019-07-22 19:00       ` Hans Verkuil
2019-07-25 19:20   ` Paul Kocialkowski
2019-07-03 12:28 ` [PATCH v3 3/3] media: uapi: h264: Get rid of the p0/b0/b1 ref-lists Boris Brezillon
2019-07-03 17:18   ` Nicolas Dufresne
2019-07-05 15:24     ` Ezequiel Garcia
2019-07-24  3:39   ` Tomasz Figa
2019-07-24  5:46     ` Boris Brezillon
2019-07-25 19:38       ` Paul Kocialkowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).