From: Tomasz Figa <tfiga@chromium.org>
To: Nicolas Dufresne <nicolas@ndufresne.ca>
Cc: Hans Verkuil <hverkuil@xs4all.nl>,
	Linux Media Mailing List <linux-media@vger.kernel.org>,
	Dave Stevenson <dave.stevenson@raspberrypi.org>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	Paul Kocialkowski <paul.kocialkowski@bootlin.com>,
	Stanimir Varbanov <stanimir.varbanov@linaro.org>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Ezequiel Garcia <ezequiel@collabora.com>,
	Michael Tretter <m.tretter@pengutronix.de>,
	Sylwester Nawrocki <snawrocki@kernel.org>
Subject: Re: [RFC] Stateful codecs and requirements for compressed formats
Date: Wed, 3 Jul 2019 17:46:30 +0900
Message-ID: <CAAFQd5CW5=WUkGdv+=TiAM-x5zRFNrDFYVDfzf+En6xh6XUiMA@mail.gmail.com>
In-Reply-To: <5b1362779132c1a47c26cd5080d5eb9920e72db3.camel@ndufresne.ca>

On Sat, Jun 29, 2019 at 3:09 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> On Friday, 28 June 2019 at 16:34 +0200, Hans Verkuil wrote:
> > Hi all,
> >
> > I hope I Cc-ed everyone with a stake in this issue.
> >
> > One recurring question is how a stateful encoder fills buffers and how a stateful
> > decoder consumes buffers.
> >
> > The most generic case is that an encoder produces a bitstream and just fills each
> > CAPTURE buffer to the brim before continuing with the next buffer.
> >
> > I don't think there are drivers that do this, I believe that all drivers just
> > output a single compressed frame. For interlaced formats I understand it is either
> > one compressed field per buffer, or two compressed fields per buffer (this is
> > what I heard, I don't know if this is true).
> >
> > In any case, I don't think this is specified anywhere. Please correct me if I am
> > wrong.
> >
> > The latest stateful codec spec is here:
> >
> > https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-mem2mem.html
> >
> > Assuming what I described above is indeed the case, then I think this should
> > be documented. I don't know enough if a flag is needed somewhere to describe
> > the behavior for interlaced formats, or can we leave this open and have userspace
> > detect this?
> >
> >
> > For decoders it is more complicated. The stateful decoder spec is written with
> > the assumption that userspace can just fill each OUTPUT buffer to the brim with
> > the compressed bitstream. I.e., no need to split at frame or other boundaries.
> >
> > See section 4.5.1.7 in the spec.
> >
> > But I understand that various HW decoders *do* have limitations. I would really
> > like to know about those, since that needs to be exposed to userspace somehow.
> >
> > Specifically, the venus decoder needs to know the resolution of the coded video
> > beforehand and it expects a single frame per buffer (how does that work for
> > interlaced formats?).
> >
> > Such requirements mean that some userspace parsing is still required, so these
> > decoders are not completely stateful.
> >
> > Can every codec author give information about their decoder/encoder?
> >
> > I'll start off with my virtual codec driver:
> >
> > vicodec: the decoder fully parses the bitstream. The encoder produces a single
> > compressed frame per buffer. This driver doesn't yet support interlaced formats,
> > but when that is added it will encode one field per buffer.
> >
> > Let's see what the results are.
>
> Hans thought a summary of what existing userspace expects / assumes
> would be nice.
>
> GStreamer:
> ==========
> Encodes:
>   fwht, h263, h264, hevc, jpeg, mpeg4, vp8, vp9
> Decodes:
>   fwht, h263, h264, hevc, jpeg, mpeg2, mpeg4, vc1, vp8, vp9
>
> It assumes that each encoded v4l2_buffer contains exactly one frame
> (any format, two fields for interlaced content). It may still work
> otherwise, but some issues will appear: timestamp shift, loss of
> metadata (e.g. timecode, cc, etc.).
>
> FFmpeg:
> =======
> Encodes:
>   h263, h264, hevc, mpeg4, vp8
> Decodes:
>   h263, h264, hevc, mpeg2, mpeg4, vc1, vp8, vp9
>
> Similarly to GStreamer, it assumes that one AVPacket will fit one
> v4l2_buffer. On the encoding side, it seems less of a problem, but they
> don't fully implement the FFmpeg codec API for frame matching, which I
> suspect would create some ambiguity if it was.
>
> Chromium:
> =========
> Decodes:
>   H264, VP8, VP9
> Encodes:
>   H264

VP8 too. It can in theory handle any format V4L2 could expose, but
these two seem to be the only codecs commonly used in practice and
supported by hardware.

>
> That is the code I know the least, but the encoder does not seem
> affected by the NAL alignment. The keyframe flag and timestamps seem
> to be used and are likely expected to correlate with the input, so I
> suspect that there exists some possible ambiguity if the output is not
> full frame. For the decoder, I'll have to ask someone else to comment,
> the code is hard to follow and I could not get to the place where
> output buffers are filled. I thought the GStreamer code was tough, but
> this is quite similarly a mess.

Not sure what's so complicated there. There is a clearly isolated
function that does the parsing:

https://cs.chromium.org/chromium/src/media/gpu/v4l2/v4l2_video_decode_accelerator.cc?rcl=2880fe4f6b246809f1be72c5a5698dced4cd85d1&l=984

It puts special NALUs like SPS and PPS in separate buffers, and for
frames it's 1 frame (all slices of the frame) : 1 buffer.

Best regards,
Tomasz
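
The buffer layout described in the reply (SPS and PPS each in their own
buffer, all slices of one frame together in a single buffer) can be
illustrated with a small sketch. This is not Chromium's code. The names
split_nal_units and group_into_buffers are hypothetical, the input is
assumed to be an H.264 Annex B byte stream, and a new frame is detected
with a shortcut: a slice whose first_mb_in_slice is 0, which shows up as
the top bit of the first slice-header byte being set. Other non-slice
NAL units (SEI, AUD) are simply kept with the frame that follows them.

```python
from typing import Iterator, List

START_CODE_3 = b"\x00\x00\x01"
START_CODE_4 = b"\x00\x00\x00\x01"

NAL_SLICE = 1      # coded slice of a non-IDR picture
NAL_IDR_SLICE = 5  # coded slice of an IDR picture
NAL_SPS = 7        # sequence parameter set
NAL_PPS = 8        # picture parameter set


def split_nal_units(stream: bytes) -> Iterator[bytes]:
    """Yield NAL units, without start codes, from an Annex B byte stream."""
    pos = stream.find(START_CODE_3)
    while pos != -1:
        begin = pos + len(START_CODE_3)
        nxt = stream.find(START_CODE_3, begin)
        end = len(stream) if nxt == -1 else nxt
        # Trailing zero bytes are padding or the leading zero of a
        # 4-byte start code; a NAL unit itself never ends in 0x00.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            yield nal
        pos = nxt


def group_into_buffers(stream: bytes) -> List[bytes]:
    """Group NAL units into per-buffer payloads (illustrative policy only)."""
    buffers: List[bytes] = []
    pending = b""       # NAL units collected for the current frame
    has_slice = False   # True once pending holds at least one slice

    for nal in split_nal_units(stream):
        nal_type = nal[0] & 0x1F
        is_slice = nal_type in (NAL_SLICE, NAL_IDR_SLICE)
        # Assumption: first_mb_in_slice == 0 marks the first slice of a
        # picture. As ue(v) it is coded as a single '1' bit.
        starts_picture = is_slice and len(nal) > 1 and bool(nal[1] & 0x80)

        # Close the current frame when a new picture or a non-slice NAL arrives.
        if has_slice and (not is_slice or starts_picture):
            buffers.append(pending)
            pending = b""
            has_slice = False

        if nal_type in (NAL_SPS, NAL_PPS):
            # Parameter sets each get a buffer of their own.
            buffers.append(START_CODE_4 + nal)
            continue

        pending += START_CODE_4 + nal
        has_slice = has_slice or is_slice

    if pending:
        buffers.append(pending)
    return buffers
```

Each returned element would correspond to the payload of one OUTPUT
buffer queued to the decoder. A real parser has to handle more cases
(redundant pictures, arbitrary slice order, other codecs), which is
why this stays a sketch of the policy rather than of any driver's or
application's actual requirements.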