[RFP] Which V4L2 ioctls could be replaced by better versions?

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFP] Which V4L2 ioctls could be replaced by better versions?
@ 2018-09-20 14:42 Hans Verkuil
  2018-09-20 18:14 ` Nicolas Dufresne
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Hans Verkuil @ 2018-09-20 14:42 UTC (permalink / raw)
  To: Linux Media Mailing List; +Cc: Laurent Pinchart

Some parts of the V4L2 API are awkward to use and I think it would be
a good idea to look at possible candidates for that.

Examples are the ioctls that use struct v4l2_buffer: the multiplanar support is
really horrible, and writing code to support both single and multiplanar is hard.
We are also running out of fields and the timeval isn't y2038 compliant.

A proof-of-concept is here:

https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95549df06d9900f3559afdbb9da06bd4b22d1f3

It's a bit old, but it gives a good impression of what I have in mind.

Another candidate is VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS:
expressing frame intervals as a fraction is really awkward and so is the fact
that the subdev and 'normal' ioctls are not the same.

Would using nanoseconds or something along those lines for intervals be better?

I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it should
be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise support, I
think.

Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
order to improve single vs multiplanar handling.

It is not the intention to come to a full design, it's more to test the waters
so to speak.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-09-20 14:42 [RFP] Which V4L2 ioctls could be replaced by better versions? Hans Verkuil
@ 2018-09-20 18:14 ` Nicolas Dufresne
  2018-10-03  8:24   ` Tomasz Figa
  2018-10-21 16:17   ` Philipp Zabel
  2018-10-17  8:57 ` Laurent Pinchart
  2018-10-26 11:41 ` Tomasz Figa
  2 siblings, 2 replies; 26+ messages in thread
From: Nicolas Dufresne @ 2018-09-20 18:14 UTC (permalink / raw)
  To: Hans Verkuil, Linux Media Mailing List; +Cc: Laurent Pinchart

[-- Attachment #1: Type: text/plain, Size: 3095 bytes --]

Le jeudi 20 septembre 2018 à 16:42 +0200, Hans Verkuil a écrit :
> Some parts of the V4L2 API are awkward to use and I think it would be
> a good idea to look at possible candidates for that.
> 
> Examples are the ioctls that use struct v4l2_buffer: the multiplanar support is
> really horrible, and writing code to support both single and multiplanar is hard.
> We are also running out of fields and the timeval isn't y2038 compliant.
> 
> A proof-of-concept is here:
> 
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95549df06d9900f3559afdbb9da06bd4b22d1f3
> 
> It's a bit old, but it gives a good impression of what I have in mind.
> 
> Another candidate is VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS:
> expressing frame intervals as a fraction is really awkward and so is the fact
> that the subdev and 'normal' ioctls are not the same.
> 
> Would using nanoseconds or something along those lines for intervals be better?

This one is not a good idea, because you cannot represent well known
rates used a lot in the field. Like 60000/1001 also known as 59.94 Hz.
You could endup with drift issues.

For me, what is the most difficult with this API is the fact that it
uses frame internal (duration) instead of frame rate.

> 
> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
> stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it should
> be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise support, I
> think.

One of the thing to fix, maybe it's doable now, is the differentiation
between allocation size and display size. Pretty much all video capture
code assumes this is display size and ignores the selection API. This
should be documented explicitly.

In fact, the display/allocation dimension isn't very nice, as both
information overlaps in structures. As an example, you call S_FMT with
a display size you want, and it will return you an allocation size
(which yes, can be smaller, because we always round to the middle).

> 
> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
> order to improve single vs multiplanar handling.

Yes, but that would fall in a complete redesign I guess. The buffer
allocation scheme is very inflexible. You can't have buffers of two
dimensions allocated at the same time for the same queue. Worst, you
cannot leave even 1 buffer as your scannout buffer while reallocating
new buffers, this is not permitted by the framework (in software). As a
side effect, there is no way to optimize the resolution changes, you
even have to copy your scannout buffer on the CPU, to free it in order
to proceed. Resolution changes are thus painfully slow, by design.

You also cannot switch from internal buffers to importing buffers
easily (in some case you, like encoder, you cannot do that without
flushing the encoder state).

> 
> It is not the intention to come to a full design, it's more to test the waters
> so to speak.
> 
> Regards,
> 
> 	Hans

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-09-20 18:14 ` Nicolas Dufresne
@ 2018-10-03  8:24   ` Tomasz Figa
  2018-10-16 13:22     ` Hans Verkuil
  2018-10-21 16:28     ` Philipp Zabel
  2018-10-21 16:17   ` Philipp Zabel
  1 sibling, 2 replies; 26+ messages in thread
From: Tomasz Figa @ 2018-10-03  8:24 UTC (permalink / raw)
  To: nicolas; +Cc: Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

On Fri, Sep 21, 2018 at 3:14 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le jeudi 20 septembre 2018 à 16:42 +0200, Hans Verkuil a écrit :
> > Some parts of the V4L2 API are awkward to use and I think it would be
> > a good idea to look at possible candidates for that.
> >
> > Examples are the ioctls that use struct v4l2_buffer: the multiplanar support is
> > really horrible, and writing code to support both single and multiplanar is hard.
> > We are also running out of fields and the timeval isn't y2038 compliant.
> >
> > A proof-of-concept is here:
> >
> > https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95549df06d9900f3559afdbb9da06bd4b22d1f3
> >
> > It's a bit old, but it gives a good impression of what I have in mind.
> >
> > Another candidate is VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS:
> > expressing frame intervals as a fraction is really awkward and so is the fact
> > that the subdev and 'normal' ioctls are not the same.
> >
> > Would using nanoseconds or something along those lines for intervals be better?
>
> This one is not a good idea, because you cannot represent well known
> rates used a lot in the field. Like 60000/1001 also known as 59.94 Hz.
> You could endup with drift issues.
>
> For me, what is the most difficult with this API is the fact that it
> uses frame internal (duration) instead of frame rate.
>
> >
> > I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
> > stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it should
> > be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise support, I
> > think.
>
> One of the thing to fix, maybe it's doable now, is the differentiation
> between allocation size and display size. Pretty much all video capture
> code assumes this is display size and ignores the selection API. This
> should be documented explicitly.

Technically, there is already a distinction between allocation and
display size at format level - width and height could be interpreted
as the rectangle containing the captured video, while bytesperline and
sizeimage would match to the allocation size. The behavior between
drivers seems to be extremely variable - some would just enforce
bytesperline = bpp*width and sizeimage = bytesperline*height, while
others would actually give control over them to the user space.

It's a bit more complicated with video decoders, because the stream,
in addition to the 2 sizes mentioned before, is also characterized by
coded size, which corresponds to the actual number of pixels encoded
in the stream, even if not all contain pixels to be displayed. This is
something that needs to be specified explicitly (and is, in my
documentation patches) in the Codec Interface.

>
> In fact, the display/allocation dimension isn't very nice, as both
> information overlaps in structures. As an example, you call S_FMT with
> a display size you want, and it will return you an allocation size
> (which yes, can be smaller, because we always round to the middle).
>
> >
> > Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
> > order to improve single vs multiplanar handling.
>
> Yes, but that would fall in a complete redesign I guess. The buffer
> allocation scheme is very inflexible. You can't have buffers of two
> dimensions allocated at the same time for the same queue. Worst, you
> cannot leave even 1 buffer as your scannout buffer while reallocating
> new buffers, this is not permitted by the framework (in software). As a
> side effect, there is no way to optimize the resolution changes, you
> even have to copy your scannout buffer on the CPU, to free it in order
> to proceed. Resolution changes are thus painfully slow, by design.

Hmm? There is VIDIOC_CREATEBUFS, which can allows you to allocate
buffers for explicitly specified format, with other buffers already
existing in the queue.

Also, I fail to understand the scanout issue. If one exports a vb2
buffer to a DMA-buf and import it to the scanout engine, it can keep
scanning out from it as long as it want, because the DMA-buf will hold
a reference on the buffer, even if it's removed from the vb2 queue.

>
> You also cannot switch from internal buffers to importing buffers
> easily (in some case you, like encoder, you cannot do that without
> flushing the encoder state).

Hmm, technically VIDIOC_CREATEBUFS accepts the "memory" type value,
but I'm not sure what happens if the queue already has buffers
requested with different one.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-03  8:24   ` Tomasz Figa
@ 2018-10-16 13:22     ` Hans Verkuil
  2018-10-21 16:28     ` Philipp Zabel
  1 sibling, 0 replies; 26+ messages in thread
From: Hans Verkuil @ 2018-10-16 13:22 UTC (permalink / raw)
  To: Tomasz Figa, nicolas; +Cc: Linux Media Mailing List, Laurent Pinchart

On 10/03/18 10:24, Tomasz Figa wrote:
> On Fri, Sep 21, 2018 at 3:14 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>
>> Le jeudi 20 septembre 2018 à 16:42 +0200, Hans Verkuil a écrit :
>>> Some parts of the V4L2 API are awkward to use and I think it would be
>>> a good idea to look at possible candidates for that.
>>>
>>> Examples are the ioctls that use struct v4l2_buffer: the multiplanar support is
>>> really horrible, and writing code to support both single and multiplanar is hard.
>>> We are also running out of fields and the timeval isn't y2038 compliant.
>>>
>>> A proof-of-concept is here:
>>>
>>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95549df06d9900f3559afdbb9da06bd4b22d1f3
>>>
>>> It's a bit old, but it gives a good impression of what I have in mind.
>>>
>>> Another candidate is VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS:
>>> expressing frame intervals as a fraction is really awkward and so is the fact
>>> that the subdev and 'normal' ioctls are not the same.
>>>
>>> Would using nanoseconds or something along those lines for intervals be better?
>>
>> This one is not a good idea, because you cannot represent well known
>> rates used a lot in the field. Like 60000/1001 also known as 59.94 Hz.
>> You could endup with drift issues.
>>
>> For me, what is the most difficult with this API is the fact that it
>> uses frame internal (duration) instead of frame rate.
>>
>>>
>>> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
>>> stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it should
>>> be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise support, I
>>> think.
>>
>> One of the thing to fix, maybe it's doable now, is the differentiation
>> between allocation size and display size. Pretty much all video capture
>> code assumes this is display size and ignores the selection API. This
>> should be documented explicitly.
> 
> Technically, there is already a distinction between allocation and
> display size at format level - width and height could be interpreted
> as the rectangle containing the captured video, while bytesperline and
> sizeimage would match to the allocation size. The behavior between
> drivers seems to be extremely variable - some would just enforce
> bytesperline = bpp*width and sizeimage = bytesperline*height, while
> others would actually give control over them to the user space.
> 
> It's a bit more complicated with video decoders, because the stream,
> in addition to the 2 sizes mentioned before, is also characterized by
> coded size, which corresponds to the actual number of pixels encoded
> in the stream, even if not all contain pixels to be displayed. This is
> something that needs to be specified explicitly (and is, in my
> documentation patches) in the Codec Interface.
> 
>>
>> In fact, the display/allocation dimension isn't very nice, as both
>> information overlaps in structures. As an example, you call S_FMT with
>> a display size you want, and it will return you an allocation size
>> (which yes, can be smaller, because we always round to the middle).
>>
>>>
>>> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
>>> order to improve single vs multiplanar handling.
>>
>> Yes, but that would fall in a complete redesign I guess. The buffer
>> allocation scheme is very inflexible. You can't have buffers of two
>> dimensions allocated at the same time for the same queue. Worst, you
>> cannot leave even 1 buffer as your scannout buffer while reallocating
>> new buffers, this is not permitted by the framework (in software). As a
>> side effect, there is no way to optimize the resolution changes, you
>> even have to copy your scannout buffer on the CPU, to free it in order
>> to proceed. Resolution changes are thus painfully slow, by design.
> 
> Hmm? There is VIDIOC_CREATEBUFS, which can allows you to allocate
> buffers for explicitly specified format, with other buffers already
> existing in the queue.

Of course, we are missing the VIDIOC_DELETEBUFS ioctl. Also, CREATEBUFS
is rather awful. Using v4l2_format in the struct was a major mistake.

> 
> Also, I fail to understand the scanout issue. If one exports a vb2
> buffer to a DMA-buf and import it to the scanout engine, it can keep
> scanning out from it as long as it want, because the DMA-buf will hold
> a reference on the buffer, even if it's removed from the vb2 queue.
> 
>>
>> You also cannot switch from internal buffers to importing buffers
>> easily (in some case you, like encoder, you cannot do that without
>> flushing the encoder state).
> 
> Hmm, technically VIDIOC_CREATEBUFS accepts the "memory" type value,
> but I'm not sure what happens if the queue already has buffers
> requested with different one.

It is not allowed to mix memory types, that will return -EINVAL.

I have to say that this is the first time I had this request.

It is probably doable, but the use-case is not clear to me.

Regards,

	Hans

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-09-20 14:42 [RFP] Which V4L2 ioctls could be replaced by better versions? Hans Verkuil
  2018-09-20 18:14 ` Nicolas Dufresne
@ 2018-10-17  8:57 ` Laurent Pinchart
  2018-10-17  9:16   ` Hans Verkuil
  2018-10-26 11:41 ` Tomasz Figa
  2 siblings, 1 reply; 26+ messages in thread
From: Laurent Pinchart @ 2018-10-17  8:57 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Linux Media Mailing List

Hi Hans,

On Thursday, 20 September 2018 17:42:15 EEST Hans Verkuil wrote:
> Some parts of the V4L2 API are awkward to use and I think it would be
> a good idea to look at possible candidates for that.
> 
> Examples are the ioctls that use struct v4l2_buffer: the multiplanar support
> is really horrible, and writing code to support both single and multiplanar
> is hard. We are also running out of fields and the timeval isn't y2038
> compliant.
> 
> A proof-of-concept is here:
> 
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95
> 549df06d9900f3559afdbb9da06bd4b22d1f3
> 
> It's a bit old, but it gives a good impression of what I have in mind.
> 
> Another candidate is
> VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS: expressing
> frame intervals as a fraction is really awkward and so is the fact that the
> subdev and 'normal' ioctls are not the same.
> 
> Would using nanoseconds or something along those lines for intervals be
> better?
> 
> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
> stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it
> should be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise
> support, I think.

If we refresh the enumeration ioctls, I propose moving away from the one 
syscall per value model, and returning everything in one (userspace-allocated) 
buffer. The same could apply to all enumerations (such as controls for 
instance), even if we don't address them all in one go.

> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again
> in order to improve single vs multiplanar handling.

If we refresh the G/S/TRY format ioctls (and I think we should), I would 
propose moving to a G/S model with ACTIVE and TRY formats, as for subdevs. 
This should make it easier to construct full device states internally, in 
order to implement proper request API support for formats. We should then also 
document much better how formats and selection rectangles interact.

> It is not the intention to come to a full design, it's more to test the
> waters so to speak.

Another item that we're missing is a way to delete buffers (the counterpart of 
VIDIOC_CREATE_BUFS). As this will introduce holes in the buffer indices, we 
might also need to revamp VIDIOC_CREATE_BUFS (which would give us a chance to 
move away from the format structure passed to that ioctl).

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-17  8:57 ` Laurent Pinchart
@ 2018-10-17  9:16   ` Hans Verkuil
  2018-10-17 20:46     ` Laurent Pinchart
  0 siblings, 1 reply; 26+ messages in thread
From: Hans Verkuil @ 2018-10-17  9:16 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Linux Media Mailing List

On 10/17/2018 10:57 AM, Laurent Pinchart wrote:
> Hi Hans,
> 
> On Thursday, 20 September 2018 17:42:15 EEST Hans Verkuil wrote:
>> Some parts of the V4L2 API are awkward to use and I think it would be
>> a good idea to look at possible candidates for that.
>>
>> Examples are the ioctls that use struct v4l2_buffer: the multiplanar support
>> is really horrible, and writing code to support both single and multiplanar
>> is hard. We are also running out of fields and the timeval isn't y2038
>> compliant.
>>
>> A proof-of-concept is here:
>>
>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95
>> 549df06d9900f3559afdbb9da06bd4b22d1f3
>>
>> It's a bit old, but it gives a good impression of what I have in mind.
>>
>> Another candidate is
>> VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS: expressing
>> frame intervals as a fraction is really awkward and so is the fact that the
>> subdev and 'normal' ioctls are not the same.
>>
>> Would using nanoseconds or something along those lines for intervals be
>> better?
>>
>> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is no
>> stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But it
>> should be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with stepwise
>> support, I think.
> 
> If we refresh the enumeration ioctls, I propose moving away from the one 
> syscall per value model, and returning everything in one (userspace-allocated) 
> buffer. The same could apply to all enumerations (such as controls for 
> instance), even if we don't address them all in one go.

I'm not convinced about this, primarily because I think these enums are done
at configuration time, and rarely if ever while streaming. So does it really
make a difference in practice? Feedback on this would be welcome during the
summit meeting.

> 
>> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again
>> in order to improve single vs multiplanar handling.
> 
> If we refresh the G/S/TRY format ioctls (and I think we should), I would 
> propose moving to a G/S model with ACTIVE and TRY formats, as for subdevs. 
> This should make it easier to construct full device states internally, in 
> order to implement proper request API support for formats. We should then also 
> document much better how formats and selection rectangles interact.

Interesting. I was planning a slide for this.

>> It is not the intention to come to a full design, it's more to test the
>> waters so to speak.
> 
> Another item that we're missing is a way to delete buffers (the counterpart of 
> VIDIOC_CREATE_BUFS). As this will introduce holes in the buffer indices, we 
> might also need to revamp VIDIOC_CREATE_BUFS (which would give us a chance to 
> move away from the format structure passed to that ioctl).
> 

I'm just writing the slide for that :-)

Regards,

	Hans

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-17  9:16   ` Hans Verkuil
@ 2018-10-17 20:46     ` Laurent Pinchart
  2018-10-20 20:28       ` Sakari Ailus
  0 siblings, 1 reply; 26+ messages in thread
From: Laurent Pinchart @ 2018-10-17 20:46 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Linux Media Mailing List

Hi Hans,

On Wednesday, 17 October 2018 12:16:14 EEST Hans Verkuil wrote:
> On 10/17/2018 10:57 AM, Laurent Pinchart wrote:
> > On Thursday, 20 September 2018 17:42:15 EEST Hans Verkuil wrote:
> >> Some parts of the V4L2 API are awkward to use and I think it would be
> >> a good idea to look at possible candidates for that.
> >> 
> >> Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> >> support is really horrible, and writing code to support both single and
> >> multiplanar is hard. We are also running out of fields and the timeval
> >> isn't y2038 compliant.
> >> 
> >> A proof-of-concept is here:
> >> 
> >> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=
> >> a95 549df06d9900f3559afdbb9da06bd4b22d1f3
> >> 
> >> It's a bit old, but it gives a good impression of what I have in mind.
> >> 
> >> Another candidate is
> >> VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS: expressing
> >> frame intervals as a fraction is really awkward and so is the fact that
> >> the subdev and 'normal' ioctls are not the same.
> >> 
> >> Would using nanoseconds or something along those lines for intervals be
> >> better?
> >> 
> >> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is
> >> no stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But
> >> it should be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with
> >> stepwise support, I think.
> > 
> > If we refresh the enumeration ioctls, I propose moving away from the one
> > syscall per value model, and returning everything in one
> > (userspace-allocated) buffer. The same could apply to all enumerations
> > (such as controls for instance), even if we don't address them all in one
> > go.
> 
> I'm not convinced about this, primarily because I think these enums are done
> at configuration time, and rarely if ever while streaming. So does it
> really make a difference in practice? Feedback on this would be welcome
> during the summit meeting.

It's indeed not a hot path in most cases, but if you enumerate formats, frame 
sizes and frame intervals, you end up with three nested loops with lots of 
ioctl calls. This makes the code on the userspace side more complex than it 
should be, and incurs an overhead. If we rework the enumeration ioctls for 
other reasons, I believe we should fix this as wel.

> >> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps,
> >> again in order to improve single vs multiplanar handling.
> > 
> > If we refresh the G/S/TRY format ioctls (and I think we should), I would
> > propose moving to a G/S model with ACTIVE and TRY formats, as for subdevs.
> > This should make it easier to construct full device states internally, in
> > order to implement proper request API support for formats. We should then
> > also document much better how formats and selection rectangles interact.
> 
> Interesting. I was planning a slide for this.
> 
> >> It is not the intention to come to a full design, it's more to test the
> >> waters so to speak.
> > 
> > Another item that we're missing is a way to delete buffers (the
> > counterpart of VIDIOC_CREATE_BUFS). As this will introduce holes in the
> > buffer indices, we might also need to revamp VIDIOC_CREATE_BUFS (which
> > would give us a chance to move away from the format structure passed to
> > that ioctl).
> 
> I'm just writing the slide for that :-)

Thanks :-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-17 20:46     ` Laurent Pinchart
@ 2018-10-20 20:28       ` Sakari Ailus
  0 siblings, 0 replies; 26+ messages in thread
From: Sakari Ailus @ 2018-10-20 20:28 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Hans Verkuil, Linux Media Mailing List

On Wed, Oct 17, 2018 at 11:46:54PM +0300, Laurent Pinchart wrote:
> Hi Hans,
> 
> On Wednesday, 17 October 2018 12:16:14 EEST Hans Verkuil wrote:
> > On 10/17/2018 10:57 AM, Laurent Pinchart wrote:
> > > On Thursday, 20 September 2018 17:42:15 EEST Hans Verkuil wrote:
> > >> Some parts of the V4L2 API are awkward to use and I think it would be
> > >> a good idea to look at possible candidates for that.
> > >> 
> > >> Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> > >> support is really horrible, and writing code to support both single and
> > >> multiplanar is hard. We are also running out of fields and the timeval
> > >> isn't y2038 compliant.
> > >> 
> > >> A proof-of-concept is here:
> > >> 
> > >> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=
> > >> a95 549df06d9900f3559afdbb9da06bd4b22d1f3
> > >> 
> > >> It's a bit old, but it gives a good impression of what I have in mind.
> > >> 
> > >> Another candidate is
> > >> VIDIOC_SUBDEV_ENUM_FRAME_INTERVAL/VIDIOC_ENUM_FRAMEINTERVALS: expressing
> > >> frame intervals as a fraction is really awkward and so is the fact that
> > >> the subdev and 'normal' ioctls are not the same.
> > >> 
> > >> Would using nanoseconds or something along those lines for intervals be
> > >> better?
> > >> 
> > >> I have similar concerns with VIDIOC_SUBDEV_ENUM_FRAME_SIZE where there is
> > >> no stepwise option, making it different from VIDIOC_ENUM_FRAMESIZES. But
> > >> it should be possible to extend VIDIOC_SUBDEV_ENUM_FRAME_SIZE with
> > >> stepwise support, I think.
> > > 
> > > If we refresh the enumeration ioctls, I propose moving away from the one
> > > syscall per value model, and returning everything in one
> > > (userspace-allocated) buffer. The same could apply to all enumerations
> > > (such as controls for instance), even if we don't address them all in one
> > > go.
> > 
> > I'm not convinced about this, primarily because I think these enums are done
> > at configuration time, and rarely if ever while streaming. So does it
> > really make a difference in practice? Feedback on this would be welcome
> > during the summit meeting.
> 
> It's indeed not a hot path in most cases, but if you enumerate formats, frame 
> sizes and frame intervals, you end up with three nested loops with lots of 
> ioctl calls. This makes the code on the userspace side more complex than it 
> should be, and incurs an overhead. If we rework the enumeration ioctls for 
> other reasons, I believe we should fix this as wel.

Agreed; an interface that can tell you how many entries there are, and then
give those to you is more straightforward to use and more efficient. I
wouldn't change how things work just the sake of doing that, though; there
needs to be something else that needs to be changed and cannot be supported
using the existing API as well.

-- 
Sakari Ailus
e-mail: sakari.ailus@iki.fi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-09-20 18:14 ` Nicolas Dufresne
  2018-10-03  8:24   ` Tomasz Figa
@ 2018-10-21 16:17   ` Philipp Zabel
  1 sibling, 0 replies; 26+ messages in thread
From: Philipp Zabel @ 2018-10-21 16:17 UTC (permalink / raw)
  To: Nicolas Dufresne; +Cc: Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

On Thu, Sep 20, 2018 at 02:14:07PM -0400, Nicolas Dufresne wrote:
> > Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
> > order to improve single vs multiplanar handling.
> 
> Yes, but that would fall in a complete redesign I guess. The buffer
> allocation scheme is very inflexible. You can't have buffers of two
> dimensions allocated at the same time for the same queue. Worst, you
> cannot leave even 1 buffer as your scannout buffer while reallocating
> new buffers, this is not permitted by the framework (in software). As a
> side effect, there is no way to optimize the resolution changes, you
> even have to copy your scannout buffer on the CPU, to free it in order
> to proceed. Resolution changes are thus painfully slow, by design.

I've seen the same issue when exporting dmabufs from a V4L2 decoder and
importing them into OpenGL textures. Mesa caches state so aggressively,
even destroying all textures and flushing OpenGL is not enough to remove
all references to the imported resource. Only after another render step
the dmabuf fds are closed and thus make REQBUFS 0 possible on the
exporting capture queue.
This leads to a catch-22 situation during a resolution change, because
we'd already need the new buffers to do an OpenGL render without the old
buffers, so that the old buffers can be released back to V4L2, so that
V4L2 can allocate the new buffers...
It would be very helpful in this situation if exported dmabufs could
just be orphaned by REQBUFS 0.

regards
Philipp

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-03  8:24   ` Tomasz Figa
  2018-10-16 13:22     ` Hans Verkuil
@ 2018-10-21 16:28     ` Philipp Zabel
  2018-10-22  3:37       ` Tomasz Figa
  1 sibling, 1 reply; 26+ messages in thread
From: Philipp Zabel @ 2018-10-21 16:28 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: nicolas, Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

On Wed, Oct 03, 2018 at 05:24:39PM +0900, Tomasz Figa wrote:
[...]
> > Yes, but that would fall in a complete redesign I guess. The buffer
> > allocation scheme is very inflexible. You can't have buffers of two
> > dimensions allocated at the same time for the same queue. Worst, you
> > cannot leave even 1 buffer as your scannout buffer while reallocating
> > new buffers, this is not permitted by the framework (in software). As a
> > side effect, there is no way to optimize the resolution changes, you
> > even have to copy your scannout buffer on the CPU, to free it in order
> > to proceed. Resolution changes are thus painfully slow, by design.
[...]
> Also, I fail to understand the scanout issue. If one exports a vb2
> buffer to a DMA-buf and import it to the scanout engine, it can keep
> scanning out from it as long as it want, because the DMA-buf will hold
> a reference on the buffer, even if it's removed from the vb2 queue.

REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
attachments: vb2_buffer_in_use checks the num_users memop. The refcount
returned by num_users shared between the vmarea handler and dmabuf ops,
so any dmabuf attachment counts towards in_use.

regards
Philipp

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-21 16:28     ` Philipp Zabel
@ 2018-10-22  3:37       ` Tomasz Figa
  2018-10-22  9:49         ` Dave Stevenson
                           ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Tomasz Figa @ 2018-10-22  3:37 UTC (permalink / raw)
  To: pza; +Cc: nicolas, Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

Hi Philipp,

On Mon, Oct 22, 2018 at 1:28 AM Philipp Zabel <pza@pengutronix.de> wrote:
>
> On Wed, Oct 03, 2018 at 05:24:39PM +0900, Tomasz Figa wrote:
> [...]
> > > Yes, but that would fall in a complete redesign I guess. The buffer
> > > allocation scheme is very inflexible. You can't have buffers of two
> > > dimensions allocated at the same time for the same queue. Worst, you
> > > cannot leave even 1 buffer as your scannout buffer while reallocating
> > > new buffers, this is not permitted by the framework (in software). As a
> > > side effect, there is no way to optimize the resolution changes, you
> > > even have to copy your scannout buffer on the CPU, to free it in order
> > > to proceed. Resolution changes are thus painfully slow, by design.
> [...]
> > Also, I fail to understand the scanout issue. If one exports a vb2
> > buffer to a DMA-buf and import it to the scanout engine, it can keep
> > scanning out from it as long as it want, because the DMA-buf will hold
> > a reference on the buffer, even if it's removed from the vb2 queue.
>
> REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
> attachments: vb2_buffer_in_use checks the num_users memop. The refcount
> returned by num_users shared between the vmarea handler and dmabuf ops,
> so any dmabuf attachment counts towards in_use.

Ah, right. I've managed to completely forget about it, since we have a
downstream patch that we attempted to upstream earlier [1], but didn't
have a chance to follow up on the comments and there wasn't much
interest in it in general.

[1] https://lore.kernel.org/patchwork/patch/607853/

Perhaps it would be worth reviving?

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-22  3:37       ` Tomasz Figa
@ 2018-10-22  9:49         ` Dave Stevenson
  2018-10-23 13:02         ` Philipp Zabel
  2018-10-27  9:38         ` Nicolas Dufresne
  2 siblings, 0 replies; 26+ messages in thread
From: Dave Stevenson @ 2018-10-22  9:49 UTC (permalink / raw)
  To: Tomasz Figa; +Cc: pza, Nicolas Dufresne, Hans Verkuil, LMML, Laurent Pinchart

Hi Tomasz

On Mon, 22 Oct 2018 at 04:38, Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Philipp,
>
> On Mon, Oct 22, 2018 at 1:28 AM Philipp Zabel <pza@pengutronix.de> wrote:
> >
> > On Wed, Oct 03, 2018 at 05:24:39PM +0900, Tomasz Figa wrote:
> > [...]
> > > > Yes, but that would fall in a complete redesign I guess. The buffer
> > > > allocation scheme is very inflexible. You can't have buffers of two
> > > > dimensions allocated at the same time for the same queue. Worst, you
> > > > cannot leave even 1 buffer as your scannout buffer while reallocating
> > > > new buffers, this is not permitted by the framework (in software). As a
> > > > side effect, there is no way to optimize the resolution changes, you
> > > > even have to copy your scannout buffer on the CPU, to free it in order
> > > > to proceed. Resolution changes are thus painfully slow, by design.
> > [...]
> > > Also, I fail to understand the scanout issue. If one exports a vb2
> > > buffer to a DMA-buf and import it to the scanout engine, it can keep
> > > scanning out from it as long as it want, because the DMA-buf will hold
> > > a reference on the buffer, even if it's removed from the vb2 queue.
> >
> > REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
> > attachments: vb2_buffer_in_use checks the num_users memop. The refcount
> > returned by num_users shared between the vmarea handler and dmabuf ops,
> > so any dmabuf attachment counts towards in_use.
>
> Ah, right. I've managed to completely forget about it, since we have a
> downstream patch that we attempted to upstream earlier [1], but didn't
> have a chance to follow up on the comments and there wasn't much
> interest in it in general.
>
> [1] https://lore.kernel.org/patchwork/patch/607853/
>
> Perhaps it would be worth reviving?

There's been recent interest in this from the LibreElec folk as they
are wanting to shift to using V4L2 M2M for codecs on as many platforms
as possible. They'll use it wrapped by FFmpeg, and they're working on
patches to get FFmpeg exporting dmabufs to be consumed by DRM.

Hans had pointed me at your patch, and I've been using it to get
around the issue.
I did start writing the requested docs changes [1], but I'm afraid
upstreaming stuff has been a low priority for me recently. If that
patch suffices, then feel free to pick it up. Otherwise I'll do so
when I get some time (likely to be a fair number of weeks away).

  Dave

[1] https://github.com/6by9/linux/commit/09619d9427d9d44ae5e72e0e85e64cb3ea9727da

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-22  3:37       ` Tomasz Figa
  2018-10-22  9:49         ` Dave Stevenson
@ 2018-10-23 13:02         ` Philipp Zabel
  2018-10-27  9:38         ` Nicolas Dufresne
  2 siblings, 0 replies; 26+ messages in thread
From: Philipp Zabel @ 2018-10-23 13:02 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: nicolas, Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

Hi Tomasz,

On Mon, Oct 22, 2018 at 12:37:57PM +0900, Tomasz Figa wrote:
[...]
> On Mon, Oct 22, 2018 at 1:28 AM Philipp Zabel <pza@pengutronix.de> wrote:
[...]
> > REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
> > attachments: vb2_buffer_in_use checks the num_users memop. The refcount
> > returned by num_users shared between the vmarea handler and dmabuf ops,
> > so any dmabuf attachment counts towards in_use.
> 
> Ah, right. I've managed to completely forget about it, since we have a
> downstream patch that we attempted to upstream earlier [1], but didn't
> have a chance to follow up on the comments and there wasn't much
> interest in it in general.
> 
> [1] https://lore.kernel.org/patchwork/patch/607853/
> 
> Perhaps it would be worth reviving?

Yes, thanks for the pointer. I've completely missed that patch.

I was under the mistaken impression that there was some technical reason
to keep the queue around until after the last dmabuf attachment is gone,
but everything is properly refcounted.

regards
Philipp

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-09-20 14:42 [RFP] Which V4L2 ioctls could be replaced by better versions? Hans Verkuil
  2018-09-20 18:14 ` Nicolas Dufresne
  2018-10-17  8:57 ` Laurent Pinchart
@ 2018-10-26 11:41 ` Tomasz Figa
  2018-10-26 13:42   ` Laurent Pinchart
  2 siblings, 1 reply; 26+ messages in thread
From: Tomasz Figa @ 2018-10-26 11:41 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Linux Media Mailing List, Laurent Pinchart, Hirokazu Honda

On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> Some parts of the V4L2 API are awkward to use and I think it would be
> a good idea to look at possible candidates for that.
>
> Examples are the ioctls that use struct v4l2_buffer: the multiplanar support is
> really horrible, and writing code to support both single and multiplanar is hard.
> We are also running out of fields and the timeval isn't y2038 compliant.
>
> A proof-of-concept is here:
>
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a95549df06d9900f3559afdbb9da06bd4b22d1f3
>
> It's a bit old, but it gives a good impression of what I have in mind.

On a related, but slightly different note, I'm wondering how we should
handle a case where we have an M format (e.g. NV12M with 2 memory
planes), but only 1 DMA-buf with all planes to import. That generally
means that we have to use the same DMA-buf FD with an offset for each
plane. In theory, v4l2_plane::data_offset could be used for this, but
the documentation says that it should be set by the application only
for OUTPUT planes. Moreover, existing drivers tend to just ignore
it...

There is also the opposite problem. Sometimes the application is given
3 different FDs but pointing to the same buffer. If it has to work
with a video device that only supports non-M formats, it can either
fail, making it unusable, or blindly assume that they all point to the
same buffer and just give the first FD to the video device (we do it
in Chromium, since our allocator is guaranteed to keep all planes of
planar formats in one buffer, if to be used with V4L2).

Something that we could do is allowing the QBUF semantics of M formats
for non-M formats, where the application would fill the planes[] array
for all planes with all the FDs it has and the kernel could then
figure out if they point to the same buffer (i.e. resolve to the same
dma_buf struct) or fail if not.

[...]

> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps, again in
> order to improve single vs multiplanar handling.

I'd definitely be more than happy to see the plane handling unified
between non-M and M formats, in general. The list of problems with
current interface:

1) The userspace has to hardcode the computations of bytesperline for
chroma planes of non-M formats (while they are reported for M
formats).

2) Similarly, offsets of the planes in the buffer for non-M formats
must be explicitly calculated in the application,

3) Drivers have to explicitly handle both non-M and M formats or
otherwise they would suffer from issues with application compatibility
or sharing buffers with other devices (one supporting only M and the
other only non-M),

4) Inconsistency in the meaning of planes[0].sizeimage for non-M
formats and M formats, making it impossible to use planes[0].sizeimage
to set the luma plane size in the hardware for non-M formats (since
it's the total size of all planes).

I might have probably forgotten about something, but generally fixing
the 4 above, would be a really big step forward.

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-26 11:41 ` Tomasz Figa
@ 2018-10-26 13:42   ` Laurent Pinchart
  2019-03-15  4:18     ` Tomasz Figa
  0 siblings, 1 reply; 26+ messages in thread
From: Laurent Pinchart @ 2018-10-26 13:42 UTC (permalink / raw)
  To: Tomasz Figa; +Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda

Hi Tomasz,

On Friday, 26 October 2018 14:41:26 EEST Tomasz Figa wrote:
> On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > Some parts of the V4L2 API are awkward to use and I think it would be
> > a good idea to look at possible candidates for that.
> > 
> > Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> > support is really horrible, and writing code to support both single and
> > multiplanar is hard. We are also running out of fields and the timeval
> > isn't y2038 compliant.
> > 
> > A proof-of-concept is here:
> > 
> > https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a
> > 95549df06d9900f3559afdbb9da06bd4b22d1f3
> > 
> > It's a bit old, but it gives a good impression of what I have in mind.
> 
> On a related, but slightly different note, I'm wondering how we should
> handle a case where we have an M format (e.g. NV12M with 2 memory
> planes), but only 1 DMA-buf with all planes to import. That generally
> means that we have to use the same DMA-buf FD with an offset for each
> plane. In theory, v4l2_plane::data_offset could be used for this, but
> the documentation says that it should be set by the application only
> for OUTPUT planes. Moreover, existing drivers tend to just ignore
> it...

The following patches may be of interest.

https://patchwork.linuxtv.org/patch/29177/
https://patchwork.linuxtv.org/patch/29178/

> There is also the opposite problem. Sometimes the application is given
> 3 different FDs but pointing to the same buffer. If it has to work
> with a video device that only supports non-M formats, it can either
> fail, making it unusable, or blindly assume that they all point to the
> same buffer and just give the first FD to the video device (we do it
> in Chromium, since our allocator is guaranteed to keep all planes of
> planar formats in one buffer, if to be used with V4L2).
> 
> Something that we could do is allowing the QBUF semantics of M formats
> for non-M formats, where the application would fill the planes[] array
> for all planes with all the FDs it has and the kernel could then
> figure out if they point to the same buffer (i.e. resolve to the same
> dma_buf struct) or fail if not.
> 
> [...]
> 
> > Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps,
> > again in order to improve single vs multiplanar handling.
> 
> I'd definitely be more than happy to see the plane handling unified
> between non-M and M formats, in general. The list of problems with
> current interface:
> 
> 1) The userspace has to hardcode the computations of bytesperline for
> chroma planes of non-M formats (while they are reported for M
> formats).
> 
> 2) Similarly, offsets of the planes in the buffer for non-M formats
> must be explicitly calculated in the application,
> 
> 3) Drivers have to explicitly handle both non-M and M formats or
> otherwise they would suffer from issues with application compatibility
> or sharing buffers with other devices (one supporting only M and the
> other only non-M),
> 
> 4) Inconsistency in the meaning of planes[0].sizeimage for non-M
> formats and M formats, making it impossible to use planes[0].sizeimage
> to set the luma plane size in the hardware for non-M formats (since
> it's the total size of all planes).
> 
> I might have probably forgotten about something, but generally fixing
> the 4 above, would be a really big step forward.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-22  3:37       ` Tomasz Figa
  2018-10-22  9:49         ` Dave Stevenson
  2018-10-23 13:02         ` Philipp Zabel
@ 2018-10-27  9:38         ` Nicolas Dufresne
  2018-11-08  7:45           ` Tomasz Figa
  2 siblings, 1 reply; 26+ messages in thread
From: Nicolas Dufresne @ 2018-10-27  9:38 UTC (permalink / raw)
  To: Tomasz Figa, pza; +Cc: Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]

Le lundi 22 octobre 2018 à 12:37 +0900, Tomasz Figa a écrit :
> Hi Philipp,
> 
> On Mon, Oct 22, 2018 at 1:28 AM Philipp Zabel <pza@pengutronix.de> wrote:
> > 
> > On Wed, Oct 03, 2018 at 05:24:39PM +0900, Tomasz Figa wrote:
> > [...]
> > > > Yes, but that would fall in a complete redesign I guess. The buffer
> > > > allocation scheme is very inflexible. You can't have buffers of two
> > > > dimensions allocated at the same time for the same queue. Worst, you
> > > > cannot leave even 1 buffer as your scannout buffer while reallocating
> > > > new buffers, this is not permitted by the framework (in software). As a
> > > > side effect, there is no way to optimize the resolution changes, you
> > > > even have to copy your scannout buffer on the CPU, to free it in order
> > > > to proceed. Resolution changes are thus painfully slow, by design.
> > 
> > [...]
> > > Also, I fail to understand the scanout issue. If one exports a vb2
> > > buffer to a DMA-buf and import it to the scanout engine, it can keep
> > > scanning out from it as long as it want, because the DMA-buf will hold
> > > a reference on the buffer, even if it's removed from the vb2 queue.
> > 
> > REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
> > attachments: vb2_buffer_in_use checks the num_users memop. The refcount
> > returned by num_users shared between the vmarea handler and dmabuf ops,
> > so any dmabuf attachment counts towards in_use.
> 
> Ah, right. I've managed to completely forget about it, since we have a
> downstream patch that we attempted to upstream earlier [1], but didn't
> have a chance to follow up on the comments and there wasn't much
> interest in it in general.
> 
> [1] https://lore.kernel.org/patchwork/patch/607853/
> 
> Perhaps it would be worth reviving?

In this patch we should consider a way to tell userspace that this has
been opt in, otherwise existing userspace will have to remain using
sub-optimal copy based reclaiming in order to ensure that renegotiation
can work on older kernel tool. At worst someone could probably do trial
and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
this introduces extra startup time.

> 
> Best regards,
> Tomasz

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-27  9:38         ` Nicolas Dufresne
@ 2018-11-08  7:45           ` Tomasz Figa
  2018-11-09 21:06             ` Nicolas Dufresne
  0 siblings, 1 reply; 26+ messages in thread
From: Tomasz Figa @ 2018-11-08  7:45 UTC (permalink / raw)
  To: nicolas; +Cc: pza, Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

Hi Nicolas,

On Sat, Oct 27, 2018 at 6:38 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le lundi 22 octobre 2018 à 12:37 +0900, Tomasz Figa a écrit :
> > Hi Philipp,
> >
> > On Mon, Oct 22, 2018 at 1:28 AM Philipp Zabel <pza@pengutronix.de> wrote:
> > >
> > > On Wed, Oct 03, 2018 at 05:24:39PM +0900, Tomasz Figa wrote:
> > > [...]
> > > > > Yes, but that would fall in a complete redesign I guess. The buffer
> > > > > allocation scheme is very inflexible. You can't have buffers of two
> > > > > dimensions allocated at the same time for the same queue. Worst, you
> > > > > cannot leave even 1 buffer as your scannout buffer while reallocating
> > > > > new buffers, this is not permitted by the framework (in software). As a
> > > > > side effect, there is no way to optimize the resolution changes, you
> > > > > even have to copy your scannout buffer on the CPU, to free it in order
> > > > > to proceed. Resolution changes are thus painfully slow, by design.
> > >
> > > [...]
> > > > Also, I fail to understand the scanout issue. If one exports a vb2
> > > > buffer to a DMA-buf and import it to the scanout engine, it can keep
> > > > scanning out from it as long as it want, because the DMA-buf will hold
> > > > a reference on the buffer, even if it's removed from the vb2 queue.
> > >
> > > REQBUFS 0 fails if the vb2 buffer is still in use, including from dmabuf
> > > attachments: vb2_buffer_in_use checks the num_users memop. The refcount
> > > returned by num_users shared between the vmarea handler and dmabuf ops,
> > > so any dmabuf attachment counts towards in_use.
> >
> > Ah, right. I've managed to completely forget about it, since we have a
> > downstream patch that we attempted to upstream earlier [1], but didn't
> > have a chance to follow up on the comments and there wasn't much
> > interest in it in general.
> >
> > [1] https://lore.kernel.org/patchwork/patch/607853/
> >
> > Perhaps it would be worth reviving?
>
> In this patch we should consider a way to tell userspace that this has
> been opt in, otherwise existing userspace will have to remain using
> sub-optimal copy based reclaiming in order to ensure that renegotiation
> can work on older kernel tool. At worst someone could probably do trial
> and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
> this introduces extra startup time.

Would such REQBUFS dance be really needed? Couldn't one simply try
reqbufs(0) when it's really needed and if it fails then do the copy,
otherwise just proceed normally?

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-11-08  7:45           ` Tomasz Figa
@ 2018-11-09 21:06             ` Nicolas Dufresne
  2018-11-11  3:43               ` Tomasz Figa
  0 siblings, 1 reply; 26+ messages in thread
From: Nicolas Dufresne @ 2018-11-09 21:06 UTC (permalink / raw)
  To: Tomasz Figa; +Cc: pza, Hans Verkuil, Linux Media Mailing List, Laurent Pinchart

[-- Attachment #1: Type: text/plain, Size: 2560 bytes --]

Le jeudi 08 novembre 2018 à 16:45 +0900, Tomasz Figa a écrit :
> > In this patch we should consider a way to tell userspace that this has
> > been opt in, otherwise existing userspace will have to remain using
> > sub-optimal copy based reclaiming in order to ensure that renegotiation
> > can work on older kernel tool. At worst someone could probably do trial
> > and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
> > this introduces extra startup time.
> 
> Would such REQBUFS dance be really needed? Couldn't one simply try
> reqbufs(0) when it's really needed and if it fails then do the copy,
> otherwise just proceed normally?

In simple program, maybe, in modularized code, where the consumer of
these buffer (the one that is forced to make a copy) does not know the
origin of the DMABuf, it's a bit complicated.

In GStreamer as an example, the producer is a plugin called
libgstvideo4linux2.so, while the common consumer would be libgstkms.so.
They don't know each other. The pipeline would be described as:

  v4l2src ! kmssink

GStreamer does not have an explicit reclaiming mechanism. No one knew
about V4L2 restrictions when this was designed, DMABuf didn't exist and
GStreamer didn't have OMX support.

What we ended up crafting, as a plaster, is that when upstream element
(v4l2src) query a new allocation from downstream (kmssink), we always
copy and return any ancient buffers by copying. kmssink holds on a
buffer because we can't remove the scannout buffer on the display. This
is slow and inefficient, and also totally unneeded if the dmabuf
originate from other kernel subsystems (like DRM).

So what I'd like to be able to do, to support this in a more optimal
and generic way, is to mark the buffers that needs reclaiming before
letting them go. But for that, I would need a flag somewhere to tell me
this kernel allow this.

You got the context, maybe the conclusion is that I should simply do
kernel version check, though I'm sure a lot of people will backport
this, which means that check won't work so well.

Let me know, I understand adding more API is not fun, but as nothing is
ever versionned in the linux-media world, it's really hard to detect
and use new behaviour while supporting what everyone currently run on
their systems.

I would probably try and find a way to implement your suggestion, and
then introduce a flag in the query itself, but I would need to think
about it a little more. It's not as simple as it look like
unfortunately.

Nicolas

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-11-09 21:06             ` Nicolas Dufresne
@ 2018-11-11  3:43               ` Tomasz Figa
  2018-11-12  9:29                 ` Philipp Zabel
  0 siblings, 1 reply; 26+ messages in thread
From: Tomasz Figa @ 2018-11-11  3:43 UTC (permalink / raw)
  To: nicolas, Hans Verkuil; +Cc: pza, Linux Media Mailing List, Laurent Pinchart

On Sat, Nov 10, 2018 at 6:06 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le jeudi 08 novembre 2018 à 16:45 +0900, Tomasz Figa a écrit :
> > > In this patch we should consider a way to tell userspace that this has
> > > been opt in, otherwise existing userspace will have to remain using
> > > sub-optimal copy based reclaiming in order to ensure that renegotiation
> > > can work on older kernel tool. At worst someone could probably do trial
> > > and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
> > > this introduces extra startup time.
> >
> > Would such REQBUFS dance be really needed? Couldn't one simply try
> > reqbufs(0) when it's really needed and if it fails then do the copy,
> > otherwise just proceed normally?
>
> In simple program, maybe, in modularized code, where the consumer of
> these buffer (the one that is forced to make a copy) does not know the
> origin of the DMABuf, it's a bit complicated.
>
> In GStreamer as an example, the producer is a plugin called
> libgstvideo4linux2.so, while the common consumer would be libgstkms.so.
> They don't know each other. The pipeline would be described as:
>
>   v4l2src ! kmssink
>
> GStreamer does not have an explicit reclaiming mechanism. No one knew
> about V4L2 restrictions when this was designed, DMABuf didn't exist and
> GStreamer didn't have OMX support.
>
> What we ended up crafting, as a plaster, is that when upstream element
> (v4l2src) query a new allocation from downstream (kmssink), we always
> copy and return any ancient buffers by copying. kmssink holds on a
> buffer because we can't remove the scannout buffer on the display. This
> is slow and inefficient, and also totally unneeded if the dmabuf
> originate from other kernel subsystems (like DRM).
>
> So what I'd like to be able to do, to support this in a more optimal
> and generic way, is to mark the buffers that needs reclaiming before
> letting them go. But for that, I would need a flag somewhere to tell me
> this kernel allow this.

Okay, got it. Thanks for explaining it.

>
> You got the context, maybe the conclusion is that I should simply do
> kernel version check, though I'm sure a lot of people will backport
> this, which means that check won't work so well.
>
> Let me know, I understand adding more API is not fun, but as nothing is
> ever versionned in the linux-media world, it's really hard to detect
> and use new behaviour while supporting what everyone currently run on
> their systems.
>
> I would probably try and find a way to implement your suggestion, and
> then introduce a flag in the query itself, but I would need to think
> about it a little more. It's not as simple as it look like
> unfortunately.

It sounds like a good fit for a new capability in v4l2_requestbuffers
and v4l2_create_buffers structs [1]. Perhaps something like
V4L2_BUF_CAP_SUPPORTS_FREE_AFTER_EXPORT? Hans, what do you think?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/include/uapi/linux/videodev2.h?h=next-20181109#n866

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-11-11  3:43               ` Tomasz Figa
@ 2018-11-12  9:29                 ` Philipp Zabel
  2018-11-12 15:02                   ` Hans Verkuil
  0 siblings, 1 reply; 26+ messages in thread
From: Philipp Zabel @ 2018-11-12  9:29 UTC (permalink / raw)
  To: Tomasz Figa, nicolas, Hans Verkuil
  Cc: pza, Linux Media Mailing List, Laurent Pinchart

Hi Tomasz,

On Sun, 2018-11-11 at 12:43 +0900, Tomasz Figa wrote:
> On Sat, Nov 10, 2018 at 6:06 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > 
> > Le jeudi 08 novembre 2018 à 16:45 +0900, Tomasz Figa a écrit :
> > > > In this patch we should consider a way to tell userspace that this has
> > > > been opt in, otherwise existing userspace will have to remain using
> > > > sub-optimal copy based reclaiming in order to ensure that renegotiation
> > > > can work on older kernel tool. At worst someone could probably do trial
> > > > and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
> > > > this introduces extra startup time.
> > > 
> > > Would such REQBUFS dance be really needed? Couldn't one simply try
> > > reqbufs(0) when it's really needed and if it fails then do the copy,
> > > otherwise just proceed normally?
> > 
> > In simple program, maybe, in modularized code, where the consumer of
> > these buffer (the one that is forced to make a copy) does not know the
> > origin of the DMABuf, it's a bit complicated.
> > 
> > In GStreamer as an example, the producer is a plugin called
> > libgstvideo4linux2.so, while the common consumer would be libgstkms.so.
> > They don't know each other. The pipeline would be described as:
> > 
> >   v4l2src ! kmssink
> > 
> > GStreamer does not have an explicit reclaiming mechanism. No one knew
> > about V4L2 restrictions when this was designed, DMABuf didn't exist and
> > GStreamer didn't have OMX support.
> > 
> > What we ended up crafting, as a plaster, is that when upstream element
> > (v4l2src) query a new allocation from downstream (kmssink), we always
> > copy and return any ancient buffers by copying. kmssink holds on a
> > buffer because we can't remove the scannout buffer on the display. This
> > is slow and inefficient, and also totally unneeded if the dmabuf
> > originate from other kernel subsystems (like DRM).
> > 
> > So what I'd like to be able to do, to support this in a more optimal
> > and generic way, is to mark the buffers that needs reclaiming before
> > letting them go. But for that, I would need a flag somewhere to tell me
> > this kernel allow this.
> 
> Okay, got it. Thanks for explaining it.
> 
> > 
> > You got the context, maybe the conclusion is that I should simply do
> > kernel version check, though I'm sure a lot of people will backport
> > this, which means that check won't work so well.
> > 
> > Let me know, I understand adding more API is not fun, but as nothing is
> > ever versionned in the linux-media world, it's really hard to detect
> > and use new behaviour while supporting what everyone currently run on
> > their systems.
> > 
> > I would probably try and find a way to implement your suggestion, and
> > then introduce a flag in the query itself, but I would need to think
> > about it a little more. It's not as simple as it look like
> > unfortunately.
> 
> It sounds like a good fit for a new capability in v4l2_requestbuffers
> and v4l2_create_buffers structs [1]. Perhaps something like
> V4L2_BUF_CAP_SUPPORTS_FREE_AFTER_EXPORT? Hans, what do you think?

Maybe V4L2_BUF_CAP_SUPPORTS_ORPHANS? With this patch, while the buffers
are in use, reqbufs(0) doesn't free them, they are orphaned. Also, this
patch allows reqbufs(0) not only after export, but also while mmapped.

regards
Philipp

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-11-12  9:29                 ` Philipp Zabel
@ 2018-11-12 15:02                   ` Hans Verkuil
  0 siblings, 0 replies; 26+ messages in thread
From: Hans Verkuil @ 2018-11-12 15:02 UTC (permalink / raw)
  To: Philipp Zabel, Tomasz Figa, nicolas
  Cc: pza, Linux Media Mailing List, Laurent Pinchart

On 11/12/2018 10:29 AM, Philipp Zabel wrote:
> Hi Tomasz,
> 
> On Sun, 2018-11-11 at 12:43 +0900, Tomasz Figa wrote:
>> On Sat, Nov 10, 2018 at 6:06 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>>
>>> Le jeudi 08 novembre 2018 à 16:45 +0900, Tomasz Figa a écrit :
>>>>> In this patch we should consider a way to tell userspace that this has
>>>>> been opt in, otherwise existing userspace will have to remain using
>>>>> sub-optimal copy based reclaiming in order to ensure that renegotiation
>>>>> can work on older kernel tool. At worst someone could probably do trial
>>>>> and error (reqbufs(1)/mmap/reqbufs(0)) but on CMA with large buffers
>>>>> this introduces extra startup time.
>>>>
>>>> Would such REQBUFS dance be really needed? Couldn't one simply try
>>>> reqbufs(0) when it's really needed and if it fails then do the copy,
>>>> otherwise just proceed normally?
>>>
>>> In simple program, maybe, in modularized code, where the consumer of
>>> these buffer (the one that is forced to make a copy) does not know the
>>> origin of the DMABuf, it's a bit complicated.
>>>
>>> In GStreamer as an example, the producer is a plugin called
>>> libgstvideo4linux2.so, while the common consumer would be libgstkms.so.
>>> They don't know each other. The pipeline would be described as:
>>>
>>>   v4l2src ! kmssink
>>>
>>> GStreamer does not have an explicit reclaiming mechanism. No one knew
>>> about V4L2 restrictions when this was designed, DMABuf didn't exist and
>>> GStreamer didn't have OMX support.
>>>
>>> What we ended up crafting, as a plaster, is that when upstream element
>>> (v4l2src) query a new allocation from downstream (kmssink), we always
>>> copy and return any ancient buffers by copying. kmssink holds on a
>>> buffer because we can't remove the scannout buffer on the display. This
>>> is slow and inefficient, and also totally unneeded if the dmabuf
>>> originate from other kernel subsystems (like DRM).
>>>
>>> So what I'd like to be able to do, to support this in a more optimal
>>> and generic way, is to mark the buffers that needs reclaiming before
>>> letting them go. But for that, I would need a flag somewhere to tell me
>>> this kernel allow this.
>>
>> Okay, got it. Thanks for explaining it.
>>
>>>
>>> You got the context, maybe the conclusion is that I should simply do
>>> kernel version check, though I'm sure a lot of people will backport
>>> this, which means that check won't work so well.
>>>
>>> Let me know, I understand adding more API is not fun, but as nothing is
>>> ever versionned in the linux-media world, it's really hard to detect
>>> and use new behaviour while supporting what everyone currently run on
>>> their systems.
>>>
>>> I would probably try and find a way to implement your suggestion, and
>>> then introduce a flag in the query itself, but I would need to think
>>> about it a little more. It's not as simple as it look like
>>> unfortunately.
>>
>> It sounds like a good fit for a new capability in v4l2_requestbuffers
>> and v4l2_create_buffers structs [1]. Perhaps something like
>> V4L2_BUF_CAP_SUPPORTS_FREE_AFTER_EXPORT? Hans, what do you think?
> 
> Maybe V4L2_BUF_CAP_SUPPORTS_ORPHANS? With this patch, while the buffers
> are in use, reqbufs(0) doesn't free them, they are orphaned. Also, this
> patch allows reqbufs(0) not only after export, but also while mmapped.

Signaling this through a BUF_CAP makes sense. It's really what it is there
for.

If someone can make an updated patch of
https://lore.kernel.org/patchwork/patch/607853/, then it makes sense to
merge it.

How about: 'SUPPORTS_ORPHANED_BUFS'?

Regards,

	Hans

> 
> regards
> Philipp
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2018-10-26 13:42   ` Laurent Pinchart
@ 2019-03-15  4:18     ` Tomasz Figa
  2019-03-17 16:10       ` Laurent Pinchart
  0 siblings, 1 reply; 26+ messages in thread
From: Tomasz Figa @ 2019-03-15  4:18 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda, Boris Brezillon

On Fri, Oct 26, 2018 at 10:42 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Tomasz,
>
> On Friday, 26 October 2018 14:41:26 EEST Tomasz Figa wrote:
> > On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > > Some parts of the V4L2 API are awkward to use and I think it would be
> > > a good idea to look at possible candidates for that.
> > >
> > > Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> > > support is really horrible, and writing code to support both single and
> > > multiplanar is hard. We are also running out of fields and the timeval
> > > isn't y2038 compliant.
> > >
> > > A proof-of-concept is here:
> > >
> > > https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a
> > > 95549df06d9900f3559afdbb9da06bd4b22d1f3
> > >
> > > It's a bit old, but it gives a good impression of what I have in mind.
> >
> > On a related, but slightly different note, I'm wondering how we should
> > handle a case where we have an M format (e.g. NV12M with 2 memory
> > planes), but only 1 DMA-buf with all planes to import. That generally
> > means that we have to use the same DMA-buf FD with an offset for each
> > plane. In theory, v4l2_plane::data_offset could be used for this, but
> > the documentation says that it should be set by the application only
> > for OUTPUT planes. Moreover, existing drivers tend to just ignore
> > it...
>
> The following patches may be of interest.
>
> https://patchwork.linuxtv.org/patch/29177/
> https://patchwork.linuxtv.org/patch/29178/

[+CC Boris]

Thanks Laurent for pointing me to those patches.

Repurposing the data_offset field may sound like a plausible way to do
it, but it's not, for several reasons:

1) The relation between data_offset and other fields in v4l2_buffer
makes it hard to use in drivers and userspace.

2) It is not handled by vb2, so each driver would have to
explicitly add data_offset to the plane address and subtract it from
plane size and/or bytesused (since data_offset counts into plane size
and bytesused),

3) For CAPTURE buffers, it's actually defined as set-by-driver
(https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#struct-v4l2-plane),
so anything userspace sets there is bound to be ignored. I'm not sure
if we can change this now, as it would be a compatibility issue.

(There are actually real use cases for it, i.e. the venus driver
outputs VPx encoded frames prepended with the IVF header, but that's
not what the V4L2 VPx formats expect, so the data_offset is set by the
driver to point to the raw bitstream data.)

Best regards,
Tomasz

>
> > There is also the opposite problem. Sometimes the application is given
> > 3 different FDs but pointing to the same buffer. If it has to work
> > with a video device that only supports non-M formats, it can either
> > fail, making it unusable, or blindly assume that they all point to the
> > same buffer and just give the first FD to the video device (we do it
> > in Chromium, since our allocator is guaranteed to keep all planes of
> > planar formats in one buffer, if to be used with V4L2).
> >
> > Something that we could do is allowing the QBUF semantics of M formats
> > for non-M formats, where the application would fill the planes[] array
> > for all planes with all the FDs it has and the kernel could then
> > figure out if they point to the same buffer (i.e. resolve to the same
> > dma_buf struct) or fail if not.
> >
> > [...]
> >
> > > Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps,
> > > again in order to improve single vs multiplanar handling.
> >
> > I'd definitely be more than happy to see the plane handling unified
> > between non-M and M formats, in general. The list of problems with
> > current interface:
> >
> > 1) The userspace has to hardcode the computations of bytesperline for
> > chroma planes of non-M formats (while they are reported for M
> > formats).
> >
> > 2) Similarly, offsets of the planes in the buffer for non-M formats
> > must be explicitly calculated in the application,
> >
> > 3) Drivers have to explicitly handle both non-M and M formats or
> > otherwise they would suffer from issues with application compatibility
> > or sharing buffers with other devices (one supporting only M and the
> > other only non-M),
> >
> > 4) Inconsistency in the meaning of planes[0].sizeimage for non-M
> > formats and M formats, making it impossible to use planes[0].sizeimage
> > to set the luma plane size in the hardware for non-M formats (since
> > it's the total size of all planes).
> >
> > I might have probably forgotten about something, but generally fixing
> > the 4 above, would be a really big step forward.
>
> --
> Regards,
>
> Laurent Pinchart
>
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2019-03-15  4:18     ` Tomasz Figa
@ 2019-03-17 16:10       ` Laurent Pinchart
  2019-03-17 17:10         ` Nicolas Dufresne
                           ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Laurent Pinchart @ 2019-03-17 16:10 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda, Boris Brezillon

Hi Tomasz,

On Fri, Mar 15, 2019 at 01:18:17PM +0900, Tomasz Figa wrote:
> On Fri, Oct 26, 2018 at 10:42 PM Laurent Pinchart wrote:
> > On Friday, 26 October 2018 14:41:26 EEST Tomasz Figa wrote:
> >> On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>> Some parts of the V4L2 API are awkward to use and I think it would be
> >>> a good idea to look at possible candidates for that.
> >>>
> >>> Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> >>> support is really horrible, and writing code to support both single and
> >>> multiplanar is hard. We are also running out of fields and the timeval
> >>> isn't y2038 compliant.
> >>>
> >>> A proof-of-concept is here:
> >>>
> >>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a
> >>> 95549df06d9900f3559afdbb9da06bd4b22d1f3
> >>>
> >>> It's a bit old, but it gives a good impression of what I have in mind.
> >>
> >> On a related, but slightly different note, I'm wondering how we should
> >> handle a case where we have an M format (e.g. NV12M with 2 memory
> >> planes), but only 1 DMA-buf with all planes to import. That generally
> >> means that we have to use the same DMA-buf FD with an offset for each
> >> plane. In theory, v4l2_plane::data_offset could be used for this, but
> >> the documentation says that it should be set by the application only
> >> for OUTPUT planes. Moreover, existing drivers tend to just ignore
> >> it...
> >
> > The following patches may be of interest.
> >
> > https://patchwork.linuxtv.org/patch/29177/
> > https://patchwork.linuxtv.org/patch/29178/
> 
> [+CC Boris]
> 
> Thanks Laurent for pointing me to those patches.
> 
> Repurposing the data_offset field may sound like a plausible way to do
> it, but it's not, for several reasons:
> 
> 1) The relation between data_offset and other fields in v4l2_buffer
> makes it hard to use in drivers and userspace.

Could you elaborate on this ?

> 2) It is not handled by vb2, so each driver would have to
> explicitly add data_offset to the plane address and subtract it from
> plane size and/or bytesused (since data_offset counts into plane size
> and bytesused),

We should certainly handle that in the V4L2 core and/or in vb2. I think
we should go one step further and handle the compose rectangle there
too, as composing for capture devices is essentially offsetting the
buffer and setting the correct stride. I wonder if it was a mistake to
expose compose rectangle on capture video nodes, maybe stride + offset
would be a better API.

> 3) For CAPTURE buffers, it's actually defined as set-by-driver
> (https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#struct-v4l2-plane),
> so anything userspace sets there is bound to be ignored. I'm not sure
> if we can change this now, as it would be a compatibility issue.
> 
> (There are actually real use cases for it, i.e. the venus driver
> outputs VPx encoded frames prepended with the IVF header, but that's
> not what the V4L2 VPx formats expect, so the data_offset is set by the
> driver to point to the raw bitstream data.)

Doesn't that essentially create a custom format though ? Who consumes
the IVF header ?

Another use case is handling of embedded data with CSI-2.

CSI-2 sensors can send multiple types of data multiplexed in a single
virtual channels. Common use cases include sending a few lines of
metadata, or sending optical black lines, in addition to the main image.
A CSI-2 source could also send the same image in multiple formats, but I
haven't seen that happening in practice. The CSI-2 standard tags each
line with a data type in order to differentiate them on the receiver
side. On the receiver side, some receivers allow capturing different
data types in different buffers, while other support a single buffer
only, with or without data type filtering. It may thus be that a sensor
sending 2 lines of embedded data before the image to a CSI-2 receiver
that supports a single buffer will leave the user with two options,
capturing the image only or capturing both in the same buffer (really
simple receivers may only offer the last option). Reporting to the user
how data is organized in the buffer is needed, and the data_offset field
is used for this.

This being said, I don't think it's a valid use case fo data_offset. As
mentioned above a sensor could send more than one data type in addition
to the main image (embedded data + optical black is one example), so a
single data_offset field wouldn't allow differentiating embedded data
from optical black lines. I think a more powerful frame descriptor API
would be needed for this. The fact that the buffer layout doesn't change
between frames also hints that this should be supported at the format
level, not the buffer level.

> >> There is also the opposite problem. Sometimes the application is given
> >> 3 different FDs but pointing to the same buffer. If it has to work
> >> with a video device that only supports non-M formats, it can either
> >> fail, making it unusable, or blindly assume that they all point to the
> >> same buffer and just give the first FD to the video device (we do it
> >> in Chromium, since our allocator is guaranteed to keep all planes of
> >> planar formats in one buffer, if to be used with V4L2).
> >>
> >> Something that we could do is allowing the QBUF semantics of M formats
> >> for non-M formats, where the application would fill the planes[] array
> >> for all planes with all the FDs it has and the kernel could then
> >> figure out if they point to the same buffer (i.e. resolve to the same
> >> dma_buf struct) or fail if not.
> >>
> >> [...]
> >>
> >>> Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps,
> >>> again in order to improve single vs multiplanar handling.
> >>
> >> I'd definitely be more than happy to see the plane handling unified
> >> between non-M and M formats, in general. The list of problems with
> >> current interface:
> >>
> >> 1) The userspace has to hardcode the computations of bytesperline for
> >> chroma planes of non-M formats (while they are reported for M
> >> formats).
> >>
> >> 2) Similarly, offsets of the planes in the buffer for non-M formats
> >> must be explicitly calculated in the application,
> >>
> >> 3) Drivers have to explicitly handle both non-M and M formats or
> >> otherwise they would suffer from issues with application compatibility
> >> or sharing buffers with other devices (one supporting only M and the
> >> other only non-M),
> >>
> >> 4) Inconsistency in the meaning of planes[0].sizeimage for non-M
> >> formats and M formats, making it impossible to use planes[0].sizeimage
> >> to set the luma plane size in the hardware for non-M formats (since
> >> it's the total size of all planes).
> >>
> >> I might have probably forgotten about something, but generally fixing
> >> the 4 above, would be a really big step forward.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2019-03-17 16:10       ` Laurent Pinchart
@ 2019-03-17 17:10         ` Nicolas Dufresne
  2019-03-17 17:14         ` Nicolas Dufresne
  2019-03-25  9:41         ` Tomasz Figa
  2 siblings, 0 replies; 26+ messages in thread
From: Nicolas Dufresne @ 2019-03-17 17:10 UTC (permalink / raw)
  To: Laurent Pinchart, Tomasz Figa
  Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda, Boris Brezillon

Le dimanche 17 mars 2019 à 18:10 +0200, Laurent Pinchart a écrit :
> Hi Tomasz,
> 
> On Fri, Mar 15, 2019 at 01:18:17PM +0900, Tomasz Figa wrote:
> > On Fri, Oct 26, 2018 at 10:42 PM Laurent Pinchart wrote:
> > > On Friday, 26 October 2018 14:41:26 EEST Tomasz Figa wrote:
> > > > On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > > > > Some parts of the V4L2 API are awkward to use and I think it would be
> > > > > a good idea to look at possible candidates for that.
> > > > > 
> > > > > Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> > > > > support is really horrible, and writing code to support both single and
> > > > > multiplanar is hard. We are also running out of fields and the timeval
> > > > > isn't y2038 compliant.
> > > > > 
> > > > > A proof-of-concept is here:
> > > > > 
> > > > > https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a
> > > > > 95549df06d9900f3559afdbb9da06bd4b22d1f3
> > > > > 
> > > > > It's a bit old, but it gives a good impression of what I have in mind.
> > > > 
> > > > On a related, but slightly different note, I'm wondering how we should
> > > > handle a case where we have an M format (e.g. NV12M with 2 memory
> > > > planes), but only 1 DMA-buf with all planes to import. That generally
> > > > means that we have to use the same DMA-buf FD with an offset for each
> > > > plane. In theory, v4l2_plane::data_offset could be used for this, but
> > > > the documentation says that it should be set by the application only
> > > > for OUTPUT planes. Moreover, existing drivers tend to just ignore
> > > > it...
> > > 
> > > The following patches may be of interest.
> > > 
> > > https://patchwork.linuxtv.org/patch/29177/
> > > https://patchwork.linuxtv.org/patch/29178/
> > 
> > [+CC Boris]
> > 
> > Thanks Laurent for pointing me to those patches.
> > 
> > Repurposing the data_offset field may sound like a plausible way to do
> > it, but it's not, for several reasons:
> > 
> > 1) The relation between data_offset and other fields in v4l2_buffer
> > makes it hard to use in drivers and userspace.
> 
> Could you elaborate on this ?
> 
> > 2) It is not handled by vb2, so each driver would have to
> > explicitly add data_offset to the plane address and subtract it from
> > plane size and/or bytesused (since data_offset counts into plane size
> > and bytesused),
> 
> We should certainly handle that in the V4L2 core and/or in vb2. I think
> we should go one step further and handle the compose rectangle there
> too, as composing for capture devices is essentially offsetting the
> buffer and setting the correct stride. I wonder if it was a mistake to
> expose compose rectangle on capture video nodes, maybe stride + offset
> would be a better API.

If you have let's say a YUYV buffer, and you want to do a 1 pixel crop
from the left, this approach is not possible since you would introduce
a chroma shift. To implement this correctly, you also need to shift the
UV by half a pixel. Most people take the habit of ignoring this, or
simply refusing unaligned crop, but if you want to let's say to provide
HW for video editing, then you are just doing it wrong. Some HW goes
further and propose sub-pixel coordinates using Q16 representation.

> 
> > 3) For CAPTURE buffers, it's actually defined as set-by-driver
> > (https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#struct-v4l2-plane),
> > so anything userspace sets there is bound to be ignored. I'm not sure
> > if we can change this now, as it would be a compatibility issue.
> > 
> > (There are actually real use cases for it, i.e. the venus driver
> > outputs VPx encoded frames prepended with the IVF header, but that's
> > not what the V4L2 VPx formats expect, so the data_offset is set by the
> > driver to point to the raw bitstream data.)
> 
> Doesn't that essentially create a custom format though ? Who consumes
> the IVF header ?
> 
> Another use case is handling of embedded data with CSI-2.
> 
> CSI-2 sensors can send multiple types of data multiplexed in a single
> virtual channels. Common use cases include sending a few lines of
> metadata, or sending optical black lines, in addition to the main image.
> A CSI-2 source could also send the same image in multiple formats, but I
> haven't seen that happening in practice. The CSI-2 standard tags each
> line with a data type in order to differentiate them on the receiver
> side. On the receiver side, some receivers allow capturing different
> data types in different buffers, while other support a single buffer
> only, with or without data type filtering. It may thus be that a sensor
> sending 2 lines of embedded data before the image to a CSI-2 receiver
> that supports a single buffer will leave the user with two options,
> capturing the image only or capturing both in the same buffer (really
> simple receivers may only offer the last option). Reporting to the user
> how data is organized in the buffer is needed, and the data_offset field
> is used for this.
> 
> This being said, I don't think it's a valid use case fo data_offset. As
> mentioned above a sensor could send more than one data type in addition
> to the main image (embedded data + optical black is one example), so a
> single data_offset field wouldn't allow differentiating embedded data
> from optical black lines. I think a more powerful frame descriptor API
> would be needed for this. The fact that the buffer layout doesn't change
> between frames also hints that this should be supported at the format
> level, not the buffer level.
> 
> > > > There is also the opposite problem. Sometimes the application is given
> > > > 3 different FDs but pointing to the same buffer. If it has to work
> > > > with a video device that only supports non-M formats, it can either
> > > > fail, making it unusable, or blindly assume that they all point to the
> > > > same buffer and just give the first FD to the video device (we do it
> > > > in Chromium, since our allocator is guaranteed to keep all planes of
> > > > planar formats in one buffer, if to be used with V4L2).
> > > > 
> > > > Something that we could do is allowing the QBUF semantics of M formats
> > > > for non-M formats, where the application would fill the planes[] array
> > > > for all planes with all the FDs it has and the kernel could then
> > > > figure out if they point to the same buffer (i.e. resolve to the same
> > > > dma_buf struct) or fail if not.
> > > > 
> > > > [...]
> > > > 
> > > > > Do we have more ioctls that could use a refresh? S/G/TRY_FMT perhaps,
> > > > > again in order to improve single vs multiplanar handling.
> > > > 
> > > > I'd definitely be more than happy to see the plane handling unified
> > > > between non-M and M formats, in general. The list of problems with
> > > > current interface:
> > > > 
> > > > 1) The userspace has to hardcode the computations of bytesperline for
> > > > chroma planes of non-M formats (while they are reported for M
> > > > formats).
> > > > 
> > > > 2) Similarly, offsets of the planes in the buffer for non-M formats
> > > > must be explicitly calculated in the application,
> > > > 
> > > > 3) Drivers have to explicitly handle both non-M and M formats or
> > > > otherwise they would suffer from issues with application compatibility
> > > > or sharing buffers with other devices (one supporting only M and the
> > > > other only non-M),
> > > > 
> > > > 4) Inconsistency in the meaning of planes[0].sizeimage for non-M
> > > > formats and M formats, making it impossible to use planes[0].sizeimage
> > > > to set the luma plane size in the hardware for non-M formats (since
> > > > it's the total size of all planes).
> > > > 
> > > > I might have probably forgotten about something, but generally fixing
> > > > the 4 above, would be a really big step forward.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2019-03-17 16:10       ` Laurent Pinchart
  2019-03-17 17:10         ` Nicolas Dufresne
@ 2019-03-17 17:14         ` Nicolas Dufresne
  2019-03-25  9:41         ` Tomasz Figa
  2 siblings, 0 replies; 26+ messages in thread
From: Nicolas Dufresne @ 2019-03-17 17:14 UTC (permalink / raw)
  To: Laurent Pinchart, Tomasz Figa
  Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda, Boris Brezillon

Le dimanche 17 mars 2019 à 18:10 +0200, Laurent Pinchart a écrit :
> > 3) For CAPTURE buffers, it's actually defined as set-by-driver
> > (https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#struct-v4l2-plane),
> > so anything userspace sets there is bound to be ignored. I'm not sure
> > if we can change this now, as it would be a compatibility issue.
> > 
> > (There are actually real use cases for it, i.e. the venus driver
> > outputs VPx encoded frames prepended with the IVF header, but that's
> > not what the V4L2 VPx formats expect, so the data_offset is set by the
> > driver to point to the raw bitstream data.)
> 
> Doesn't that essentially create a custom format though ? Who consumes
> the IVF header ?

I see where you are going, but exposing IVF would be a useless dump of
complexity for the userspace. IFV contains no useful information that
is not already exposed by v4l2 interface. In this context, it's like
exposing the USB headers. The use of data_offset is simply to avoid a
copy. As Tomasz said, this is the exact purpose of data_offset.

> 
> Another use case is handling of embedded data with CSI-2.
> 
> CSI-2 sensors can send multiple types of data multiplexed in a single
> virtual channels. Common use cases include sending a few lines of
> metadata, or sending optical black lines, in addition to the main image.
> A CSI-2 source could also send the same image in multiple formats, but I
> haven't seen that happening in practice. The CSI-2 standard tags each
> line with a data type in order to differentiate them on the receiver
> side. On the receiver side, some receivers allow capturing different
> data types in different buffers, while other support a single buffer
> only, with or without data type filtering. It may thus be that a sensor
> sending 2 lines of embedded data before the image to a CSI-2 receiver
> that supports a single buffer will leave the user with two options,
> capturing the image only or capturing both in the same buffer (really
> simple receivers may only offer the last option). Reporting to the user
> how data is organized in the buffer is needed, and the data_offset field
> is used for this.
> 
> This being said, I don't think it's a valid use case fo data_offset. As
> mentioned above a sensor could send more than one data type in addition
> to the main image (embedded data + optical black is one example), so a
> single data_offset field wouldn't allow differentiating embedded data
> from optical black lines. I think a more powerful frame descriptor API
> would be needed for this. The fact that the buffer layout doesn't change
> between frames also hints that this should be supported at the format
> level, not the buffer level.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFP] Which V4L2 ioctls could be replaced by better versions?
  2019-03-17 16:10       ` Laurent Pinchart
  2019-03-17 17:10         ` Nicolas Dufresne
  2019-03-17 17:14         ` Nicolas Dufresne
@ 2019-03-25  9:41         ` Tomasz Figa
  2 siblings, 0 replies; 26+ messages in thread
From: Tomasz Figa @ 2019-03-25  9:41 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Hans Verkuil, Linux Media Mailing List, Hirokazu Honda, Boris Brezillon

On Mon, Mar 18, 2019 at 1:10 AM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Tomasz,
>
> On Fri, Mar 15, 2019 at 01:18:17PM +0900, Tomasz Figa wrote:
> > On Fri, Oct 26, 2018 at 10:42 PM Laurent Pinchart wrote:
> > > On Friday, 26 October 2018 14:41:26 EEST Tomasz Figa wrote:
> > >> On Thu, Sep 20, 2018 at 11:42 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >>> Some parts of the V4L2 API are awkward to use and I think it would be
> > >>> a good idea to look at possible candidates for that.
> > >>>
> > >>> Examples are the ioctls that use struct v4l2_buffer: the multiplanar
> > >>> support is really horrible, and writing code to support both single and
> > >>> multiplanar is hard. We are also running out of fields and the timeval
> > >>> isn't y2038 compliant.
> > >>>
> > >>> A proof-of-concept is here:
> > >>>
> > >>> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=v4l2-buffer&id=a
> > >>> 95549df06d9900f3559afdbb9da06bd4b22d1f3
> > >>>
> > >>> It's a bit old, but it gives a good impression of what I have in mind.
> > >>
> > >> On a related, but slightly different note, I'm wondering how we should
> > >> handle a case where we have an M format (e.g. NV12M with 2 memory
> > >> planes), but only 1 DMA-buf with all planes to import. That generally
> > >> means that we have to use the same DMA-buf FD with an offset for each
> > >> plane. In theory, v4l2_plane::data_offset could be used for this, but
> > >> the documentation says that it should be set by the application only
> > >> for OUTPUT planes. Moreover, existing drivers tend to just ignore
> > >> it...
> > >
> > > The following patches may be of interest.
> > >
> > > https://patchwork.linuxtv.org/patch/29177/
> > > https://patchwork.linuxtv.org/patch/29178/
> >
> > [+CC Boris]
> >
> > Thanks Laurent for pointing me to those patches.
> >
> > Repurposing the data_offset field may sound like a plausible way to do
> > it, but it's not, for several reasons:
> >
> > 1) The relation between data_offset and other fields in v4l2_buffer
> > makes it hard to use in drivers and userspace.
>
> Could you elaborate on this ?

Well, it's not a critical issue, but having the application need to
always add data_offset to bytesused and length makes the code a bit
messy, because the values don't really relate to the plane anymore,
but the buffer (or not even, since length would be less than the
buffer length for planes other than the last).

>
> > 2) It is not handled by vb2, so each driver would have to
> > explicitly add data_offset to the plane address and subtract it from
> > plane size and/or bytesused (since data_offset counts into plane size
> > and bytesused),
>
> We should certainly handle that in the V4L2 core and/or in vb2. I think
> we should go one step further and handle the compose rectangle there
> too, as composing for capture devices is essentially offsetting the
> buffer and setting the correct stride. I wonder if it was a mistake to
> expose compose rectangle on capture video nodes, maybe stride + offset
> would be a better API.
>
> > 3) For CAPTURE buffers, it's actually defined as set-by-driver
> > (https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#struct-v4l2-plane),
> > so anything userspace sets there is bound to be ignored. I'm not sure
> > if we can change this now, as it would be a compatibility issue.
> >
> > (There are actually real use cases for it, i.e. the venus driver
> > outputs VPx encoded frames prepended with the IVF header, but that's
> > not what the V4L2 VPx formats expect, so the data_offset is set by the
> > driver to point to the raw bitstream data.)
>
> Doesn't that essentially create a custom format though ? Who consumes
> the IVF header ?

Nobody consumes it. I believe data_offset was made exactly for this
purpose, to skip useless headers and avoid format proliferation.

>
> Another use case is handling of embedded data with CSI-2.
>
> CSI-2 sensors can send multiple types of data multiplexed in a single
> virtual channels. Common use cases include sending a few lines of
> metadata, or sending optical black lines, in addition to the main image.
> A CSI-2 source could also send the same image in multiple formats, but I
> haven't seen that happening in practice. The CSI-2 standard tags each
> line with a data type in order to differentiate them on the receiver
> side. On the receiver side, some receivers allow capturing different
> data types in different buffers, while other support a single buffer
> only, with or without data type filtering. It may thus be that a sensor
> sending 2 lines of embedded data before the image to a CSI-2 receiver
> that supports a single buffer will leave the user with two options,
> capturing the image only or capturing both in the same buffer (really
> simple receivers may only offer the last option). Reporting to the user
> how data is organized in the buffer is needed, and the data_offset field
> is used for this.
>
> This being said, I don't think it's a valid use case fo data_offset. As
> mentioned above a sensor could send more than one data type in addition
> to the main image (embedded data + optical black is one example), so a
> single data_offset field wouldn't allow differentiating embedded data
> from optical black lines. I think a more powerful frame descriptor API
> would be needed for this. The fact that the buffer layout doesn't change
> between frames also hints that this should be supported at the format
> level, not the buffer level.

FYI, this topic was also brought in some threads about the stateless
video decoders, where it could be useful to still pass the full
bitstream to the decoder, but tag it with locations of particular
parts in the buffer (NAL headers, PPS, SPS, slice header, slice data,
etc.)

Best regards,
Tomasz

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-03-25  9:42 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-20 14:42 [RFP] Which V4L2 ioctls could be replaced by better versions? Hans Verkuil
2018-09-20 18:14 ` Nicolas Dufresne
2018-10-03  8:24   ` Tomasz Figa
2018-10-16 13:22     ` Hans Verkuil
2018-10-21 16:28     ` Philipp Zabel
2018-10-22  3:37       ` Tomasz Figa
2018-10-22  9:49         ` Dave Stevenson
2018-10-23 13:02         ` Philipp Zabel
2018-10-27  9:38         ` Nicolas Dufresne
2018-11-08  7:45           ` Tomasz Figa
2018-11-09 21:06             ` Nicolas Dufresne
2018-11-11  3:43               ` Tomasz Figa
2018-11-12  9:29                 ` Philipp Zabel
2018-11-12 15:02                   ` Hans Verkuil
2018-10-21 16:17   ` Philipp Zabel
2018-10-17  8:57 ` Laurent Pinchart
2018-10-17  9:16   ` Hans Verkuil
2018-10-17 20:46     ` Laurent Pinchart
2018-10-20 20:28       ` Sakari Ailus
2018-10-26 11:41 ` Tomasz Figa
2018-10-26 13:42   ` Laurent Pinchart
2019-03-15  4:18     ` Tomasz Figa
2019-03-17 16:10       ` Laurent Pinchart
2019-03-17 17:10         ` Nicolas Dufresne
2019-03-17 17:14         ` Nicolas Dufresne
2019-03-25  9:41         ` Tomasz Figa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.