On Wednesday 17 April 2019 at 14:39 +0900, Alexandre Courbot wrote:
> Hi Paul,
>
> On Tue, Apr 16, 2019 at 4:55 PM Paul Kocialkowski wrote:
> > Hi,
> >
> > On Tuesday 16 April 2019 at 16:22 +0900, Alexandre Courbot wrote:
> >
> > [...]
> >
> > > Thanks for this great discussion. Let me try to summarize the status
> > > of this thread + the IRC discussion and add my own thoughts:
> > >
> > > Proper support for multiple decoding units (e.g. H.264 slices) per
> > > frame should not be an afterthought; compliance with encoded formats
> > > depends on it, and the benefit of lower latency is a significant
> > > consideration for vendors.
> > >
> > > m2m, which we use for all stateless codecs, has a strong assumption
> > > that one OUTPUT buffer consumed results in one CAPTURE buffer being
> > > produced. This assumption can however be overruled: at least the venus
> > > driver does it to implement the stateful specification.
> > >
> > > So we need a way to specify frame boundaries when submitting encoded
> > > content to the driver. One request should contain a single OUTPUT
> > > buffer, containing a single decoding unit, but we need a way to
> > > specify whether the driver should directly produce a CAPTURE buffer
> > > from this request, or keep using the same CAPTURE buffer with
> > > subsequent requests.
> > >
> > > I can think of 2 ways this can be expressed:
> > > 1) We keep the current m2m behavior as the default (a CAPTURE buffer
> > > is produced), and add a flag to ask the driver to change that behavior
> > > and hold on to the CAPTURE buffer and reuse it with the next request(s);
> >
> > That would kind of break the stateless idea. I think we need requests
> > to be fully independent of each other and have some entity that
> > coordinates requests for this kind of thing.
>
> Side note: the idea that stateless decoders are entirely stateless is
> not completely accurate anyway. When we specify a resolution on the
> OUTPUT queue, we already store some state. What matters IIUC is that
> the *hardware* behaves in a stateless manner. I don't think we should
> refrain from storing some internal driver state if it makes sense.
>
> Back to the topic: the effect of this flag would just be that the
> first buffer in the CAPTURE queue is not removed, i.e. the next
> request will work on the same buffer. It doesn't really preserve any
> state - if the next request is the beginning of a different frame,
> then the previous work will be discarded and the driver will behave as
> it should, not considering any previous state.
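
For what it is worth, here is roughly how I picture the userspace side of
such a flag: one request per slice, and the same timestamp on every slice
of a frame so that they all end up in the same CAPTURE buffer. This is
only a sketch of my understanding; the flag name and value are made up
(nothing like it exists in the uAPI today) and queue_slice() is purely
illustrative.

#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <linux/media.h>
#include <linux/videodev2.h>

/* Made-up flag: "keep the current CAPTURE buffer for the next request". */
#define V4L2_BUF_FLAG_HOLD_CAPTURE	(1u << 30)

/*
 * Queue one slice, already copied into OUTPUT buffer 'index', as its own
 * request. All slices of a frame carry the same timestamp; only the last
 * slice omits the hold flag, which is what lets the driver mark the
 * CAPTURE buffer done.
 */
static int queue_slice(int video_fd, int media_fd, unsigned int index,
		       unsigned int slice_size, struct timeval frame_ts,
		       bool last_slice_of_frame)
{
	struct v4l2_buffer buf;
	struct v4l2_plane plane;
	int req_fd;

	if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &req_fd) < 0)
		return -1;

	/*
	 * The per-slice controls (slice parameters, etc.) would be set
	 * here with VIDIOC_S_EXT_CTRLS and V4L2_CTRL_WHICH_REQUEST_VAL.
	 */

	memset(&buf, 0, sizeof(buf));
	memset(&plane, 0, sizeof(plane));
	plane.bytesused = slice_size;
	buf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
	buf.memory = V4L2_MEMORY_MMAP;
	buf.index = index;
	buf.m.planes = &plane;
	buf.length = 1;
	buf.request_fd = req_fd;
	buf.flags = V4L2_BUF_FLAG_REQUEST_FD;
	buf.timestamp = frame_ts;
	if (!last_slice_of_frame)
		buf.flags |= V4L2_BUF_FLAG_HOLD_CAPTURE;

	if (ioctl(video_fd, VIDIOC_QBUF, &buf) < 0)
		return -1;

	return ioctl(req_fd, MEDIA_REQUEST_IOC_QUEUE);
}

The only per-frame knowledge userspace needs here is which slice is the
last one, since that is the request that releases the CAPTURE buffer.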

> > > 2) We specify that no CAPTURE buffer is produced by default, unless a
> > > flag asking for one is specified.
> > >
> > > The flag could be specified in one of two ways:
> > > a) As a new v4l2_buffer flag for the OUTPUT buffer;
> > > b) As a dedicated control, either format-specific or common to all codecs.
> >
> > I think we must aim for a generic solution that would be at least
> > common to all codecs, and if possible common to requests regardless of
> > whether they concern video decoding or not.
> >
> > I really like the idea of introducing a request batch/group/queue,
> > which groups requests together and allows marking them done when the
> > whole group is done being decoded. For that, we explicitly mark one of
> > the requests as the final one, so that we can continue adding requests
> > to the batch even when it's already being processed. When all the
> > requests are done being decoded, we can mark them done.
>
> I'd need to see this idea more developed (maybe with an example of the
> sequence of IOCTLs) to form an opinion about it. I would also need a few
> examples of where this could be used outside of stateless codecs. Then
> we will have to address what this means for requests: your argument
> against using a "release CAPTURE buffer" flag was that requests won't be
> fully independent from each other anymore, but I don't see that
> situation changing with batches. Then, does the end of a batch only mean
> that a CAPTURE buffer should be released, or are other actions required
> for non-codec use cases? There are lots and lots of questions like this
> one lurking.
>
> > With that, we also need some tweaking in the core to look for an
> > available capture buffer that matches the output buffer's timestamp
> > before trying to dequeue the next available capture buffer.
>
> I don't think that would be strictly necessary, unless we want to be
> able to decode slices from different frames before the first one is
> completed?
>
> > This way,
> > the first request of the batch will get any queued capture buffer, but
> > subsequent requests will find the matching capture buffer by timestamp.
> >
> > I think that's basically all we need to handle that, and the two aspects
> > (picking by timestamp and request groups) are rather independent, so the
> > latter could probably be used in situations other than video decoding.
> >
> > What do you think?
>
> At this point I'd like to avoid over-engineering things.
> Introducing a request batch mechanism would mean more months spent
> before we can set the stateless codec API in stone, and at some point
> we need to settle and release something that people can use. We don't
> even have a clear idea of what batches would look like and in which
> cases they would be used. The idea of an extra flag is simple and
> AFAICT would do the job nicely, so why not proceed with this for the
> time being?
>
> Cheers,
> Alex.

I also share the feeling that this might be a bit over-engineered for
what we want to solve. But I also don't fully understand Paul's
proposal.
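
The timestamp matching part, at least, seems fairly contained to me:
something along these lines in the v4l2-mem2mem core, called when picking
the destination buffer for a job. The helper name and its placement are my
own invention, just to illustrate the idea.

#include <media/v4l2-mem2mem.h>
#include <media/videobuf2-v4l2.h>

/*
 * Prefer a queued CAPTURE buffer whose timestamp matches the OUTPUT
 * (slice) buffer being decoded, so that all slices of a frame land in
 * the same destination buffer. Hypothetical helper, not part of the
 * current m2m core, and ignoring the rdy_queue locking for brevity.
 */
static struct vb2_v4l2_buffer *
m2m_find_dst_buf_by_timestamp(struct v4l2_m2m_ctx *m2m_ctx, u64 timestamp)
{
	struct v4l2_m2m_buffer *b;

	v4l2_m2m_for_each_dst_buf(m2m_ctx, b) {
		if (b->vb.vb2_buf.timestamp == timestamp)
			return &b->vb;
	}

	/*
	 * No match: this is the first slice of a new frame, fall back to
	 * the head of the ready queue as the core does today.
	 */
	return v4l2_m2m_next_dst_buf(m2m_ctx);
}

The fallback keeps today's behaviour for the first slice of each frame, so
a driver that never sees the hold flag would not notice any difference.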