On Tue, Apr 20, 2021 at 2:39 PM Daniel Vetter <daniel@ffwll.ch> wrote:
On Tue, Apr 20, 2021 at 6:25 PM Marek Olšák <maraeo@gmail.com> wrote:
>
> Daniel, imagine hardware that can only do what Windows does: future fences signalled by userspace whenever userspace wants, and no kernel queues like we have today.
>
> The only reason why current AMD GPUs work is because they have a ring buffer per queue with pointers to userspace command buffers followed by fences. What will we do if that ring buffer is removed?

Well this is an entirely different problem than what you set out to
describe. This is essentially the problem where hw does not have any
support for privileged commands and a separate privileged command
buffer, and direct userspace submit is the only thing that is
available.

I think if this is your problem, then you get to implement some very
interesting compat shim. But that's an entirely different problem from
what you've described in your mail. This pretty much assumes at the hw
level the only thing that works is ATS/pasid, and vram is managed with
HMM exclusively. Once you have that pure driver stack you get to fake
it in the kernel for compat with everything that exists already. How
exactly that will look and how exactly you best construct your
dma_fences for compat will depend highly upon how much is still there
in this hw (e.g. wrt interrupt generation). A lot of the
infrastructure was also done as part of drm_syncobj. I mean we have
entirely fake kernel drivers like vgem/vkms that create dma_fence, so
a hw ringbuffer is really not required.

So ... is this your problem underneath it all, or was that more a wild
strawman for the discussion?

Yes, that's the problem.

Marek