On Fri, Apr 9, 2021 at 12:48 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
  Hi,

> > IIRC the VIRTGPU_BLOB_FLAG_USE_SHAREABLE flag means that the host *can*
> > create a shared mapping (i.e. the host seeing guest-side changes without
> > explicit transfer doesn't cause problems for the guest).  It does not
> > mean the host *must* create a shared mapping (note that there is no
> > negotiation of whether the host supports shared mappings or not).
> >
>
> VIRTGPU_BLOB_FLAG_USE_SHAREABLE means guest userspace intends to share the
> blob resource with another virtgpu driver instance via drmPrimeHandleToFd.
> It's a rough analogue to VkExportMemoryAllocateInfoKHR or
> PIPE_BIND_USE_SHARED.

Oh.  My memory was failing me then.  We should *really* clarify the spec
for BLOB_MEM_GUEST. 

So shared mappings are allowed for all BLOB_MEM_GUEST resources, right?

The guest iovecs are always shared with the host, so they may be copied to/from directly depending on the operation.  In the case of RESOURCE_FLUSH + BLOB_MEM_GUEST, it could be a copy from the guest iovecs to the host framebuffer [host framebuffer != host shadow memory].
 

> > So the transfer calls are still needed, and the host can decide to
> > shortcut them in case it can create a shared mapping.  In case there is
> > no shared mapping (say due to missing udmabuf support) the host can
> > fall back to copying.
>
> Transfers are a bit under-defined for BLOB_MEM_GUEST.  Even without udmabuf
> on the host, there is no host side resource for guest-only blobs?  Before
> blob resources, the dumb flow was:
>
> 1) update guest side resource
> 2) TRANSFER_TO_HOST_2D to copy guest side contents to host side private
> resource [Pixman??]
> 3) RESOURCE_FLUSH to copy the host-side contents to the framebuffer and
> page-flip

Yes.

> At least for crosvm, this is possible:
>
> 1) update guest side resource
> 2) RESOURCE_FLUSH to copy the guest-side contents to the framebuffer and
> pageflip
>
> With implicit udmabuf, it may be possible to do this:
>
> 1) update guest side resource
> 2) RESOURCE_FLUSH to page-flip
>
> > So I think crosvm should be fixed to not consider transfer commands for
> > VIRTGPU_BLOB_MEM_GUEST resources an error.
>
> It's a simple change to make and we can definitely do it, if TRANSFER_2D is
> helpful for the QEMU case.  I haven't looked at the QEMU side patches.

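To make the difference between the three quoted flows concrete, here is a toy model that counts host-side copies per frame. It is only a sketch: the function names are made up for illustration, buffers are plain Python byte arrays, and nothing here is QEMU or crosvm code.

```python
def non_blob_flow(guest):
    """Pre-blob dumb flow: guest -> host private resource -> framebuffer."""
    host_private = bytearray(guest)        # TRANSFER_TO_HOST_2D copy
    framebuffer = bytearray(host_private)  # RESOURCE_FLUSH copy
    return framebuffer, 2


def guest_blob_copy_flow(guest):
    """BLOB_MEM_GUEST without udmabuf: flush copies from the guest iovecs."""
    framebuffer = bytearray(guest)         # RESOURCE_FLUSH copy
    return framebuffer, 1


def guest_blob_udmabuf_flow(guest):
    """BLOB_MEM_GUEST with implicit udmabuf: flush only page-flips."""
    framebuffer = memoryview(guest)        # aliases guest pages, no copy
    return framebuffer, 0
```

In the last flow the "framebuffer" is just a view of guest memory, so a later guest write is visible without any transfer command.
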
Well, we have two different cases:

  (1) No udmabuf available.  qemu will have a host-side shadow then and
      the workflow will be largely identical to the non-blob resource
      workflow.

I think this is the key difference.  With BLOB_MEM_GUEST, crosvm only has guest-side iovecs and no host-side shadow memory.  With BLOB_MEM_GUEST_HOST3D, host-side shadow memory will exist.

I guess it boils down to the Pixman dependency.  crosvm sits right on top of display APIs (X, wayland) rather than having intermediary layers.  Adding a new Pixman API takes time too.

There's a bunch of options:

1) Don't use BLOB_MEM_GUEST dumb buffers in 3D mode.
2) virglrenderer or crosvm modified to implicitly ignore TRANSFER_TO_HOST_2D for BLOB_MEM_GUEST when in 3D mode.
3) It's probably possible to create an implicit udmabuf for RESOURCE_CREATE_2D resources and ignore the transfer there too.  The benefit of this is TRANSFER_TO_HOST_2D makes a ton of sense for non-blob resources.  No kernel side change needed here, just QEMU.
4) modify QEMU display integration

I would choose (1) since it solves the log spam problem and it advances blob support in QEMU.  Though I leave the decision to QEMU devs.
 

  (2) With udmabuf support.  qemu can create udmabufs for the resources,
      mmap() the dmabuf to get a linear mapping, create a pixman buffer
      backed by that dmabuf (no copying needed then).  Depending on
      capabilities pass either the pixman image (gl=off) or the dmabuf
      handle (gl=on) to the UI code to actually show the guest display.

The guest doesn't need to know any of this, it'll just send transfer and
flush commands.  In case (1) qemu must process the transfer commands and
for case (2) qemu can simply ignore them.

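The two cases can be sketched as a small model in which the guest always sends the same commands and only the host behaves differently. This is an illustration under assumptions: the class `GuestBlob` and its fields are hypothetical, and a real host would use a udmabuf plus mmap() rather than a Python object alias.

```python
class GuestBlob:
    """Toy host-side view of a BLOB_MEM_GUEST resource."""

    def __init__(self, guest_pages, have_udmabuf):
        self.guest_pages = guest_pages
        self.have_udmabuf = have_udmabuf
        # Case (2): the host view aliases guest memory.
        # Case (1): the host keeps a separate shadow buffer.
        if have_udmabuf:
            self.host_view = guest_pages
        else:
            self.host_view = bytearray(len(guest_pages))

    def transfer_to_host_2d(self):
        """Return the number of bytes copied for this transfer."""
        if self.have_udmabuf:
            return 0                      # case (2): ignore the command
        self.host_view[:] = self.guest_pages
        return len(self.guest_pages)      # case (1): copy into the shadow

    def resource_flush(self):
        """Return what the display would show."""
        return bytes(self.host_view)
```

With the transfer sent, both cases display the same pixels; if the transfer is skipped, only the shadow case shows stale contents.
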
> For the PCI-passthrough + guest blob case, the end goal is to share it with
> the host compositor.  If there is no guarantee the guest memory can be
> converted to an OS-handle (to share with the host compositor), then I think
> the guest user space should fall back to another technique involving
> memcpy() to share the memory.

This is what happens today (using non-blob resources).

> So essentially, thinking for two new protocol additions:
>
> F_CREATE_GUEST_HANDLE (or F_HANDLE_FROM_GUEST) --> means an OS-specific
> udmabuf-like mechanism exists on the host.
>
> BLOB_FLAG_CREATE_GUEST_HANDLE (or BLOB_FLAG_HANDLE_FROM_GUEST)--> tells
> host userspace "you must create a udmabuf" [or OS-specific equivalent] upon
> success

Again:  Why do we actually need that?  Is there any benefit other than
the guest knowing it doesn't need to send transfer commands?
I see the whole udmabuf thing as a host-side performance optimization
and I think this should be fully transparent to the guest as the host
can easily just ignore the transfer commands. 

So the use case I'm most interested in (and Vivek/Tina?) is tiled/compressed udmabufs, so that they can eventually be shared with the host compositor via the DRM modifier API.

Transfers to linear udmabufs make sense.  Maybe transfers to tiled/compressed udmabufs shouldn't even be attempted.  

It's a complicated case with many ambiguities, especially with PCI passthrough involved.  Explicit tiled/compressed udmabufs are just an idea, will have to think more about it / have some proof of concept [with virgl and PCI passthrough], before making any concrete proposals.  Will keep your idea of just ignoring transfers on the host in mind.
 
Given we batch commands
the extra commands don't lead to extra context switches, so there
shouldn't be much overhead.

If we really want to make the guest aware of the host's udmabuf state I
think this should be designed the other way around:  Add some way for
the host to tell the guest transfer commands are not needed for a
specific BLOB_MEM_GUEST resource.

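A rough sketch of that reversed design, from the guest driver's point of view. The per-resource hint `transfers_not_needed` is purely hypothetical: no such field or feature bit exists in the virtio-gpu spec today, and the function name is made up for illustration.

```python
def commands_for_frame(transfers_not_needed):
    """Commands the guest queues for one frame of a BLOB_MEM_GUEST resource.

    transfers_not_needed: hypothetical per-resource hint from the host.
    """
    cmds = []
    if not transfers_not_needed:
        # Default: always send the transfer; the host may ignore it.
        cmds.append("TRANSFER_TO_HOST_2D")
    cmds.append("RESOURCE_FLUSH")
    return cmds
```

Without the hint the guest behaves exactly as it does now, so a host that never sets it keeps working unchanged.
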
take care,
  Gerd