From: Daniel P. Berrangé
Subject: Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.
Date: Fri, 4 May 2018 10:16:09 +0100
Message-ID: <20180504091609.GB29999@redhat.com>
References: <20180418123153.0f4f037d@w520.home>
 <20180423154003.12c5467a@w520.home>
 <20180424165918.5c2ef037@w520.home>
 <0a1d6487-0dfb-2ffc-4774-ebaf65c15892@nvidia.com>
 <20180425120057.0fabb70e@w520.home>
 <20180425195229.GK2496@work-vm>
 <20180426185522.GQ2631@work-vm>
 <20180503125800.76cc7582@w520.home>
In-Reply-To: <20180503125800.76cc7582@w520.home>
To: Alex Williamson
Cc: Neo Jia, kvm@vger.kernel.org, Erik Skultety, libvirt,
 "Dr. David Alan Gilbert", Tina Zhang, Kirti Wankhede, Gerd Hoffmann,
 Laine Stump, Jiri Denemark, intel-gvt-dev@lists.freedesktop.org
List-Id: kvm.vger.kernel.org

On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> Hi,
>
> The previous discussion hasn't produced results, so let's start over.
> Here's the situation:
>
> - We currently have kernel and QEMU support for the QEMU vfio-pci
>   display option.
>
> - The default for this option is 'auto', so the device will attempt to
>   generate a display if the underlying device supports it, currently
>   only GVTg and some future release of NVIDIA vGPU (plus Gerd's
>   sample mdpy and mbochs).
>
> - The display option is implemented via two different mechanisms, a
>   vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
>
> - Displays using dma-buf require OpenGL support; displays making
>   use of region support do not.
> - Enabling OpenGL support requires specific VM configurations, which
>   libvirt /may/ want to facilitate.
>
> - Probing display support for a given device is complicated by the
>   fact that GVTg and NVIDIA both impose requirements on the process
>   opening the device file descriptor through the vfio API:
>
>   - GVTg requires a KVM association or will fail to allow the device
>     to be opened.
>
>   - NVIDIA requires that their vgpu-manager process can locate a UUID
>     for the VM via the process commandline.
>
>   - These are both horrible impositions and prevent libvirt from
>     simply probing the device itself.

Agreed, these requirements are just horrific. Probing for features
should not require this level of environmental setup. I can just about
understand & accept how we ended up here, because this scenario is not
one that was strongly considered when the first implementations were
being done. I don't think we should accept it as a long term
requirement though.

> Erik Skultety, who initially raised the display question, has identified
> one possible solution, which is to simply make the display configuration
> the user's problem (apologies if I've misinterpreted Erik). I believe
> this would work something like:
>
> - libvirt identifies a version of QEMU that includes 'display' support
>   for vfio-pci devices and defaults to adding display=off for every
>   vfio-pci device [have we chosen the wrong default (auto) in QEMU?].
>
> - New XML support would allow a user to enable display support on the
>   vfio device.
>
> - Resolving any OpenGL dependencies of that change would be left to
>   the user.
>
> A nice aspect of this is that policy decisions are left to the user and
> clearly no interface changes are necessary, perhaps with the exception
> of deciding whether we've made the wrong default choice for vfio-pci
> devices in QEMU.
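For concreteness, the new XML support being proposed could plausibly
look something like the sketch below, mapping through to QEMU's
-device vfio-pci,...,display=on|off|auto option. The attribute name,
its placement, and the UUID value are all hypothetical at this point -
nothing here is committed:

```xml
<!-- Hypothetical sketch: enabling the vfio display on an mdev hostdev.
     The display attribute name/placement is illustrative only. -->
<hostdev mode='subsystem' type='mdev' model='vfio-pci' display='on'>
  <source>
    <!-- UUID of the mdev instance; placeholder value -->
    <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
  </source>
</hostdev>
```

Under Erik's proposal libvirt would default this to display='off' and
leave turning it on, plus any resulting OpenGL configuration, entirely
to whoever writes the XML.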
Unless I'm misunderstanding, this isn't really a solution to the
problem; rather it is us simply giving up and telling someone else to
try to fix the problem. The 'user' here is not a human - it is simply
the next level up in the mgmt stack, eg OpenStack or oVirt. If we
can't solve it acceptably in libvirt code, I don't have much hope that
OpenStack can solve it in their code, since they have an even stronger
need to automate everything.

> On the other hand, if we do want to give libvirt a mechanism to probe
> the display support for a device, we can make a simplified QEMU
> instance be the mechanism through which we do that. For example the
> script[1] can be provided with either a PCI device or sysfs path to an
> mdev device and run a minimal VM instance meeting the requirements of
> both GVTg and NVIDIA to report the display support and GL requirements
> for a device. There are clearly some unrefined and atrocious bits of
> this script, but it's only a proof of concept, the process management
> can be improved and we can decide whether we want to provide a qmp
> mechanism to introspect the device rather than grep'ing error
> messages. The goal is simply to show that we could choose to embrace
> QEMU and use it not as a VM, but simply a tool for poking at a device
> given the restrictions the mdev vendor drivers have already imposed.

Feels like a pretty heavyweight solution, one that just encourages the
drivers to continue down the undesirable path they're already on,
possibly making the situation even worse over time.

Regards,
Daniel
-- 
|: https://berrange.com      -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org       -o-         https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|