From: Daniel P. Berrangé
Subject: Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.
Date: Fri, 4 May 2018 10:16:09 +0100
Message-ID: <20180504091609.GB29999@redhat.com>
References: <20180418123153.0f4f037d@w520.home>
 <20180423154003.12c5467a@w520.home>
 <20180424165918.5c2ef037@w520.home>
 <0a1d6487-0dfb-2ffc-4774-ebaf65c15892@nvidia.com>
 <20180425120057.0fabb70e@w520.home>
 <20180425195229.GK2496@work-vm>
 <20180426185522.GQ2631@work-vm>
 <20180503125800.76cc7582@w520.home>
In-Reply-To: <20180503125800.76cc7582@w520.home>
To: Alex Williamson
Cc: Neo Jia, kvm@vger.kernel.org, Erik Skultety, libvirt,
 "Dr. David Alan Gilbert", Tina Zhang, Kirti Wankhede, Gerd Hoffmann,
 Laine Stump, Jiri Denemark, intel-gvt-dev@lists.freedesktop.org
List-Id: kvm.vger.kernel.org

On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> Hi,
>
> The previous discussion hasn't produced results, so let's start over.
> Here's the situation:
>
> - We currently have kernel and QEMU support for the QEMU vfio-pci
>   display option.
>
> - The default for this option is 'auto', so the device will attempt to
>   generate a display if the underlying device supports it, currently
>   only GVTg and some future release of NVIDIA vGPU (plus Gerd's
>   sample mdpy and mbochs).
>
> - The display option is implemented via two different mechanisms, a
>   vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
>
> - Displays using dma-buf require OpenGL support; displays making
>   use of region support do not.
> - Enabling OpenGL support requires specific VM configurations, which
>   libvirt /may/ want to facilitate.
>
> - Probing display support for a given device is complicated by the
>   fact that GVTg and NVIDIA both impose requirements on the process
>   opening the device file descriptor through the vfio API:
>
>   - GVTg requires a KVM association or will fail to allow the device
>     to be opened.
>
>   - NVIDIA requires that their vgpu-manager process can locate a UUID
>     for the VM via the process commandline.
>
>   - These are both horrible impositions and prevent libvirt from
>     simply probing the device itself.

Agreed, these requirements are just horrific. Probing for features
should not require this level of environmental setup. I can just about
understand & accept how we ended up here, because this scenario is not
one that was strongly considered when the first implementations were
being done. I don't think we should accept it as a long term
requirement though.

> Erik Skultety, who initially raised the display question, has identified
> one possible solution, which is to simply make the display configuration
> the user's problem (apologies if I've misinterpreted Erik). I believe
> this would work something like:
>
> - libvirt identifies a version of QEMU that includes 'display' support
>   for vfio-pci devices and defaults to adding display=off for every
>   vfio-pci device [have we chosen the wrong default (auto) in QEMU?].
>
> - New XML support would allow a user to enable display support on the
>   vfio device.
>
> - Resolving any OpenGL dependencies of that change would be left to
>   the user.
>
> A nice aspect of this is that policy decisions are left to the user and
> clearly no interface changes are necessary, perhaps with the exception
> of deciding whether we've made the wrong default choice for vfio-pci
> devices in QEMU.
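For concreteness, the new XML support being proposed could plausibly
look something like the sketch below, mapping through to QEMU's
-device vfio-pci,...,display=on|off|auto option. The attribute name,
its placement, and the UUID value are all hypothetical at this point -
nothing here is committed:

```xml
<!-- Hypothetical sketch: enabling the vfio display on an mdev hostdev.
     The display attribute name/placement is illustrative only. -->
<hostdev mode='subsystem' type='mdev' model='vfio-pci' display='on'>
  <source>
    <!-- UUID of the mdev instance; placeholder value -->
    <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
  </source>
</hostdev>
```

Under Erik's proposal libvirt would default this to display='off' and
leave turning it on, plus any resulting OpenGL configuration, entirely
to whoever writes the XML.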
Unless I'm misunderstanding, this isn't really a solution to the
problem; rather it is us simply giving up and telling someone else to
try to fix the problem. The 'user' here is not a human - it is simply
the next level up in the mgmt stack, eg OpenStack or oVirt. If we
can't solve it acceptably in libvirt code, I don't have much hope that
OpenStack can solve it in their code, since they have an even stronger
need to automate everything.

> On the other hand, if we do want to give libvirt a mechanism to probe
> the display support for a device, we can make a simplified QEMU
> instance be the mechanism through which we do that. For example the
> script[1] can be provided with either a PCI device or sysfs path to an
> mdev device and run a minimal VM instance meeting the requirements of
> both GVTg and NVIDIA to report the display support and GL requirements
> for a device. There are clearly some unrefined and atrocious bits of
> this script, but it's only a proof of concept, the process management
> can be improved and we can decide whether we want to provide a qmp
> mechanism to introspect the device rather than grep'ing error
> messages. The goal is simply to show that we could choose to embrace
> QEMU and use it not as a VM, but simply a tool for poking at a device
> given the restrictions the mdev vendor drivers have already imposed.

Feels like a pretty heavyweight solution, one that just encourages the
drivers to continue down the undesirable path they're already on,
possibly making the situation even worse over time.

Regards,
Daniel
-- 
|: https://berrange.com      -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org       -o-         https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|