From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1454009759.7183.7.camel@redhat.com>
From: Alex Williamson
Date: Thu, 28 Jan 2016 12:35:59 -0700
In-Reply-To: <1451994098-6972-1-git-send-email-kraxel@redhat.com>
References: <1451994098-6972-1-git-send-email-kraxel@redhat.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Subject: Re: [Qemu-devel] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks
To: Gerd Hoffmann, qemu-devel@nongnu.org
Cc: igvt-g@ml01.01.org, xen-devel@lists.xensource.com, Eduardo Habkost,
 Stefano Stabellini, Cao jin, vfio-users@redhat.com

On Tue, 2016-01-05 at 12:41 +0100, Gerd Hoffmann wrote:
>   Hi,
>
> We have some code in our tree to support pci passthrough of intel
> graphics devices (igd) on xen, which requires some chipset tweaks
> for (a) the host bridge and (b) the lpc/isa-bridge to meet the
> expectations of the guest driver.
>
> For kvm we need pretty much the same, also the requirements for vgpu
> (xengt/kvmgt) are very similar.  This patch wires up the existing
> support for kvm.  It also brings a bunch of bugfixes and cleanups.
>
> Unfortunately the oldish laptop I had planned to use for testing turned
> out to have no working iommu support for igd, so this patch series
> still has seen very light testing only.  Any testing feedback is very
> welcome.
>

Hi Gerd,

I believe I have working code for getting the IGD OpRegion from the
host into QEMU using a vfio device-specific region.  The open question
is how to expose it to the VM, and I'm looking for suggestions.
Effectively, in vfio-pci I have a MemoryRegion that can access the host
OpRegion.  We can map that directly into the guest, map it read-only
into the guest, or read out the contents and maintain our own virtual
version of it.  So let me throw out the options, some of which come
from you, and we can hammer out which way to go.

1) The OpRegion MemoryRegion is mapped into system_memory through
programming of the 0xFC config space register.

 a) vfio-pci could pick an address to do this as it is realized.
 b) SeaBIOS/OVMF could program this.

Discussion: 1.a) avoids any BIOS dependency, but vfio-pci would need to
pick an address and mark it as e820 reserved.  I'm not sure how to pick
that address.  We'd probably want to make the 0xFC config register
read-only.  1.b) has the issue you mentioned: in most cases the
OpRegion will be 8K, but the BIOS won't know how much address space
it's mapping into system memory when it writes the 0xFC register.  I
don't know how much of a problem this is, since the BIOS can easily
determine the size once mapped and re-map it somewhere with sufficient
space.
Practically, it seems like it's always going to be 8K.  This of course
requires modification to every BIOS.  It also leaves the 0xFC register
as a mapping control rather than a pointer to the OpRegion in RAM,
which doesn't really match real hardware.  The BIOS would need to pick
an address in this case.

2) Read-only mappings version of 1)

Discussion: Really nothing changes from the issues above; this just
prevents any possibility of the guest modifying anything in the host.
Xen apparently allows write access to the host page already.

3) Copy OpRegion contents into a buffer and do either 1) or 2) above.

Discussion: No benefit that I can see over the above, other than maybe
allowing write access that doesn't affect the host.

4) Copy contents into a guest RAM location, mark it reserved, and point
to it via the 0xFC config register, used as a scratch register.

 a) Done by QEMU (vfio-pci)
 b) Done by SeaBIOS/OVMF

Discussion: This is the most like real hardware.  4.a) has the usual
issue of how to pick an address, but the benefit of not requiring BIOS
changes (simply mark the RAM reserved via existing methods).  4.b)
would require passing a buffer containing the contents of the OpRegion
via fw_cfg and letting the BIOS do the setup.  The latter of course
requires modifying each BIOS for this support.

Of course none of these support hotplug, nor really can they, since
reserved memory regions are not dynamic in the architecture.

In all cases, some piece of software needs to know where it can place
the OpRegion in guest memory.  There are advantages and disadvantages
to doing that in QEMU versus the BIOS, but if QEMU does it we only need
to do it once.  Suggestions, comments, preferences?
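
To make option 4.a) concrete, here is a minimal sketch in Python.  It
is a toy model, not QEMU code: guest RAM and the device's config space
are plain byte arrays, and place_opregion() and opregion_size() are
hypothetical names.  It also assumes the OpRegion header starts with
the 16-byte signature "IntelGraphicsMem" followed by a little-endian
32-bit size in KiB, and that the 0xFC register holds a 32-bit guest
physical address; neither detail is stated in this message.

```python
import struct

# Assumptions (not taken from the message): the OpRegion starts with a
# 16-byte signature followed by a little-endian 32-bit size in KiB.
OPREGION_SIGNATURE = b"IntelGraphicsMem"
SIZE_FIELD_OFFSET = 0x10
ASLS_OFFSET = 0xFC  # the 0xFC config space register discussed above
PAGE_SIZE = 4096


def opregion_size(blob):
    """Return the OpRegion size in bytes as declared by its own header."""
    if blob[:16] != OPREGION_SIGNATURE:
        raise ValueError("missing OpRegion signature")
    (size_kib,) = struct.unpack_from("<I", blob, SIZE_FIELD_OFFSET)
    return size_kib * 1024


def place_opregion(guest_ram, config_space, host_blob, guest_addr):
    """Copy the host OpRegion into guest RAM and point 0xFC at the copy.

    Returns (address, size): the range the caller would still have to
    mark as reserved for the guest.  Hypothetical helper for
    illustration only.
    """
    size = opregion_size(host_blob)
    if len(host_blob) < size:
        raise ValueError("host OpRegion shorter than its declared size")
    if guest_addr % PAGE_SIZE:
        raise ValueError("guest address must be page aligned")
    if guest_addr + size > len(guest_ram):
        raise ValueError("OpRegion does not fit in guest RAM")
    # The guest gets a private copy, so its writes never reach the host.
    guest_ram[guest_addr:guest_addr + size] = host_blob[:size]
    # 0xFC acts as a plain pointer to the copy, as on real hardware.
    struct.pack_into("<I", config_space, ASLS_OFFSET, guest_addr)
    return guest_addr, size
```

The sketch takes the guest address as a parameter on purpose: choosing
that address is exactly the open question above, whether QEMU or the
BIOS ends up doing it.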

Another thing I notice in this series is the access to PCI config space
of both the host bridge and the LPC bridge.  This prevents unprivileged
use cases and is a barrier to libvirt support, since libvirt would need
to provide the process with access to the pci-sysfs files.  Should vfio
add additional device-specific regions to expose the config space of
these other devices?  I don't see that any write access is necessary,
so these would be read-only.  The comment in the kernel regarding why
an unprivileged user can only access standard config space indicates
that some devices lock up if unimplemented config space is accessed.
That is probably not an issue for recent-ish Intel host bridges and LPC
devices.  If OpRegion, host bridge config, and LPC config were all
provided through vfio, would there be any need for igd-passthrough
switches on the machine type?  It seems like the QEMU vfio-pci driver
could enable the necessary features and pre-fill the host and LPC
bridge config items on demand when parsing an IGD device.  Thanks,

Alex