From: Konrad Rzeszutek Wilk
Date: Wed, 2 Jul 2014 12:23:37 -0400
Subject: Re: [Qemu-devel] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
Message-ID: <20140702162337.GB32380@laptop.dumpdata.com>
In-Reply-To: <53B41C27.4030706@redhat.com>
References: <20140630090511.GB15777@redhat.com> <53B1BAF9.6040800@citrix.com> <20140701053907.GA6108@redhat.com> <20140701170206.GB7640@redhat.com> <53B2F238.7000009@citrix.com> <53B3EDF5.4000802@redhat.com> <20140702140033.GG19068@laptop.dumpdata.com> <53B41C27.4030706@redhat.com>
To: Paolo Bonzini, daniel.vetter@ffwll.ch, jani.nikula@linux.intel.com, airlied@linux.ie, intel-gfx@lists.freedesktop.org
Cc: "peter.maydell@linaro.org", "xen-devel@lists.xensource.com", "anthony@codemonkey.ws", "Michael S. Tsirkin", "Allen M. Kay", "Kelly.Zytaruk@amd.com", "qemu-devel@nongnu.org", "yang.z.zhang@intel.com", Stefano Stabellini, Ross Philipson, Anthony Perard, "Chen, Tiejun"

On Wed, Jul 02, 2014 at 04:50:15PM +0200, Paolo Bonzini wrote:
> On 02/07/2014 16:00, Konrad Rzeszutek Wilk wrote:
> >With this long thread I lost a bit of context about the challenges
> >that exist. But let me try summarizing it here - which will hopefully
> >get some consensus.
> >
> >1). Fix IGD hardware to not use Southbridge magic addresses.
> >    We can moan and moan but I doubt it is going to change.
>
> There are two problems:
>
> - Northbridge (i.e. MCH i.e. PCI host bridge) configuration space addresses

Right. So in drivers/gpu/drm/i915/i915_dma.c:

	#define MCHBAR_I915 0x44
	#define MCHBAR_I965 0x48

	int reg = INTEL_INFO(dev)->gen >= 4 ? MCHBAR_I965 : MCHBAR_I915;

	if (INTEL_INFO(dev)->gen >= 4)
		pci_read_config_dword(dev_priv->bridge_dev, reg + 4, &temp_hi);
	pci_read_config_dword(dev_priv->bridge_dev, reg, &temp_lo);
	mchbar_addr = ((u64)temp_hi << 32) | temp_lo;

and

	#define DEVEN_REG 0x54

	int mchbar_reg = INTEL_INFO(dev)->gen >= 4 ? MCHBAR_I965 : MCHBAR_I915;

	if (IS_I915G(dev) || IS_I915GM(dev)) {
		pci_read_config_dword(dev_priv->bridge_dev, DEVEN_REG, &temp);
		enabled = !!(temp & DEVEN_MCHBAR_EN);
	} else {
		pci_read_config_dword(dev_priv->bridge_dev, mchbar_reg, &temp);
		enabled = temp & 1;
	}

> - Southbridge (i.e. PCH i.e. ISA bridge) vendor/device ID; some versions of
> the driver identify it by class, some versions identify it by slot (1f.0).

Right. So in drivers/gpu/drm/i915/i915_drv.c there is the giant
intel_detect_pch(), which sets the pch_type based on:

	if (pch->vendor == PCI_VENDOR_ID_INTEL) {
		unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
		dev_priv->pch_id = id;

		if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {

It checks for 0x3b00, 0x1c00, 0x1e00, 0x8c00 and 0x9c00.
The INTEL_PCH_DEVICE_ID_MASK is 0xff00.

> To solve the first, make a new machine type, PIIX4-based, and pass through
> the registers you need. The patch must document _exactly_ why the registers
> are safe to pass. If they are not reserved on PIIX4, the patch must
> document what the same offsets mean on PIIX4, and why it's sensible to
> assume that firmware for virtual machines will not read/write them. Bonus
> point for also documenting the same for Q35.

OK. They look to be related to setting up an MBAR, but I don't understand
why it is needed. Hopefully some of the i915 folks CC-ed here can answer.

> Regarding the second, fixing IGD hardware to not rely on chipset magic is a
> no-go, I agree. I disagree that it's a no-go to define a "backdoor" that
> lets a hypervisor pass the right information to the driver without hacking
> the chipset device model.
>
> The hardware folks would have to give us a place for a pair of registers
> (something like data/address), and a bit somewhere else that would be always
> 0 on hardware and always 1 if the hypervisor is implementing the pair of
> registers. This is similar to CPUID, which has the HYPERVISOR bit +
> hypervisor-defined leaves at 0x40000000.
>
> The data/address pair could be in a BAR, in configuration space, in the low
> VGA ports at 0x3c0-0x3df, wherever. The hypervisor bit can be in the same
> place or somewhere else (again, whatever is convenient for the hardware
> folks). We just need *one bit* that is known-zero on all hardware, and 8
> bytes in a reserved area. I don't think it's too hard to find this space,
> and I really, really would like Intel to follow up on a paravirtualized
> backdoor.
>
> That said, we have the problem of existing guests, so I agree something else
> is needed.
>
> > a) Two bridges - one 'passthrough' and the legacy ISA bridge
> >    that QEMU emulates. Both Linux and Windows are OK with
> >    two bridges (even though it is pretty weird).
>
> This is pretty much the only solution for existing Linux guests that look up
> the southbridge by class.

Right.
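
To make the two-bridge case concrete: a guest driver that looks up the
southbridge by class would have to walk every ISA-class bridge and keep the
one whose masked device ID is a known PCH type, instead of stopping at the
first hit. Below is a minimal userspace sketch of that idea, assuming the ID
values and the 0xff00 mask quoted earlier in this mail. struct fake_pci_dev,
pch_type_of() and find_pch() are made-up names for illustration only; they
are not i915 or kernel APIs, and this is not a claim about what the driver
does today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL       0x8086u
#define CLASS_BRIDGE_ISA   0x0601u /* PCI base class 06h, subclass 01h */
#define PCH_DEVICE_ID_MASK 0xff00u /* same value as the mask quoted above */

/* Hypothetical, simplified view of a PCI function (not struct pci_dev). */
struct fake_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint16_t class_code; /* (base class << 8) | subclass */
};

/* The PCH type values listed earlier in this mail. */
static const uint16_t known_pch_types[] = {
	0x3b00, 0x1c00, 0x1e00, 0x8c00, 0x9c00,
};

/* Return the masked PCH type, or 0 if this is not a known Intel PCH ID. */
static uint16_t pch_type_of(const struct fake_pci_dev *d)
{
	uint16_t id = d->device & PCH_DEVICE_ID_MASK;
	size_t i;

	if (d->vendor != VENDOR_INTEL)
		return 0;
	for (i = 0; i < sizeof(known_pch_types) / sizeof(known_pch_types[0]); i++)
		if (known_pch_types[i] == id)
			return id;
	return 0;
}

/*
 * Walk all devices and return the first ISA-class bridge with a known PCH
 * type, skipping ISA bridges (such as an emulated one) that do not match.
 * Returns NULL if nothing matches.
 */
static const struct fake_pci_dev *find_pch(const struct fake_pci_dev *devs,
					   size_t n)
{
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (devs[i].class_code == CLASS_BRIDGE_ISA &&
		    pch_type_of(&devs[i]))
			return &devs[i];
	return NULL;
}
```

With an emulated ISA bridge listed first (for example 8086:7000, the ID of
QEMU's PIIX3 ISA bridge) and a stub carrying a 0x3bxx device ID after it,
find_pch() skips the former and returns the stub.
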
>
> The proposed solution here is to define a new "pci stub" device in QEMU that
> lets you define a do-nothing device with your desired vendor ID, device ID,
> class and optionally subsystem IDs.
>
> The new machine type (the one that instantiates the special
> IGD-passthrough-enabled northbridge) can then instantiate this stub device
> at 1f.0 with the desired vendor ID, device ID and class ID.

Which is kind of neat because you can use a different type of device ID
(say, make it look like Ibex Peak) and pair it up with an IGD that is
found only on LynxPoint. Oh fun!

> If we cannot get the paravirtualized backdoor, it would also make sense to:
>
> - have drivers standardize on a single way to probe the southbridge
>
> - make this be neither by class (because the firmware wants to distinguish
> the actual ISA bridge from the stub, and it can do so by looking up the
> class), nor by slot (because this conflicts with the Q35 chipset model that
> has the southbridge at 1f.0).
>
> mst's proposal was to probe by subsystem id. I'm not sure I understood the
> details exactly, but I trust him. :) However, in case it wasn't clear I
> think a paravirtualized backdoor would still be better.

OK, like this:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 651e65e..03f2829 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -433,6 +433,8 @@ void intel_detect_pch(struct drm_device *dev)
 			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
 			dev_priv->pch_id = id;
 
+			if (pch->subsystem_vendor == PCI_VENDOR_ID_XEN)
+				id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
 			if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
 				dev_priv->pch_type = PCH_IBX;
 				DRM_DEBUG_KMS("Found Ibex Peak PCH\n");

> > b) One bridge - the one that QEMU emulates - and lets emulate
> >    more of the registers (by emulate - I mean for some get the
> >    data from the real hardware).
> >
> >   b1).
> >        We can't use the legacy because the registers are
> >        above 256 (is that correct? Did I miss something?)
>
> As I understand it, mst brought up Q35 because the northbridge configuration
> space layout might be more similar to what the driver expects than for
> PIIX4. But I don't think anyone really said whether this is true or false.
>
> I think Q35 is absolutely not a requirement for IGD passthrough, especially
> until this statement is either proved or disproved.

OK, so let's drop that.

> >4). Code does a bit of sysfs that could use some refactoring with
> >    the KVM code.
> >    Problem: More time needed to do the code restructuring.
>
> FWIW, I don't really care about code sharing with KVM. That's a separate
> problem and it's not necessary to bring it up and make waters even more
> muddy.

OK, let's drop that for now.

> Paolo