linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jon Derrick <jonathan.derrick@intel.com>
To: <linux-pci@vger.kernel.org>, qemu-devel@nongnu.org
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	virtualization@lists.linux-foundation.org,
	Christoph Hellwig <hch@lst.de>,
	Andrzej Jakowski <andrzej.jakowski@linux.intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jon Derrick <jonathan.derrick@intel.com>
Subject: [PATCH v3 FOR QEMU v3] hw/vfio: Add VMD Passthrough Quirk
Date: Wed, 27 May 2020 23:02:38 -0400	[thread overview]
Message-ID: <20200528030240.16024-2-jonathan.derrick@intel.com> (raw)
In-Reply-To: <20200528030240.16024-1-jonathan.derrick@intel.com>

The VMD endpoint provides a real PCIe domain to the guest, including
bridges and endpoints. Because the VMD domain is enumerated by the guest
kernel, the guest kernel will assign Guest Physical Addresses to the
downstream endpoint BARs and bridge windows.

When the guest kernel performs MMIO to VMD sub-devices, MMU will
translate from the guest address space to the physical address space.
Because the bridges have been programmed with guest addresses, the
bridges will reject the transaction containing physical addresses.

VMD device 28C0 natively assists passthrough by providing the Host
Physical Address in shadow registers accessible to the guest for bridge
window assignment. The shadow registers are valid if bit 1 is set in VMD
VMLOCK config register 0x70.

In order to support existing VMDs, this quirk provides the shadow
registers in a vendor-specific PCI capability to the vfio-passthrough
device for all VMD device ids which don't natively assist with
passthrough. The Linux VMD driver is updated to check for this new
vendor-specific capability.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 hw/vfio/pci-quirks.c | 84 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 72 insertions(+), 12 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 3bd05fed12..6b8c1edfd5 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1567,18 +1567,6 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
     return 0;
 }
 
-int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
-{
-    int ret;
-
-    ret = vfio_add_nv_gpudirect_cap(vdev, errp);
-    if (ret) {
-        return ret;
-    }
-
-    return 0;
-}
-
 static void vfio_pci_nvlink2_get_tgt(Object *obj, Visitor *v,
                                      const char *name,
                                      void *opaque, Error **errp)
@@ -1709,3 +1697,75 @@ free_exit:
 
     return ret;
 }
+
+/*
+ * The VMD endpoint provides a real PCIe domain to the guest and the guest
+ * kernel performs enumeration of the VMD sub-device domain. Guest transactions
+ * to VMD sub-devices go through MMU translation from guest addresses to
+ * physical addresses. When MMIO goes to an endpoint after being translated to
+ * physical addresses, the bridge rejects the transaction because the window
+ * has been programmed with guest addresses.
+ *
+ * VMD can use the Host Physical Address in order to correctly program the
+ * bridge windows in its PCIe domain. VMD device 28C0 has HPA shadow registers
+ * located at offset 0x2000 in MEMBAR2 (BAR 4). This quirk provides the HPA
+ * shadow registers in a vendor-specific capability register for devices
+ * without native support. The position of 0xE8-0xFF is in the reserved range
+ * of the VMD device capability space following the Power Management
+ * Capability.
+ */
+#define VMD_SHADOW_CAP_VER 1
+#define VMD_SHADOW_CAP_LEN 24
+static int vfio_add_vmd_shadow_cap(VFIOPCIDevice *vdev, Error **errp)
+{
+    uint8_t membar_phys[16];
+    int ret, pos = 0xE8;
+
+    if (!(vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x201D) ||
+          vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x467F) ||
+          vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x4C3D) ||
+          vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, 0x9A0B))) {
+        return 0;
+    }
+
+    ret = pread(vdev->vbasedev.fd, membar_phys, 16,
+                vdev->config_offset + PCI_BASE_ADDRESS_2);
+    if (ret != 16) {
+        error_report("VMD %s cannot read MEMBARs (%d)",
+                     vdev->vbasedev.name, ret);
+        return -EFAULT;
+    }
+
+    ret = pci_add_capability(&vdev->pdev, PCI_CAP_ID_VNDR, pos,
+                             VMD_SHADOW_CAP_LEN, errp);
+    if (ret < 0) {
+        error_prepend(errp, "Failed to add VMD MEMBAR Shadow cap: ");
+        return ret;
+    }
+
+    memset(vdev->emulated_config_bits + pos, 0xFF, VMD_SHADOW_CAP_LEN);
+    pos += PCI_CAP_FLAGS;
+    pci_set_byte(vdev->pdev.config + pos++, VMD_SHADOW_CAP_LEN);
+    pci_set_byte(vdev->pdev.config + pos++, VMD_SHADOW_CAP_VER);
+    pci_set_long(vdev->pdev.config + pos, 0x53484457); /* SHDW */
+    memcpy(vdev->pdev.config + pos + 4, membar_phys, 16);
+
+    return 0;
+}
+
+int vfio_add_virt_caps(VFIOPCIDevice *vdev, Error **errp)
+{
+    int ret;
+
+    ret = vfio_add_nv_gpudirect_cap(vdev, errp);
+    if (ret) {
+        return ret;
+    }
+
+    ret = vfio_add_vmd_shadow_cap(vdev, errp);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
-- 
2.18.1


  reply	other threads:[~2020-05-28  3:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-28  3:02 [PATCH v3 0/2] VMD endpoint passthrough support Jon Derrick
2020-05-28  3:02 ` Jon Derrick [this message]
2020-05-28  3:02 ` [PATCH v3 1/2] PCI: vmd: Filter resource type bits from shadow register Jon Derrick
2020-05-29 10:33   ` Lorenzo Pieralisi
2020-05-29 15:53     ` Derrick, Jonathan
2020-05-29 16:18       ` Lorenzo Pieralisi
2020-06-11 21:16         ` Derrick, Jonathan
2020-06-12 13:54           ` Lorenzo Pieralisi
2020-06-12 15:11             ` Derrick, Jonathan
2020-05-28  3:02 ` [PATCH v3 2/2] PCI: vmd: Use Shadow MEMBAR registers for QEMU/KVM guests Jon Derrick
2020-07-06  9:16   ` Lorenzo Pieralisi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200528030240.16024-2-jonathan.derrick@intel.com \
    --to=jonathan.derrick@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=andrzej.jakowski@linux.intel.com \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).