qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Two minor fixes on virtio-iommu
@ 2024-01-22  6:40 Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22  6:40 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
	jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan

PATCH1 fixes a potential issue with vfio devices when reboot to a
different OS which set bus number differently from previous OS.
I didn't reproduce the issue in reality, but it's still possible
in theory.

PATCH2 is a prerequisite of of PATCH3.

PATCH3 make virtio-iommu support PCI device aliases. If there are
more than one device in same IOMMU group, either due to topology,
isolation feature, etc. virtio-iommu can only make one which has
alias BDF works. This impacts both emulated and vfio devices.
I have reproduced the failure with an example config to have two
vfio devices under same pcie to pci bridge.
This patch also make a proper place in virtio-iommu to store
iova_ranges from vfio device when vfio device shares same IOMMU
group with other devices, either emulated or vfio devices.

Zhenzhong Duan (3):
  virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  hw/pci: Add two parameters to get_address_space
  virtio-iommu: Support PCI device aliases

 include/hw/pci/pci.h     | 11 ++++++++---
 hw/alpha/typhoon.c       |  3 ++-
 hw/arm/smmu-common.c     |  3 ++-
 hw/i386/amd_iommu.c      |  6 ++++--
 hw/i386/intel_iommu.c    |  6 ++++--
 hw/pci-host/astro.c      |  3 ++-
 hw/pci-host/designware.c |  3 ++-
 hw/pci-host/dino.c       |  3 ++-
 hw/pci-host/pnv_phb3.c   |  3 ++-
 hw/pci-host/pnv_phb4.c   |  3 ++-
 hw/pci-host/ppce500.c    |  3 ++-
 hw/pci-host/raven.c      |  3 ++-
 hw/pci-host/sabre.c      |  3 ++-
 hw/pci/pci.c             |  3 ++-
 hw/ppc/ppc440_pcix.c     |  3 ++-
 hw/ppc/spapr_pci.c       |  3 ++-
 hw/remote/iommu.c        |  3 ++-
 hw/s390x/s390-pci-bus.c  |  3 ++-
 hw/virtio/virtio-iommu.c | 21 ++++++++++++---------
 19 files changed, 58 insertions(+), 31 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  2024-01-22  6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
@ 2024-01-22  6:40 ` Zhenzhong Duan
  2024-01-23  9:41   ` Cédric Le Goater
  2024-01-22  6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
  2 siblings, 1 reply; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22  6:40 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
	jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan

IOMMUPciBus pointer cache is indexed by bus number, bus number
may not always be a fixed value, i.e., guest reboot to different
kernel which set bus number with different algorithm.

This could lead to endpoint binding to wrong iommu MR in
virtio_iommu_get_endpoint(), then vfio device setup wrong
mapping from other device.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/virtio-iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 8a4bd933c6..bfce3237f3 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque)
 
     trace_virtio_iommu_system_reset();
 
+    memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
+
     /*
      * config.bypass is sticky across device reset, but should be restored on
      * system reset
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] hw/pci: Add two parameters to get_address_space
  2024-01-22  6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
@ 2024-01-22  6:40 ` Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
  2 siblings, 0 replies; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22  6:40 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
	jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan,
	Richard Henderson, Peter Maydell, Paolo Bonzini, Eduardo Habkost,
	Marcel Apfelbaum, Helge Deller, Andrey Smirnov,
	Cédric Le Goater, Nicholas Piggin,
	Frédéric Barrat, Hervé Poussineau,
	Mark Cave-Ayland, BALATON Zoltan, Daniel Henrique Barboza,
	David Gibson, Harsh Prateek Bora, Elena Ufimtseva,
	Jagannathan Raman, Matthew Rosato, Eric Farman, Thomas Huth,
	Halil Pasic, Christian Borntraeger, David Hildenbrand,
	Ilya Leoshkevich, open list:PowerNV Non-Virt...,
	open list:S390 PCI

This adds PCI device's real bus and devfn to API get_address_space(),
for vIOMMU which also wants real BDF, i.e., virtio-iommu.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 include/hw/pci/pci.h     | 11 ++++++++---
 hw/alpha/typhoon.c       |  3 ++-
 hw/arm/smmu-common.c     |  3 ++-
 hw/i386/amd_iommu.c      |  6 ++++--
 hw/i386/intel_iommu.c    |  6 ++++--
 hw/pci-host/astro.c      |  3 ++-
 hw/pci-host/designware.c |  3 ++-
 hw/pci-host/dino.c       |  3 ++-
 hw/pci-host/pnv_phb3.c   |  3 ++-
 hw/pci-host/pnv_phb4.c   |  3 ++-
 hw/pci-host/ppce500.c    |  3 ++-
 hw/pci-host/raven.c      |  3 ++-
 hw/pci-host/sabre.c      |  3 ++-
 hw/pci/pci.c             |  3 ++-
 hw/ppc/ppc440_pcix.c     |  3 ++-
 hw/ppc/spapr_pci.c       |  3 ++-
 hw/remote/iommu.c        |  3 ++-
 hw/s390x/s390-pci-bus.c  |  3 ++-
 hw/virtio/virtio-iommu.c |  3 ++-
 19 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index fa6313aabc..2483d61a8c 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -378,13 +378,18 @@ typedef struct PCIIOMMUOps {
      *
      * Mandatory callback which returns a pointer to an #AddressSpace
      *
-     * @bus: the #PCIBus being accessed.
+     * @bus: the aliased #PCIBus being accessed.
      *
      * @opaque: the data passed to pci_setup_iommu().
      *
-     * @devfn: device and function number
+     * @devfn: aliased device and function number
+     *
+     * @real_bus: the #PCIBus being accessed.
+     *
+     * @real_devfn: device and function number
      */
-   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+   AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn,
+                                       PCIBus *real_bus, int real_devfn);
 } PCIIOMMUOps;
 
 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
index e8711ae16a..eda5a1c391 100644
--- a/hw/alpha/typhoon.c
+++ b/hw/alpha/typhoon.c
@@ -732,7 +732,8 @@ static IOMMUTLBEntry typhoon_translate_iommu(IOMMUMemoryRegion *iommu,
     return ret;
 }
 
-static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                           PCIBus *real_bus, int real_devfn)
 {
     TyphoonState *s = opaque;
     return &s->pchip.iommu_as;
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 9a8ac45431..c3a8f84c38 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -569,7 +569,8 @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
     return NULL;
 }
 
-static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn,
+                                      PCIBus *real_bus, int real_devfn)
 {
     SMMUState *s = opaque;
     SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus);
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 4203144da9..0cc63fd050 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1390,7 +1390,8 @@ static const MemoryRegionOps amdvi_ir_ops = {
     }
 };
 
-static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                          PCIBus *real_bus, int real_devfn)
 {
     char name[128];
     AMDVIState *s = opaque;
@@ -1578,7 +1579,8 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
     }
 
     /* Pseudo address space under root PCI bus. */
-    x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
+    x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID,
+                                            NULL, 0);
 
     /* set up MMIO */
     memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1a07faddb4..9d8c8e1d03 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4094,7 +4094,8 @@ static void vtd_reset(DeviceState *dev)
     vtd_address_space_refresh_all(s);
 }
 
-static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                        PCIBus *real_bus, int real_devfn)
 {
     IntelIOMMUState *s = opaque;
     VTDAddressSpace *vtd_as;
@@ -4233,7 +4234,8 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     vtd_init(s);
     pci_setup_iommu(bus, &vtd_iommu_ops, dev);
     /* Pseudo address space under root PCI bus. */
-    x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
+    x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC,
+                                          NULL, 0);
     qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
 }
 
diff --git a/hw/pci-host/astro.c b/hw/pci-host/astro.c
index 37d271118c..c6c0f3f95f 100644
--- a/hw/pci-host/astro.c
+++ b/hw/pci-host/astro.c
@@ -337,7 +337,8 @@ static IOMMUTLBEntry astro_translate_iommu(IOMMUMemoryRegion *iommu,
 }
 
 static AddressSpace *elroy_pcihost_set_iommu(PCIBus *bus, void *opaque,
-                                            int devfn)
+                                             int devfn,
+                                             PCIBus *real_bus, int real_devfn)
 {
     ElroyState *s = opaque;
     return &s->astro->iommu_as;
diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..fc652e6609 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -656,7 +656,8 @@ static const MemoryRegionOps designware_pci_mmio_ops = {
 };
 
 static AddressSpace *designware_pcie_host_set_iommu(PCIBus *bus, void *opaque,
-                                                    int devfn)
+                                                    int devfn, PCIBus *real_bus,
+                                                    int real_devfn)
 {
     DesignwarePCIEHost *s = DESIGNWARE_PCIE_HOST(opaque);
 
diff --git a/hw/pci-host/dino.c b/hw/pci-host/dino.c
index d992c4bb69..45f8784b2b 100644
--- a/hw/pci-host/dino.c
+++ b/hw/pci-host/dino.c
@@ -347,7 +347,8 @@ static const MemoryRegionOps dino_config_addr_ops = {
 };
 
 static AddressSpace *dino_pcihost_set_iommu(PCIBus *bus, void *opaque,
-                                            int devfn)
+                                            int devfn, PCIBus *real_bus,
+                                            int real_devfn)
 {
     DinoState *s = opaque;
 
diff --git a/hw/pci-host/pnv_phb3.c b/hw/pci-host/pnv_phb3.c
index 2a74dbe45f..a0c4235fae 100644
--- a/hw/pci-host/pnv_phb3.c
+++ b/hw/pci-host/pnv_phb3.c
@@ -935,7 +935,8 @@ static const MemoryRegionOps pnv_phb3_msi_ops = {
     .endianness = DEVICE_LITTLE_ENDIAN
 };
 
-static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                        PCIBus *real_bus, int real_devfn)
 {
     PnvPHB3 *phb = opaque;
     PnvPhb3DMASpace *ds;
diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 075499d36d..fee34f376e 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1450,7 +1450,8 @@ static PnvPhb4DMASpace *pnv_phb4_dma_find(PnvPHB4 *phb, PCIBus *bus, int devfn)
     return ds;
 }
 
-static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                        PCIBus *real_bus, int real_devfn)
 {
     PnvPHB4 *phb = opaque;
     PnvPhb4DMASpace *ds;
diff --git a/hw/pci-host/ppce500.c b/hw/pci-host/ppce500.c
index fa0d67b342..52c17a8a02 100644
--- a/hw/pci-host/ppce500.c
+++ b/hw/pci-host/ppce500.c
@@ -428,7 +428,8 @@ static void e500_pcihost_bridge_realize(PCIDevice *d, Error **errp)
 }
 
 static AddressSpace *e500_pcihost_set_iommu(PCIBus *bus, void *opaque,
-                                            int devfn)
+                                            int devfn, PCIBus *real_bus,
+                                            int real_devfn)
 {
     PPCE500PCIState *s = opaque;
 
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index c7a0a2878a..d1b7c1a847 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -216,7 +216,8 @@ static void raven_set_irq(void *opaque, int irq_num, int level)
 }
 
 static AddressSpace *raven_pcihost_set_iommu(PCIBus *bus, void *opaque,
-                                             int devfn)
+                                             int devfn, PCIBus *real_bus,
+                                             int real_devfn)
 {
     PREPPCIState *s = opaque;
 
diff --git a/hw/pci-host/sabre.c b/hw/pci-host/sabre.c
index d0851b48b0..c82c62c640 100644
--- a/hw/pci-host/sabre.c
+++ b/hw/pci-host/sabre.c
@@ -105,7 +105,8 @@ static inline void sabre_clear_request(SabreState *s, unsigned int irq_num)
     s->irq_request = NO_IRQ_REQUEST;
 }
 
-static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                         PCIBus *real_bus, int real_devfn)
 {
     IOMMUState *is = opaque;
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 76080af580..2f91b87559 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2719,7 +2719,8 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
     }
     if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
         return iommu_bus->iommu_ops->get_address_space(bus,
-                                 iommu_bus->iommu_opaque, devfn);
+                                 iommu_bus->iommu_opaque, devfn,
+                                 pci_get_bus(dev), dev->devfn);
     }
     return &address_space_memory;
 }
diff --git a/hw/ppc/ppc440_pcix.c b/hw/ppc/ppc440_pcix.c
index df4ee374d0..ea8fa28699 100644
--- a/hw/ppc/ppc440_pcix.c
+++ b/hw/ppc/ppc440_pcix.c
@@ -442,7 +442,8 @@ static void ppc440_pcix_set_irq(void *opaque, int irq_num, int level)
     qemu_set_irq(*pci_irq, level);
 }
 
-static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn)
+static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn,
+                                           PCIBus *real_bus, int real_devfn)
 {
     PPC440PCIXState *s = opaque;
 
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 25e0295d6f..4a2893c845 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -773,7 +773,8 @@ static const MemoryRegionOps spapr_msi_ops = {
 /*
  * PHB PCI device
  */
-static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                         PCIBus *real_bus, int real_devfn)
 {
     SpaprPhbState *phb = opaque;
 
diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
index 7c56aad0fc..7e71647e79 100644
--- a/hw/remote/iommu.c
+++ b/hw/remote/iommu.c
@@ -37,7 +37,8 @@
  */
 
 static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
-                                              void *opaque, int devfn)
+                                              void *opaque, int devfn,
+                                              PCIBus *real_bus, int real_devfn)
 {
     RemoteIommu *iommu = opaque;
     RemoteIommuElem *elem = NULL;
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 347580ebac..00de41df69 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -644,7 +644,8 @@ static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
     return iommu;
 }
 
-static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+                                        PCIBus *real_bus, int real_devfn)
 {
     S390pciState *s = opaque;
     S390PCIIOMMU *iommu = s390_pci_get_iommu(s, bus, devfn);
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bfce3237f3..d99c1f0d64 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -395,7 +395,8 @@ static void add_prop_resv_regions(IOMMUDevice *sdev)
 }
 
 static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
-                                              int devfn)
+                                              int devfn, PCIBus *real_bus,
+                                              int real_devfn)
 {
     VirtIOIOMMU *s = opaque;
     IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] virtio-iommu: Support PCI device aliases
  2024-01-22  6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
  2024-01-22  6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
@ 2024-01-22  6:40 ` Zhenzhong Duan
  2024-01-24 20:55   ` Eric Auger
  2 siblings, 1 reply; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22  6:40 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
	jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan

Currently virtio-iommu doesn't work well if there are multiple devices
in same iommu group. In below example config, guest virtio-iommu driver
can successfully probe first device but fail on others. Only one device
under the bridge can work normally.

-device virtio-iommu \
-device pcie-pci-bridge,id=root0 \
-device vfio-pci,host=81:11.0,bus=root0 \
-device vfio-pci,host=6f:01.0,bus=root0 \

The reason is virtio-iommu stores AS(address space) in hash table with
aliased BDF and corelates endpoint which is indexed by device's real
BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
table, we either get wrong AS or NULL.

Fix it by storing AS indexed by real BDF. This way also make iova_ranges
from vfio device stored in IOMMUDevice of real BDF successfully.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/virtio-iommu.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index d99c1f0d64..6880d92a44 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
                                               int real_devfn)
 {
     VirtIOIOMMU *s = opaque;
-    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
     static uint32_t mr_index;
     IOMMUDevice *sdev;
 
     if (!sbus) {
         sbus = g_malloc0(sizeof(IOMMUPciBus) +
                          sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
-        sbus->bus = bus;
-        g_hash_table_insert(s->as_by_busptr, bus, sbus);
+        sbus->bus = real_bus;
+        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
     }
 
-    sdev = sbus->pbdev[devfn];
+    sdev = sbus->pbdev[real_devfn];
     if (!sdev) {
         char *name = g_strdup_printf("%s-%d-%d",
                                      TYPE_VIRTIO_IOMMU_MEMORY_REGION,
-                                     mr_index++, devfn);
-        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
+                                     mr_index++, real_devfn);
+        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
 
         sdev->viommu = s;
-        sdev->bus = bus;
-        sdev->devfn = devfn;
+        sdev->bus = real_bus;
+        sdev->devfn = real_devfn;
 
         trace_virtio_iommu_init_iommu_mr(name);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  2024-01-22  6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
@ 2024-01-23  9:41   ` Cédric Le Goater
  2024-01-23 10:03     ` Duan, Zhenzhong
  0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2024-01-23  9:41 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, peterx, jasowang,
	mst, yi.l.liu, chao.p.peng

On 1/22/24 07:40, Zhenzhong Duan wrote:
> IOMMUPciBus pointer cache is indexed by bus number, bus number
> may not always be a fixed value, i.e., guest reboot to different
> kernel which set bus number with different algorithm.
> 
> This could lead to endpoint binding to wrong iommu MR in
> virtio_iommu_get_endpoint(), then vfio device setup wrong
> mapping from other device.
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   hw/virtio/virtio-iommu.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 8a4bd933c6..bfce3237f3 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque)
>   
>       trace_virtio_iommu_system_reset();
>   
> +    memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
> +
>       /*
>        * config.bypass is sticky across device reset, but should be restored on
>        * system reset

you could remove the memset in virtio_iommu_device_realize() then ?


Thanks,

C.




^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  2024-01-23  9:41   ` Cédric Le Goater
@ 2024-01-23 10:03     ` Duan, Zhenzhong
  2024-01-24 21:04       ` Eric Auger
  0 siblings, 1 reply; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-23 10:03 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-devel, qemu-arm
  Cc: eric.auger, jean-philippe, alex.williamson, peterx, jasowang,
	mst, Liu, Yi L, Peng, Chao P



>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>when system reset
>
>On 1/22/24 07:40, Zhenzhong Duan wrote:
>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>> may not always be a fixed value, i.e., guest reboot to different
>> kernel which set bus number with different algorithm.
>>
>> This could lead to endpoint binding to wrong iommu MR in
>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>> mapping from other device.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>   hw/virtio/virtio-iommu.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 8a4bd933c6..bfce3237f3 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>*opaque)
>>
>>       trace_virtio_iommu_system_reset();
>>
>> +    memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>iommu_pcibus_by_bus_num));
>> +
>>       /*
>>        * config.bypass is sticky across device reset, but should be restored on
>>        * system reset
>
>you could remove the memset in virtio_iommu_device_realize() then ?

Good suggestion, will do.

Thanks
Zhenzhong

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
  2024-01-22  6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
@ 2024-01-24 20:55   ` Eric Auger
  2024-01-25  5:58     ` Duan, Zhenzhong
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Auger @ 2024-01-24 20:55 UTC (permalink / raw)
  To: Zhenzhong Duan, qemu-devel, qemu-arm
  Cc: jean-philippe, alex.williamson, clg, peterx, jasowang, mst,
	yi.l.liu, chao.p.peng

Hi Zhenzhong,

On 1/22/24 07:40, Zhenzhong Duan wrote:
> Currently virtio-iommu doesn't work well if there are multiple devices
> in same iommu group. In below example config, guest virtio-iommu driver
> can successfully probe first device but fail on others. Only one device
> under the bridge can work normally.
>
> -device virtio-iommu \
> -device pcie-pci-bridge,id=root0 \
> -device vfio-pci,host=81:11.0,bus=root0 \
> -device vfio-pci,host=6f:01.0,bus=root0 \
>
> The reason is virtio-iommu stores AS(address space) in hash table with
> aliased BDF and corelates endpoint which is indexed by device's real
> BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
> table, we either get wrong AS or NULL.
>
> Fix it by storing AS indexed by real BDF. This way also make iova_ranges
> from vfio device stored in IOMMUDevice of real BDF successfully.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  hw/virtio/virtio-iommu.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index d99c1f0d64..6880d92a44 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>                                                int real_devfn)
>  {
>      VirtIOIOMMU *s = opaque;
> -    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
>      static uint32_t mr_index;
>      IOMMUDevice *sdev;
>  
>      if (!sbus) {
>          sbus = g_malloc0(sizeof(IOMMUPciBus) +
>                           sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
> -        sbus->bus = bus;
> -        g_hash_table_insert(s->as_by_busptr, bus, sbus);
> +        sbus->bus = real_bus;
> +        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
>      }
>  
> -    sdev = sbus->pbdev[devfn];
> +    sdev = sbus->pbdev[real_devfn];
>      if (!sdev) {
>          char *name = g_strdup_printf("%s-%d-%d",
>                                       TYPE_VIRTIO_IOMMU_MEMORY_REGION,
> -                                     mr_index++, devfn);
> -        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
> +                                     mr_index++, real_devfn);
> +        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>  
>          sdev->viommu = s;
> -        sdev->bus = bus;
> -        sdev->devfn = devfn;
> +        sdev->bus = real_bus;
> +        sdev->devfn = real_devfn;
but then this means the 2 devices would be abstracted by two different
IOMMU MRs whereas in practice they cannot be distinguished from an IOMMU
pov. Shouldn't the virtio-iommu driver use the same ep_id for both
devices within the same group?

Note there are some known issues about virtio-iommu and pcie-to-pci
bridges which were reported early last year and confirmed by Robin
Murphy. See:

[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() <https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/#r>

https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/

Thanks

Eric




>  
>          trace_virtio_iommu_init_iommu_mr(name);
>  



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  2024-01-23 10:03     ` Duan, Zhenzhong
@ 2024-01-24 21:04       ` Eric Auger
  2024-01-25  2:46         ` Duan, Zhenzhong
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Auger @ 2024-01-24 21:04 UTC (permalink / raw)
  To: Duan, Zhenzhong, Cédric Le Goater, qemu-devel, qemu-arm
  Cc: jean-philippe, alex.williamson, peterx, jasowang, mst, Liu, Yi L,
	Peng, Chao P

Hi Zhenzhong,

On 1/23/24 11:03, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>> when system reset
>>
>> On 1/22/24 07:40, Zhenzhong Duan wrote:
>>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>>> may not always be a fixed value, i.e., guest reboot to different
>>> kernel which set bus number with different algorithm.
this cannot harm to reset it but I don't know if this can be encountered.
>>>
>>> This could lead to endpoint binding to wrong iommu MR in
>>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>>> mapping from other device.
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>   hw/virtio/virtio-iommu.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>> index 8a4bd933c6..bfce3237f3 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>> *opaque)
>>>       trace_virtio_iommu_system_reset();
>>>
>>> +    memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>> iommu_pcibus_by_bus_num));
>>> +
>>>       /*
>>>        * config.bypass is sticky across device reset, but should be restored on
>>>        * system reset
>> you could remove the memset in virtio_iommu_device_realize() then ?
> Good suggestion, will do.
By the way what about the vtd implementation. s->vtd_address_spaces is
hash table but I don't see it reset either?
Also if is requested here we would need it on smmuv3 as well.

Thanks

Eric
>
> Thanks
> Zhenzhong



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
  2024-01-24 21:04       ` Eric Auger
@ 2024-01-25  2:46         ` Duan, Zhenzhong
  0 siblings, 0 replies; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-25  2:46 UTC (permalink / raw)
  To: eric.auger, Cédric Le Goater, qemu-devel, qemu-arm
  Cc: jean-philippe, alex.williamson, peterx, jasowang, mst, Liu, Yi L,
	Peng, Chao P

Hi Eric,

>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>when system reset
>
>Hi Zhenzhong,
>
>On 1/23/24 11:03, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer
>cache
>>> when system reset
>>>
>>> On 1/22/24 07:40, Zhenzhong Duan wrote:
>>>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>>>> may not always be a fixed value, i.e., guest reboot to different
>>>> kernel which set bus number with different algorithm.
>this cannot harm to reset it but I don't know if this can be encountered.
>>>>
>>>> This could lead to endpoint binding to wrong iommu MR in
>>>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>>>> mapping from other device.
>>>>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>>   hw/virtio/virtio-iommu.c | 2 ++
>>>>   1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>> index 8a4bd933c6..bfce3237f3 100644
>>>> --- a/hw/virtio/virtio-iommu.c
>>>> +++ b/hw/virtio/virtio-iommu.c
>>>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>>> *opaque)
>>>>       trace_virtio_iommu_system_reset();
>>>>
>>>> +    memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>>> iommu_pcibus_by_bus_num));
>>>> +
>>>>       /*
>>>>        * config.bypass is sticky across device reset, but should be restored
>on
>>>>        * system reset
>>> you could remove the memset in virtio_iommu_device_realize() then ?
>> Good suggestion, will do.
>By the way what about the vtd implementation. s->vtd_address_spaces is
>hash table but I don't see it reset either?

Good question!
s->vtd_address_spaces is indexed by a key containing (PCIBus *) which is reliable.

/*
 * PCI bus number (or SID) is not reliable since the device is usaully
 * initialized before guest can configure the PCI bridge
 * (SECONDARY_BUS_NUMBER).
 */
struct vtd_as_key {
    PCIBus *bus;
    uint8_t devfn;
    uint32_t pasid;
};

So I don’t think it should reset, same for s->as_by_busptr in virtio-iommu.

s->vtd_as_cache[bus_num] is unstable after guest reboot, but vtd_get_as_by_sid() has
logic to verify and update cache, so it doesn't have issue.

But if we hotplug/unplug bridge in a loop, there may be issue with s->vtd_address_spaces
Because old AS is never released. Anyway that's beyond scope of this patch.

>Also if is requested here we would need it on smmuv3 as well.

Good suggestion, I think so, I'll add a patch to smmuv3 for review.

Thanks
Zhenzhong

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] virtio-iommu: Support PCI device aliases
  2024-01-24 20:55   ` Eric Auger
@ 2024-01-25  5:58     ` Duan, Zhenzhong
  0 siblings, 0 replies; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-25  5:58 UTC (permalink / raw)
  To: eric.auger, qemu-devel, qemu-arm
  Cc: jean-philippe, alex.williamson, clg, peterx, jasowang, mst, Liu,
	Yi L, Peng, Chao P



>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
>
>Hi Zhenzhong,
>
>On 1/22/24 07:40, Zhenzhong Duan wrote:
>> Currently virtio-iommu doesn't work well if there are multiple devices
>> in same iommu group. In below example config, guest virtio-iommu driver
>> can successfully probe first device but fail on others. Only one device
>> under the bridge can work normally.
>>
>> -device virtio-iommu \
>> -device pcie-pci-bridge,id=root0 \
>> -device vfio-pci,host=81:11.0,bus=root0 \
>> -device vfio-pci,host=6f:01.0,bus=root0 \
>>
>> The reason is virtio-iommu stores AS(address space) in hash table with
>> aliased BDF and corelates endpoint which is indexed by device's real
>> BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
>> table, we either get wrong AS or NULL.
>>
>> Fix it by storing AS indexed by real BDF. This way also make iova_ranges
>> from vfio device stored in IOMMUDevice of real BDF successfully.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  hw/virtio/virtio-iommu.c | 16 ++++++++--------
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index d99c1f0d64..6880d92a44 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -399,27 +399,27 @@ static AddressSpace
>*virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>>                                                int real_devfn)
>>  {
>>      VirtIOIOMMU *s = opaque;
>> -    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr,
>real_bus);
>>      static uint32_t mr_index;
>>      IOMMUDevice *sdev;
>>
>>      if (!sbus) {
>>          sbus = g_malloc0(sizeof(IOMMUPciBus) +
>>                           sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
>> -        sbus->bus = bus;
>> -        g_hash_table_insert(s->as_by_busptr, bus, sbus);
>> +        sbus->bus = real_bus;
>> +        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
>>      }
>>
>> -    sdev = sbus->pbdev[devfn];
>> +    sdev = sbus->pbdev[real_devfn];
>>      if (!sdev) {
>>          char *name = g_strdup_printf("%s-%d-%d",
>>                                       TYPE_VIRTIO_IOMMU_MEMORY_REGION,
>> -                                     mr_index++, devfn);
>> -        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
>> +                                     mr_index++, real_devfn);
>> +        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>>
>>          sdev->viommu = s;
>> -        sdev->bus = bus;
>> -        sdev->devfn = devfn;
>> +        sdev->bus = real_bus;
>> +        sdev->devfn = real_devfn;
>but then this means the 2 devices would be abstracted by two different
>IOMMU MRs whereas in practice they cannot be distinguished from an
>IOMMU pov.

Yes, normally the two different IOMMU MRs should link to same guest domain,
so translation result will be same. But if a malicious guest try to break,
then it fails to block that.

>Shouldn't the virtio-iommu driver use the same ep_id for both
>devices within the same group?

IIUC, you mean for domain attach and not probe request?
I was thinking ep_id represented an existing device in guest, not the aliased one. 

>
>Note there are some known issues about virtio-iommu and pcie-to-pci
>bridges which were reported early last year and confirmed by Robin
>Murphy. See:
>
>[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr()
><https://lore.kernel.org/all/20230116124709.793084-1-
>eric.auger@redhat.com/#r>

Thanks for sharing, it’s valuable😊

BRs.
Zhenzhong

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-01-25  5:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-22  6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
2024-01-22  6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
2024-01-23  9:41   ` Cédric Le Goater
2024-01-23 10:03     ` Duan, Zhenzhong
2024-01-24 21:04       ` Eric Auger
2024-01-25  2:46         ` Duan, Zhenzhong
2024-01-22  6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
2024-01-22  6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
2024-01-24 20:55   ` Eric Auger
2024-01-25  5:58     ` Duan, Zhenzhong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).