* [PATCH 0/3] Two minor fixes on virtio-iommu
@ 2024-01-22 6:40 Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw)
To: qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan
PATCH1 fixes a potential issue with vfio devices when reboot to a
different OS which set bus number differently from previous OS.
I didn't reproduce the issue in reality, but it's still possible
in theory.
PATCH2 is a prerequisite of of PATCH3.
PATCH3 make virtio-iommu support PCI device aliases. If there are
more than one device in same IOMMU group, either due to topology,
isolation feature, etc. virtio-iommu can only make one which has
alias BDF works. This impacts both emulated and vfio devices.
I have reproduced the failure with an example config to have two
vfio devices under same pcie to pci bridge.
This patch also make a proper place in virtio-iommu to store
iova_ranges from vfio device when vfio device shares same IOMMU
group with other devices, either emulated or vfio devices.
Zhenzhong Duan (3):
virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
hw/pci: Add two parameters to get_address_space
virtio-iommu: Support PCI device aliases
include/hw/pci/pci.h | 11 ++++++++---
hw/alpha/typhoon.c | 3 ++-
hw/arm/smmu-common.c | 3 ++-
hw/i386/amd_iommu.c | 6 ++++--
hw/i386/intel_iommu.c | 6 ++++--
hw/pci-host/astro.c | 3 ++-
hw/pci-host/designware.c | 3 ++-
hw/pci-host/dino.c | 3 ++-
hw/pci-host/pnv_phb3.c | 3 ++-
hw/pci-host/pnv_phb4.c | 3 ++-
hw/pci-host/ppce500.c | 3 ++-
hw/pci-host/raven.c | 3 ++-
hw/pci-host/sabre.c | 3 ++-
hw/pci/pci.c | 3 ++-
hw/ppc/ppc440_pcix.c | 3 ++-
hw/ppc/spapr_pci.c | 3 ++-
hw/remote/iommu.c | 3 ++-
hw/s390x/s390-pci-bus.c | 3 ++-
hw/virtio/virtio-iommu.c | 21 ++++++++++++---------
19 files changed, 58 insertions(+), 31 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
@ 2024-01-22 6:40 ` Zhenzhong Duan
2024-01-23 9:41 ` Cédric Le Goater
2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
2 siblings, 1 reply; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw)
To: qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan
IOMMUPciBus pointer cache is indexed by bus number, bus number
may not always be a fixed value, i.e., guest reboot to different
kernel which set bus number with different algorithm.
This could lead to endpoint binding to wrong iommu MR in
virtio_iommu_get_endpoint(), then vfio device setup wrong
mapping from other device.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/virtio/virtio-iommu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 8a4bd933c6..bfce3237f3 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque)
trace_virtio_iommu_system_reset();
+ memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
+
/*
* config.bypass is sticky across device reset, but should be restored on
* system reset
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/3] hw/pci: Add two parameters to get_address_space
2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
@ 2024-01-22 6:40 ` Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
2 siblings, 0 replies; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw)
To: qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan,
Richard Henderson, Peter Maydell, Paolo Bonzini, Eduardo Habkost,
Marcel Apfelbaum, Helge Deller, Andrey Smirnov,
Cédric Le Goater, Nicholas Piggin,
Frédéric Barrat, Hervé Poussineau,
Mark Cave-Ayland, BALATON Zoltan, Daniel Henrique Barboza,
David Gibson, Harsh Prateek Bora, Elena Ufimtseva,
Jagannathan Raman, Matthew Rosato, Eric Farman, Thomas Huth,
Halil Pasic, Christian Borntraeger, David Hildenbrand,
Ilya Leoshkevich, open list:PowerNV Non-Virt...,
open list:S390 PCI
This adds PCI device's real bus and devfn to API get_address_space(),
for vIOMMU which also wants real BDF, i.e., virtio-iommu.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/pci/pci.h | 11 ++++++++---
hw/alpha/typhoon.c | 3 ++-
hw/arm/smmu-common.c | 3 ++-
hw/i386/amd_iommu.c | 6 ++++--
hw/i386/intel_iommu.c | 6 ++++--
hw/pci-host/astro.c | 3 ++-
hw/pci-host/designware.c | 3 ++-
hw/pci-host/dino.c | 3 ++-
hw/pci-host/pnv_phb3.c | 3 ++-
hw/pci-host/pnv_phb4.c | 3 ++-
hw/pci-host/ppce500.c | 3 ++-
hw/pci-host/raven.c | 3 ++-
hw/pci-host/sabre.c | 3 ++-
hw/pci/pci.c | 3 ++-
hw/ppc/ppc440_pcix.c | 3 ++-
hw/ppc/spapr_pci.c | 3 ++-
hw/remote/iommu.c | 3 ++-
hw/s390x/s390-pci-bus.c | 3 ++-
hw/virtio/virtio-iommu.c | 3 ++-
19 files changed, 48 insertions(+), 23 deletions(-)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index fa6313aabc..2483d61a8c 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -378,13 +378,18 @@ typedef struct PCIIOMMUOps {
*
* Mandatory callback which returns a pointer to an #AddressSpace
*
- * @bus: the #PCIBus being accessed.
+ * @bus: the aliased #PCIBus being accessed.
*
* @opaque: the data passed to pci_setup_iommu().
*
- * @devfn: device and function number
+ * @devfn: aliased device and function number
+ *
+ * @real_bus: the #PCIBus being accessed.
+ *
+ * @real_devfn: device and function number
*/
- AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn);
} PCIIOMMUOps;
AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
index e8711ae16a..eda5a1c391 100644
--- a/hw/alpha/typhoon.c
+++ b/hw/alpha/typhoon.c
@@ -732,7 +732,8 @@ static IOMMUTLBEntry typhoon_translate_iommu(IOMMUMemoryRegion *iommu,
return ret;
}
-static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
TyphoonState *s = opaque;
return &s->pchip.iommu_as;
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 9a8ac45431..c3a8f84c38 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -569,7 +569,8 @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num)
return NULL;
}
-static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
SMMUState *s = opaque;
SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus);
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 4203144da9..0cc63fd050 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1390,7 +1390,8 @@ static const MemoryRegionOps amdvi_ir_ops = {
}
};
-static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
char name[128];
AMDVIState *s = opaque;
@@ -1578,7 +1579,8 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
}
/* Pseudo address space under root PCI bus. */
- x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
+ x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID,
+ NULL, 0);
/* set up MMIO */
memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1a07faddb4..9d8c8e1d03 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4094,7 +4094,8 @@ static void vtd_reset(DeviceState *dev)
vtd_address_space_refresh_all(s);
}
-static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
IntelIOMMUState *s = opaque;
VTDAddressSpace *vtd_as;
@@ -4233,7 +4234,8 @@ static void vtd_realize(DeviceState *dev, Error **errp)
vtd_init(s);
pci_setup_iommu(bus, &vtd_iommu_ops, dev);
/* Pseudo address space under root PCI bus. */
- x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
+ x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC,
+ NULL, 0);
qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
}
diff --git a/hw/pci-host/astro.c b/hw/pci-host/astro.c
index 37d271118c..c6c0f3f95f 100644
--- a/hw/pci-host/astro.c
+++ b/hw/pci-host/astro.c
@@ -337,7 +337,8 @@ static IOMMUTLBEntry astro_translate_iommu(IOMMUMemoryRegion *iommu,
}
static AddressSpace *elroy_pcihost_set_iommu(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn,
+ PCIBus *real_bus, int real_devfn)
{
ElroyState *s = opaque;
return &s->astro->iommu_as;
diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
index dd9e389c07..fc652e6609 100644
--- a/hw/pci-host/designware.c
+++ b/hw/pci-host/designware.c
@@ -656,7 +656,8 @@ static const MemoryRegionOps designware_pci_mmio_ops = {
};
static AddressSpace *designware_pcie_host_set_iommu(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn, PCIBus *real_bus,
+ int real_devfn)
{
DesignwarePCIEHost *s = DESIGNWARE_PCIE_HOST(opaque);
diff --git a/hw/pci-host/dino.c b/hw/pci-host/dino.c
index d992c4bb69..45f8784b2b 100644
--- a/hw/pci-host/dino.c
+++ b/hw/pci-host/dino.c
@@ -347,7 +347,8 @@ static const MemoryRegionOps dino_config_addr_ops = {
};
static AddressSpace *dino_pcihost_set_iommu(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn, PCIBus *real_bus,
+ int real_devfn)
{
DinoState *s = opaque;
diff --git a/hw/pci-host/pnv_phb3.c b/hw/pci-host/pnv_phb3.c
index 2a74dbe45f..a0c4235fae 100644
--- a/hw/pci-host/pnv_phb3.c
+++ b/hw/pci-host/pnv_phb3.c
@@ -935,7 +935,8 @@ static const MemoryRegionOps pnv_phb3_msi_ops = {
.endianness = DEVICE_LITTLE_ENDIAN
};
-static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
PnvPHB3 *phb = opaque;
PnvPhb3DMASpace *ds;
diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 075499d36d..fee34f376e 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1450,7 +1450,8 @@ static PnvPhb4DMASpace *pnv_phb4_dma_find(PnvPHB4 *phb, PCIBus *bus, int devfn)
return ds;
}
-static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
PnvPHB4 *phb = opaque;
PnvPhb4DMASpace *ds;
diff --git a/hw/pci-host/ppce500.c b/hw/pci-host/ppce500.c
index fa0d67b342..52c17a8a02 100644
--- a/hw/pci-host/ppce500.c
+++ b/hw/pci-host/ppce500.c
@@ -428,7 +428,8 @@ static void e500_pcihost_bridge_realize(PCIDevice *d, Error **errp)
}
static AddressSpace *e500_pcihost_set_iommu(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn, PCIBus *real_bus,
+ int real_devfn)
{
PPCE500PCIState *s = opaque;
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index c7a0a2878a..d1b7c1a847 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -216,7 +216,8 @@ static void raven_set_irq(void *opaque, int irq_num, int level)
}
static AddressSpace *raven_pcihost_set_iommu(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn, PCIBus *real_bus,
+ int real_devfn)
{
PREPPCIState *s = opaque;
diff --git a/hw/pci-host/sabre.c b/hw/pci-host/sabre.c
index d0851b48b0..c82c62c640 100644
--- a/hw/pci-host/sabre.c
+++ b/hw/pci-host/sabre.c
@@ -105,7 +105,8 @@ static inline void sabre_clear_request(SabreState *s, unsigned int irq_num)
s->irq_request = NO_IRQ_REQUEST;
}
-static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
IOMMUState *is = opaque;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 76080af580..2f91b87559 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2719,7 +2719,8 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
}
if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
return iommu_bus->iommu_ops->get_address_space(bus,
- iommu_bus->iommu_opaque, devfn);
+ iommu_bus->iommu_opaque, devfn,
+ pci_get_bus(dev), dev->devfn);
}
return &address_space_memory;
}
diff --git a/hw/ppc/ppc440_pcix.c b/hw/ppc/ppc440_pcix.c
index df4ee374d0..ea8fa28699 100644
--- a/hw/ppc/ppc440_pcix.c
+++ b/hw/ppc/ppc440_pcix.c
@@ -442,7 +442,8 @@ static void ppc440_pcix_set_irq(void *opaque, int irq_num, int level)
qemu_set_irq(*pci_irq, level);
}
-static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn)
+static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
PPC440PCIXState *s = opaque;
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 25e0295d6f..4a2893c845 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -773,7 +773,8 @@ static const MemoryRegionOps spapr_msi_ops = {
/*
* PHB PCI device
*/
-static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
SpaprPhbState *phb = opaque;
diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
index 7c56aad0fc..7e71647e79 100644
--- a/hw/remote/iommu.c
+++ b/hw/remote/iommu.c
@@ -37,7 +37,8 @@
*/
static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
- void *opaque, int devfn)
+ void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
RemoteIommu *iommu = opaque;
RemoteIommuElem *elem = NULL;
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 347580ebac..00de41df69 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -644,7 +644,8 @@ static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
return iommu;
}
-static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn,
+ PCIBus *real_bus, int real_devfn)
{
S390pciState *s = opaque;
S390PCIIOMMU *iommu = s390_pci_get_iommu(s, bus, devfn);
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bfce3237f3..d99c1f0d64 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -395,7 +395,8 @@ static void add_prop_resv_regions(IOMMUDevice *sdev)
}
static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
- int devfn)
+ int devfn, PCIBus *real_bus,
+ int real_devfn)
{
VirtIOIOMMU *s = opaque;
IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] virtio-iommu: Support PCI device aliases
2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
@ 2024-01-22 6:40 ` Zhenzhong Duan
2024-01-24 20:55 ` Eric Auger
2 siblings, 1 reply; 10+ messages in thread
From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw)
To: qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx,
jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan
Currently virtio-iommu doesn't work well if there are multiple devices
in same iommu group. In below example config, guest virtio-iommu driver
can successfully probe first device but fail on others. Only one device
under the bridge can work normally.
-device virtio-iommu \
-device pcie-pci-bridge,id=root0 \
-device vfio-pci,host=81:11.0,bus=root0 \
-device vfio-pci,host=6f:01.0,bus=root0 \
The reason is virtio-iommu stores AS(address space) in hash table with
aliased BDF and corelates endpoint which is indexed by device's real
BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
table, we either get wrong AS or NULL.
Fix it by storing AS indexed by real BDF. This way also make iova_ranges
from vfio device stored in IOMMUDevice of real BDF successfully.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/virtio/virtio-iommu.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index d99c1f0d64..6880d92a44 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
int real_devfn)
{
VirtIOIOMMU *s = opaque;
- IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+ IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
static uint32_t mr_index;
IOMMUDevice *sdev;
if (!sbus) {
sbus = g_malloc0(sizeof(IOMMUPciBus) +
sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
- sbus->bus = bus;
- g_hash_table_insert(s->as_by_busptr, bus, sbus);
+ sbus->bus = real_bus;
+ g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
}
- sdev = sbus->pbdev[devfn];
+ sdev = sbus->pbdev[real_devfn];
if (!sdev) {
char *name = g_strdup_printf("%s-%d-%d",
TYPE_VIRTIO_IOMMU_MEMORY_REGION,
- mr_index++, devfn);
- sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
+ mr_index++, real_devfn);
+ sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
sdev->viommu = s;
- sdev->bus = bus;
- sdev->devfn = devfn;
+ sdev->bus = real_bus;
+ sdev->devfn = real_devfn;
trace_virtio_iommu_init_iommu_mr(name);
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
@ 2024-01-23 9:41 ` Cédric Le Goater
2024-01-23 10:03 ` Duan, Zhenzhong
0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2024-01-23 9:41 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, peterx, jasowang,
mst, yi.l.liu, chao.p.peng
On 1/22/24 07:40, Zhenzhong Duan wrote:
> IOMMUPciBus pointer cache is indexed by bus number, bus number
> may not always be a fixed value, i.e., guest reboot to different
> kernel which set bus number with different algorithm.
>
> This could lead to endpoint binding to wrong iommu MR in
> virtio_iommu_get_endpoint(), then vfio device setup wrong
> mapping from other device.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/virtio/virtio-iommu.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index 8a4bd933c6..bfce3237f3 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque)
>
> trace_virtio_iommu_system_reset();
>
> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num));
> +
> /*
> * config.bypass is sticky across device reset, but should be restored on
> * system reset
you could remove the memset in virtio_iommu_device_realize() then ?
Thanks,
C.
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
2024-01-23 9:41 ` Cédric Le Goater
@ 2024-01-23 10:03 ` Duan, Zhenzhong
2024-01-24 21:04 ` Eric Auger
0 siblings, 1 reply; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-23 10:03 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, qemu-arm
Cc: eric.auger, jean-philippe, alex.williamson, peterx, jasowang,
mst, Liu, Yi L, Peng, Chao P
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>when system reset
>
>On 1/22/24 07:40, Zhenzhong Duan wrote:
>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>> may not always be a fixed value, i.e., guest reboot to different
>> kernel which set bus number with different algorithm.
>>
>> This could lead to endpoint binding to wrong iommu MR in
>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>> mapping from other device.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/virtio/virtio-iommu.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index 8a4bd933c6..bfce3237f3 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>*opaque)
>>
>> trace_virtio_iommu_system_reset();
>>
>> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>iommu_pcibus_by_bus_num));
>> +
>> /*
>> * config.bypass is sticky across device reset, but should be restored on
>> * system reset
>
>you could remove the memset in virtio_iommu_device_realize() then ?
Good suggestion, will do.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
@ 2024-01-24 20:55 ` Eric Auger
2024-01-25 5:58 ` Duan, Zhenzhong
0 siblings, 1 reply; 10+ messages in thread
From: Eric Auger @ 2024-01-24 20:55 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel, qemu-arm
Cc: jean-philippe, alex.williamson, clg, peterx, jasowang, mst,
yi.l.liu, chao.p.peng
Hi Zhenzhong,
On 1/22/24 07:40, Zhenzhong Duan wrote:
> Currently virtio-iommu doesn't work well if there are multiple devices
> in same iommu group. In below example config, guest virtio-iommu driver
> can successfully probe first device but fail on others. Only one device
> under the bridge can work normally.
>
> -device virtio-iommu \
> -device pcie-pci-bridge,id=root0 \
> -device vfio-pci,host=81:11.0,bus=root0 \
> -device vfio-pci,host=6f:01.0,bus=root0 \
>
> The reason is virtio-iommu stores AS(address space) in hash table with
> aliased BDF and corelates endpoint which is indexed by device's real
> BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
> table, we either get wrong AS or NULL.
>
> Fix it by storing AS indexed by real BDF. This way also make iova_ranges
> from vfio device stored in IOMMUDevice of real BDF successfully.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/virtio/virtio-iommu.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index d99c1f0d64..6880d92a44 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
> int real_devfn)
> {
> VirtIOIOMMU *s = opaque;
> - IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
> + IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
> static uint32_t mr_index;
> IOMMUDevice *sdev;
>
> if (!sbus) {
> sbus = g_malloc0(sizeof(IOMMUPciBus) +
> sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
> - sbus->bus = bus;
> - g_hash_table_insert(s->as_by_busptr, bus, sbus);
> + sbus->bus = real_bus;
> + g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
> }
>
> - sdev = sbus->pbdev[devfn];
> + sdev = sbus->pbdev[real_devfn];
> if (!sdev) {
> char *name = g_strdup_printf("%s-%d-%d",
> TYPE_VIRTIO_IOMMU_MEMORY_REGION,
> - mr_index++, devfn);
> - sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
> + mr_index++, real_devfn);
> + sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>
> sdev->viommu = s;
> - sdev->bus = bus;
> - sdev->devfn = devfn;
> + sdev->bus = real_bus;
> + sdev->devfn = real_devfn;
but then this means the 2 devices would be abstracted by two different
IOMMU MRs whereas in practice they cannot be distinguished from an IOMMU
pov. Shouldn't the virtio-iommu driver use the same ep_id for both
devices within the same group?
Note there are some known issues about virtio-iommu and pcie-to-pci
bridges which were reported early last year and confirmed by Robin
Murphy. See:
[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() <https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/#r>
https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/
Thanks
Eric
>
> trace_virtio_iommu_init_iommu_mr(name);
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
2024-01-23 10:03 ` Duan, Zhenzhong
@ 2024-01-24 21:04 ` Eric Auger
2024-01-25 2:46 ` Duan, Zhenzhong
0 siblings, 1 reply; 10+ messages in thread
From: Eric Auger @ 2024-01-24 21:04 UTC (permalink / raw)
To: Duan, Zhenzhong, Cédric Le Goater, qemu-devel, qemu-arm
Cc: jean-philippe, alex.williamson, peterx, jasowang, mst, Liu, Yi L,
Peng, Chao P
Hi Zhenzhong,
On 1/23/24 11:03, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>> when system reset
>>
>> On 1/22/24 07:40, Zhenzhong Duan wrote:
>>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>>> may not always be a fixed value, i.e., guest reboot to different
>>> kernel which set bus number with different algorithm.
this cannot harm to reset it but I don't know if this can be encountered.
>>>
>>> This could lead to endpoint binding to wrong iommu MR in
>>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>>> mapping from other device.
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/virtio/virtio-iommu.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>> index 8a4bd933c6..bfce3237f3 100644
>>> --- a/hw/virtio/virtio-iommu.c
>>> +++ b/hw/virtio/virtio-iommu.c
>>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>> *opaque)
>>> trace_virtio_iommu_system_reset();
>>>
>>> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>> iommu_pcibus_by_bus_num));
>>> +
>>> /*
>>> * config.bypass is sticky across device reset, but should be restored on
>>> * system reset
>> you could remove the memset in virtio_iommu_device_realize() then ?
> Good suggestion, will do.
By the way what about the vtd implementation. s->vtd_address_spaces is
hash table but I don't see it reset either?
Also if is requested here we would need it on smmuv3 as well.
Thanks
Eric
>
> Thanks
> Zhenzhong
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset
2024-01-24 21:04 ` Eric Auger
@ 2024-01-25 2:46 ` Duan, Zhenzhong
0 siblings, 0 replies; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-25 2:46 UTC (permalink / raw)
To: eric.auger, Cédric Le Goater, qemu-devel, qemu-arm
Cc: jean-philippe, alex.williamson, peterx, jasowang, mst, Liu, Yi L,
Peng, Chao P
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache
>when system reset
>
>Hi Zhenzhong,
>
>On 1/23/24 11:03, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer
>cache
>>> when system reset
>>>
>>> On 1/22/24 07:40, Zhenzhong Duan wrote:
>>>> IOMMUPciBus pointer cache is indexed by bus number, bus number
>>>> may not always be a fixed value, i.e., guest reboot to different
>>>> kernel which set bus number with different algorithm.
>this cannot harm to reset it but I don't know if this can be encountered.
>>>>
>>>> This could lead to endpoint binding to wrong iommu MR in
>>>> virtio_iommu_get_endpoint(), then vfio device setup wrong
>>>> mapping from other device.
>>>>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> hw/virtio/virtio-iommu.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>>>> index 8a4bd933c6..bfce3237f3 100644
>>>> --- a/hw/virtio/virtio-iommu.c
>>>> +++ b/hw/virtio/virtio-iommu.c
>>>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void
>>> *opaque)
>>>> trace_virtio_iommu_system_reset();
>>>>
>>>> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s-
>>>> iommu_pcibus_by_bus_num));
>>>> +
>>>> /*
>>>> * config.bypass is sticky across device reset, but should be restored
>on
>>>> * system reset
>>> you could remove the memset in virtio_iommu_device_realize() then ?
>> Good suggestion, will do.
>By the way what about the vtd implementation. s->vtd_address_spaces is
>hash table but I don't see it reset either?
Good question!
s->vtd_address_spaces is indexed by a key containing (PCIBus *) which is reliable.
/*
* PCI bus number (or SID) is not reliable since the device is usaully
* initialized before guest can configure the PCI bridge
* (SECONDARY_BUS_NUMBER).
*/
struct vtd_as_key {
PCIBus *bus;
uint8_t devfn;
uint32_t pasid;
};
So I don’t think it should reset, same for s->as_by_busptr in virtio-iommu.
s->vtd_as_cache[bus_num] is unstable after guest reboot, but vtd_get_as_by_sid() has
logic to verify and update cache, so it doesn't have issue.
But if we hotplug/unplug bridge in a loop, there may be issue with s->vtd_address_spaces
Because old AS is never released. Anyway that's beyond scope of this patch.
>Also if is requested here we would need it on smmuv3 as well.
Good suggestion, I think so, I'll add a patch to smmuv3 for review.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 3/3] virtio-iommu: Support PCI device aliases
2024-01-24 20:55 ` Eric Auger
@ 2024-01-25 5:58 ` Duan, Zhenzhong
0 siblings, 0 replies; 10+ messages in thread
From: Duan, Zhenzhong @ 2024-01-25 5:58 UTC (permalink / raw)
To: eric.auger, qemu-devel, qemu-arm
Cc: jean-philippe, alex.williamson, clg, peterx, jasowang, mst, Liu,
Yi L, Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
>
>Hi Zhenzhong,
>
>On 1/22/24 07:40, Zhenzhong Duan wrote:
>> Currently virtio-iommu doesn't work well if there are multiple devices
>> in same iommu group. In below example config, guest virtio-iommu driver
>> can successfully probe first device but fail on others. Only one device
>> under the bridge can work normally.
>>
>> -device virtio-iommu \
>> -device pcie-pci-bridge,id=root0 \
>> -device vfio-pci,host=81:11.0,bus=root0 \
>> -device vfio-pci,host=6f:01.0,bus=root0 \
>>
>> The reason is virtio-iommu stores AS(address space) in hash table with
>> aliased BDF and corelates endpoint which is indexed by device's real
>> BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash
>> table, we either get wrong AS or NULL.
>>
>> Fix it by storing AS indexed by real BDF. This way also make iova_ranges
>> from vfio device stored in IOMMUDevice of real BDF successfully.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/virtio/virtio-iommu.c | 16 ++++++++--------
>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index d99c1f0d64..6880d92a44 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -399,27 +399,27 @@ static AddressSpace
>*virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>> int real_devfn)
>> {
>> VirtIOIOMMU *s = opaque;
>> - IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>> + IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr,
>real_bus);
>> static uint32_t mr_index;
>> IOMMUDevice *sdev;
>>
>> if (!sbus) {
>> sbus = g_malloc0(sizeof(IOMMUPciBus) +
>> sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
>> - sbus->bus = bus;
>> - g_hash_table_insert(s->as_by_busptr, bus, sbus);
>> + sbus->bus = real_bus;
>> + g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
>> }
>>
>> - sdev = sbus->pbdev[devfn];
>> + sdev = sbus->pbdev[real_devfn];
>> if (!sdev) {
>> char *name = g_strdup_printf("%s-%d-%d",
>> TYPE_VIRTIO_IOMMU_MEMORY_REGION,
>> - mr_index++, devfn);
>> - sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
>> + mr_index++, real_devfn);
>> + sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>>
>> sdev->viommu = s;
>> - sdev->bus = bus;
>> - sdev->devfn = devfn;
>> + sdev->bus = real_bus;
>> + sdev->devfn = real_devfn;
>but then this means the 2 devices would be abstracted by two different
>IOMMU MRs whereas in practice they cannot be distinguished from an
>IOMMU pov.
Yes, normally the two different IOMMU MRs should link to same guest domain,
so translation result will be same. But if a malicious guest try to break,
then it fails to block that.
>Shouldn't the virtio-iommu driver use the same ep_id for both
>devices within the same group?
IIUC, you mean for domain attach and not probe request?
I was thinking ep_id represented an existing device in guest, not the aliased one.
>
>Note there are some known issues about virtio-iommu and pcie-to-pci
>bridges which were reported early last year and confirmed by Robin
>Murphy. See:
>
>[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr()
><https://lore.kernel.org/all/20230116124709.793084-1-
>eric.auger@redhat.com/#r>
Thanks for sharing, it’s valuable😊
BRs.
Zhenzhong
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-01-25 5:59 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan
2024-01-23 9:41 ` Cédric Le Goater
2024-01-23 10:03 ` Duan, Zhenzhong
2024-01-24 21:04 ` Eric Auger
2024-01-25 2:46 ` Duan, Zhenzhong
2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan
2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan
2024-01-24 20:55 ` Eric Auger
2024-01-25 5:58 ` Duan, Zhenzhong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).