QEMU-Devel Archive on lore.kernel.org
* [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature
@ 2021-04-21  8:04 Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 1/8] hw/pci/pci_host: Allow bypass iommu for pci host Wang Xingang
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

These patches add support for configuring bypass_iommu on or off for
each PCI root bus, including the primary bus and pxb root buses. At
present, all root buses go through the IOMMU when an IOMMU is
configured. This is inflexible, because in many situations some devices
need the IOMMU while others need to bypass it.

This series therefore adds an option to enable or disable bypass_iommu
for the primary bus and for pxb root buses. The bypass_iommu property
defaults to false, meaning that devices go through the IOMMU unless
explicitly configured otherwise. When bypass_iommu is enabled for a
root bus, devices attached to it bypass the IOMMU; otherwise they go
through it.

This feature can be used in this manner:
arm: -machine virt,iommu=smmuv3,bypass_iommu=true
x86: -machine q35,bypass_iommu=true
pxb: -device pxb-pcie,bus_nr=0x10,id=pci.10,bus=pcie.0,bypass_iommu=true 

History:

v2 -> v3:
- rebase on top of v6.0.0-rc4
- Took Eric's comments into account and switched to a bypass_iommu
  property
- When building the IORT idmap, cover the whole RID space

v1 -> v2:
- rebase on top of v6.0.0-rc0
- Fix some issues
- Took Eric's comments into account: removed the PCI_BUS_IOMMU flag and
  replaced it with a property in PCIHostState.
- Add support for x86 iommu option

Xingang Wang (8):
  hw/pci/pci_host: Allow bypass iommu for pci host
  hw/pxb: Add a bypass iommu property
  hw/arm/virt: Add a machine option to bypass iommu for primary bus
  hw/i386: Add a pc machine option to bypass iommu for primary bus
  hw/pci: Add pci_bus_range to get bus number range
  hw/arm/virt-acpi-build: Add explicit IORT idmap for smmuv3 node
  hw/i386/acpi-build: Add explicit scope in DMAR table
  hw/i386/acpi-build: Add bypass_iommu check when building IVRS table

 hw/arm/virt-acpi-build.c            | 128 +++++++++++++++++++++++-----
 hw/arm/virt.c                       |  26 ++++++
 hw/i386/acpi-build.c                |  70 ++++++++++++++-
 hw/i386/pc.c                        |  18 ++++
 hw/pci-bridge/pci_expander_bridge.c |   3 +
 hw/pci-host/q35.c                   |   1 +
 hw/pci/pci.c                        |  33 ++++++-
 hw/pci/pci_host.c                   |   2 +
 include/hw/arm/virt.h               |   1 +
 include/hw/i386/pc.h                |   1 +
 include/hw/pci/pci.h                |   2 +
 include/hw/pci/pci_host.h           |   1 +
 12 files changed, 263 insertions(+), 23 deletions(-)

-- 
2.19.1




* [PATCH RFC v3 1/8] hw/pci/pci_host: Allow bypass iommu for pci host
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
@ 2021-04-21  8:04 ` Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 2/8] hw/pxb: Add a bypass iommu property Wang Xingang
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add a bypass_iommu property to the PCI host, indicating whether
devices attached to the PCI root bus bypass the IOMMU.
In pci_device_iommu_address_space(), add a bypass_iommu check so that
devices that bypass the IOMMU do not get an IOMMU address space.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/pci/pci.c              | 18 +++++++++++++++++-
 hw/pci/pci_host.c         |  2 ++
 include/hw/pci/pci.h      |  1 +
 include/hw/pci/pci_host.h |  1 +
 4 files changed, 21 insertions(+), 1 deletion(-)
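
For illustration only, the lookup described above can be sketched as a
small standalone model. The toy_* types and functions below are
hypothetical stand-ins and not QEMU's API; the sketch assumes each root
bus has exactly one host bridge and ignores requester ID aliasing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PCIHostState; only the new flag is modelled. */
struct toy_host {
    bool bypass_iommu;
};

/* Hypothetical stand-in for PCIBus. */
struct toy_bus {
    struct toy_bus *parent;   /* NULL for a root bus */
    struct toy_host *host;    /* set on root buses only */
    bool has_iommu_fn;        /* models a registered iommu_fn callback */
};

enum toy_as { TOY_AS_MEMORY, TOY_AS_IOMMU };

/* Walk up to the root bus and return its host bridge's bypass flag. */
bool toy_bus_bypass_iommu(const struct toy_bus *bus)
{
    assert(bus != NULL);
    while (bus->parent != NULL) {
        bus = bus->parent;
    }
    assert(bus->host != NULL);
    return bus->host->bypass_iommu;
}

/* The bypass flag takes priority over any registered IOMMU callback. */
enum toy_as toy_device_address_space(const struct toy_bus *bus)
{
    const struct toy_bus *iommu_bus = bus;

    /* Find the nearest bus, going upwards, that has an IOMMU callback. */
    while (iommu_bus != NULL && !iommu_bus->has_iommu_fn) {
        iommu_bus = iommu_bus->parent;
    }
    if (!toy_bus_bypass_iommu(bus) && iommu_bus != NULL) {
        return TOY_AS_IOMMU;
    }
    return TOY_AS_MEMORY;
}
```

In this model a device below a bypassing root bus gets the plain memory
address space even though an IOMMU callback is registered on that root.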

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 8f35e13a0c..301addfb35 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -417,6 +417,22 @@ const char *pci_root_bus_path(PCIDevice *dev)
     return rootbus->qbus.name;
 }
 
+bool pci_bus_bypass_iommu(PCIBus *bus)
+{
+    PCIBus *rootbus = bus;
+    PCIHostState *host_bridge;
+
+    if (!pci_bus_is_root(bus)) {
+        rootbus = pci_device_root_bus(bus->parent_dev);
+    }
+
+    host_bridge = PCI_HOST_BRIDGE(rootbus->qbus.parent);
+
+    assert(host_bridge->bus == rootbus);
+
+    return host_bridge->bypass_iommu;
+}
+
 static void pci_root_bus_init(PCIBus *bus, DeviceState *parent,
                               MemoryRegion *address_space_mem,
                               MemoryRegion *address_space_io,
@@ -2719,7 +2735,7 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
 
         iommu_bus = parent_bus;
     }
-    if (iommu_bus && iommu_bus->iommu_fn) {
+    if (!pci_bus_bypass_iommu(bus) && iommu_bus && iommu_bus->iommu_fn) {
         return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, devfn);
     }
     return &address_space_memory;
diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c
index 8ca5fadcbd..2768db53e6 100644
--- a/hw/pci/pci_host.c
+++ b/hw/pci/pci_host.c
@@ -222,6 +222,8 @@ const VMStateDescription vmstate_pcihost = {
 static Property pci_host_properties_common[] = {
     DEFINE_PROP_BOOL("x-config-reg-migration-enabled", PCIHostState,
                      mig_enabled, true),
+    DEFINE_PROP_BOOL("pci-host-bypass-iommu", PCIHostState,
+                     bypass_iommu, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 6be4e0c460..f4d51b672b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -480,6 +480,7 @@ void pci_for_each_bus(PCIBus *bus,
 
 PCIBus *pci_device_root_bus(const PCIDevice *d);
 const char *pci_root_bus_path(PCIDevice *dev);
+bool pci_bus_bypass_iommu(PCIBus *bus);
 PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn);
 int pci_qdev_find_device(const char *id, PCIDevice **pdev);
 void pci_bus_get_w64_range(PCIBus *bus, Range *range);
diff --git a/include/hw/pci/pci_host.h b/include/hw/pci/pci_host.h
index 52e038c019..c6f4eb4585 100644
--- a/include/hw/pci/pci_host.h
+++ b/include/hw/pci/pci_host.h
@@ -43,6 +43,7 @@ struct PCIHostState {
     uint32_t config_reg;
     bool mig_enabled;
     PCIBus *bus;
+    bool bypass_iommu;
 
     QLIST_ENTRY(PCIHostState) next;
 };
-- 
2.19.1




* [PATCH RFC v3 2/8] hw/pxb: Add a bypass iommu property
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 1/8] hw/pci/pci_host: Allow bypass iommu for pci host Wang Xingang
@ 2021-04-21  8:04 ` Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 3/8] hw/arm/virt: Add a machine option to bypass iommu for primary bus Wang Xingang
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add a bypass_iommu property to pci_expander_bridge.
The property can be used as follows:
qemu -device pxb-pcie,bus_nr=0x10,addr=0x1,bypass_iommu=true

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index aedded1064..7112dc3062 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -57,6 +57,7 @@ struct PXBDev {
 
     uint8_t bus_nr;
     uint16_t numa_node;
+    bool bypass_iommu;
 };
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -255,6 +256,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
     bus->map_irq = pxb_map_irq_fn;
 
     PCI_HOST_BRIDGE(ds)->bus = bus;
+    PCI_HOST_BRIDGE(ds)->bypass_iommu = pxb->bypass_iommu;
 
     pxb_register_bus(dev, bus, &local_err);
     if (local_err) {
@@ -301,6 +303,7 @@ static Property pxb_dev_properties[] = {
     /* Note: 0 is not a legal PXB bus number. */
     DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
     DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
+    DEFINE_PROP_BOOL("bypass_iommu", PXBDev, bypass_iommu, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.19.1




* [PATCH RFC v3 3/8] hw/arm/virt: Add a machine option to bypass iommu for primary bus
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 1/8] hw/pci/pci_host: Allow bypass iommu for pci host Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 2/8] hw/pxb: Add a bypass iommu property Wang Xingang
@ 2021-04-21  8:04 ` Wang Xingang
  2021-04-21  8:04 ` [PATCH RFC v3 4/8] hw/i386: Add a pc " Wang Xingang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add a bypass_iommu option to the arm virt machine.
The option can be used as follows:
qemu -machine virt,iommu=smmuv3,bypass_iommu=true

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/arm/virt.c         | 26 ++++++++++++++++++++++++++
 include/hw/arm/virt.h |  1 +
 2 files changed, 27 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9f01d9041b..0ce6167aab 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1366,6 +1366,7 @@ static void create_pcie(VirtMachineState *vms)
     }
 
     pci = PCI_HOST_BRIDGE(dev);
+    pci->bypass_iommu = vms->bypass_iommu;
     vms->bus = pci->bus;
     if (vms->bus) {
         for (i = 0; i < nb_nics; i++) {
@@ -2319,6 +2320,21 @@ static void virt_set_iommu(Object *obj, const char *value, Error **errp)
     }
 }
 
+static bool virt_get_bypass_iommu(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->bypass_iommu;
+}
+
+static void virt_set_bypass_iommu(Object *obj, bool value,
+                                              Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->bypass_iommu = value;
+}
+
 static CpuInstanceProperties
 virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
@@ -2656,6 +2672,13 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
                                           "Set the IOMMU type. "
                                           "Valid values are none and smmuv3");
 
+    object_class_property_add_bool(oc, "bypass_iommu",
+                                   virt_get_bypass_iommu,
+                                   virt_set_bypass_iommu);
+    object_class_property_set_description(oc, "bypass_iommu",
+                                          "Set on/off to enable/disable "
+                                          "bypass_iommu for primary bus");
+
     object_class_property_add_bool(oc, "ras", virt_get_ras,
                                    virt_set_ras);
     object_class_property_set_description(oc, "ras",
@@ -2723,6 +2746,9 @@ static void virt_instance_init(Object *obj)
     /* Default disallows iommu instantiation */
     vms->iommu = VIRT_IOMMU_NONE;
 
+    /* The primary bus is attached to iommu by default */
+    vms->bypass_iommu = false;
+
     /* Default disallows RAS instantiation */
     vms->ras = false;
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 921416f918..82bceadb82 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -147,6 +147,7 @@ struct VirtMachineState {
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
+    bool bypass_iommu;
     VirtMSIControllerType msi_controller;
     uint16_t virtio_iommu_bdf;
     struct arm_boot_info bootinfo;
-- 
2.19.1




* [PATCH RFC v3 4/8] hw/i386: Add a pc machine option to bypass iommu for primary bus
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
                   ` (2 preceding siblings ...)
  2021-04-21  8:04 ` [PATCH RFC v3 3/8] hw/arm/virt: Add a machine option to bypass iommu for primary bus Wang Xingang
@ 2021-04-21  8:04 ` Wang Xingang
  2021-04-21  8:05 ` [PATCH RFC v3 5/8] hw/pci: Add pci_bus_range to get bus number range Wang Xingang
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add a bypass_iommu pc machine option to bypass iommu translation
for the primary root bus.
The option can be used as follows:
qemu-system-x86_64 -machine q35,bypass_iommu=true

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/i386/pc.c         | 18 ++++++++++++++++++
 hw/pci-host/q35.c    |  1 +
 include/hw/i386/pc.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8a84b25a03..2266a0520f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1529,6 +1529,20 @@ static void pc_machine_set_hpet(Object *obj, bool value, Error **errp)
     pcms->hpet_enabled = value;
 }
 
+static bool pc_machine_get_bypass_iommu(Object *obj, Error **errp)
+{
+    PCMachineState *pcms = PC_MACHINE(obj);
+
+    return pcms->bypass_iommu;
+}
+
+static void pc_machine_set_bypass_iommu(Object *obj, bool value, Error **errp)
+{
+    PCMachineState *pcms = PC_MACHINE(obj);
+
+    pcms->bypass_iommu = value;
+}
+
 static void pc_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
                                             const char *name, void *opaque,
                                             Error **errp)
@@ -1628,6 +1642,7 @@ static void pc_machine_initfn(Object *obj)
 #ifdef CONFIG_HPET
     pcms->hpet_enabled = true;
 #endif
+    pcms->bypass_iommu = false;
 
     pc_system_flash_create(pcms);
     pcms->pcspk = isa_new(TYPE_PC_SPEAKER);
@@ -1752,6 +1767,9 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     object_class_property_add_bool(oc, "hpet",
         pc_machine_get_hpet, pc_machine_set_hpet);
 
+    object_class_property_add_bool(oc, "bypass_iommu",
+        pc_machine_get_bypass_iommu, pc_machine_set_bypass_iommu);
+
     object_class_property_add(oc, PC_MACHINE_MAX_FW_SIZE, "size",
         pc_machine_get_max_fw_size, pc_machine_set_max_fw_size,
         NULL, NULL);
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 2eb729dff5..ade05a5539 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -64,6 +64,7 @@ static void q35_host_realize(DeviceState *dev, Error **errp)
                                 s->mch.address_space_io,
                                 0, TYPE_PCIE_BUS);
     PC_MACHINE(qdev_get_machine())->bus = pci->bus;
+    pci->bypass_iommu = PC_MACHINE(qdev_get_machine())->bypass_iommu;
     qdev_realize(DEVICE(&s->mch), BUS(pci->bus), &error_fatal);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index dcf060b791..83ee8f2a01 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -45,6 +45,7 @@ typedef struct PCMachineState {
     bool sata_enabled;
     bool pit_enabled;
     bool hpet_enabled;
+    bool bypass_iommu;
     uint64_t max_fw_size;
 
     /* NUMA information: */
-- 
2.19.1




* [PATCH RFC v3 5/8] hw/pci: Add pci_bus_range to get bus number range
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
                   ` (3 preceding siblings ...)
  2021-04-21  8:04 ` [PATCH RFC v3 4/8] hw/i386: Add a pc " Wang Xingang
@ 2021-04-21  8:05 ` Wang Xingang
  2021-04-21  8:05 ` [PATCH RFC v3 6/8] hw/arm/virt-acpi-build: Add explicit IORT idmap for smmuv3 node Wang Xingang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:05 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add a pci_bus_range() helper that returns the bus number range of a
PCI bridge hierarchy.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/pci/pci.c         | 15 +++++++++++++++
 include/hw/pci/pci.h |  1 +
 2 files changed, 16 insertions(+)
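
For illustration only, the range computation can be sketched on toy
data. The toy_* structures below are hypothetical and only model the
fields the helper reads (the bus number and each bridge's secondary and
subordinate bus numbers); they are not QEMU's PCIBus or PCIDevice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 4   /* hypothetical; a real bus has many more devfn slots */

struct toy_dev {
    int is_bridge;
    uint8_t secondary_bus;     /* models config[PCI_SECONDARY_BUS] */
    uint8_t subordinate_bus;   /* models config[PCI_SUBORDINATE_BUS] */
};

struct toy_bus {
    int bus_num;
    const struct toy_dev *devices[TOY_SLOTS];
};

/*
 * Start from the bus's own number and widen the range with the
 * secondary/subordinate numbers of every bridge directly on the bus.
 */
void toy_bus_range(const struct toy_bus *bus, int *min_bus, int *max_bus)
{
    assert(bus != NULL && min_bus != NULL && max_bus != NULL);
    *min_bus = *max_bus = bus->bus_num;

    for (size_t i = 0; i < TOY_SLOTS; i++) {
        const struct toy_dev *dev = bus->devices[i];

        if (dev != NULL && dev->is_bridge) {
            if (dev->secondary_bus < *min_bus) {
                *min_bus = dev->secondary_bus;
            }
            if (dev->subordinate_bus > *max_bus) {
                *max_bus = dev->subordinate_bus;
            }
        }
    }
}
```

Only bridges directly on the bus are inspected; a bridge's subordinate
bus number is assumed to already cover everything below it.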

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 301addfb35..2ac3b8d76c 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -538,6 +538,21 @@ int pci_bus_num(PCIBus *s)
     return PCI_BUS_GET_CLASS(s)->bus_num(s);
 }
 
+void pci_bus_range(PCIBus *bus, int *min_bus, int *max_bus)
+{
+    int i;
+    *min_bus = *max_bus = pci_bus_num(bus);
+
+    for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) {
+        PCIDevice *dev = bus->devices[i];
+
+        if (dev && PCI_DEVICE_GET_CLASS(dev)->is_bridge) {
+            *min_bus = MIN(*min_bus, dev->config[PCI_SECONDARY_BUS]);
+            *max_bus = MAX(*max_bus, dev->config[PCI_SUBORDINATE_BUS]);
+        }
+    }
+}
+
 int pci_bus_numa_node(PCIBus *bus)
 {
     return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index f4d51b672b..d0f4266e37 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -450,6 +450,7 @@ static inline PCIBus *pci_get_bus(const PCIDevice *dev)
     return PCI_BUS(qdev_get_parent_bus(DEVICE(dev)));
 }
 int pci_bus_num(PCIBus *s);
+void pci_bus_range(PCIBus *bus, int *min_bus, int *max_bus);
 static inline int pci_dev_bus_num(const PCIDevice *dev)
 {
     return pci_bus_num(pci_get_bus(dev));
-- 
2.19.1




* [PATCH RFC v3 6/8] hw/arm/virt-acpi-build: Add explicit IORT idmap for smmuv3 node
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
                   ` (4 preceding siblings ...)
  2021-04-21  8:05 ` [PATCH RFC v3 5/8] hw/pci: Add pci_bus_range to get bus number range Wang Xingang
@ 2021-04-21  8:05 ` Wang Xingang
  2021-04-21  8:05 ` [PATCH RFC v3 7/8] hw/i386/acpi-build: Add explicit scope in DMAR table Wang Xingang
  2021-04-21  8:05 ` [PATCH RFC v3 8/8] hw/i386/acpi-build: Add bypass_iommu check when building IVRS table Wang Xingang
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:05 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

Add explicit IORT ID mappings based on the bus number range of each
PCI root bus, and add SMMU ID mappings only for root buses that do not
bypass the IOMMU.

The whole RID space is split into SMMU ID mappings and ID mappings that
go directly to the ITS node. Together they should cover every RID,
whether it goes through or bypasses the SMMUv3 node.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/arm/virt-acpi-build.c | 128 +++++++++++++++++++++++++++++++++------
 1 file changed, 109 insertions(+), 19 deletions(-)
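
For illustration only, the split of the RID space can be sketched as a
gap computation over sorted ranges. Everything named toy_* is
hypothetical. The sketch assumes the SMMU ranges are already sorted by
input_base and do not overlap, works in host byte order, and treats the
RID space as the half-open interval [0, 0x10000) with plain counts; it
does not model the exact encoding of the IORT id_count field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One past the last 16-bit RID; an assumption of this sketch. */
#define TOY_RID_END 0x10000u

struct toy_idmap {
    uint32_t input_base;
    uint32_t id_count;
};

/*
 * Write into 'out' the ranges not covered by the sorted, non-overlapping
 * 'smmu' ranges. These are the RIDs that would map directly to the ITS.
 * Returns the number of ranges written.
 */
size_t toy_its_gaps(const struct toy_idmap *smmu, size_t n,
                    struct toy_idmap *out, size_t out_cap)
{
    uint32_t base = 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (base < smmu[i].input_base) {
            assert(count < out_cap);
            out[count].input_base = base;
            out[count].id_count = smmu[i].input_base - base;
            count++;
        }
        /* Continue right after the current SMMU range. */
        base = smmu[i].input_base + smmu[i].id_count;
    }

    /* Tail gap after the last SMMU range, if any RIDs remain. */
    if (base < TOY_RID_END) {
        assert(count < out_cap);
        out[count].input_base = base;
        out[count].id_count = TOY_RID_END - base;
        count++;
    }
    return count;
}
```

With SMMU ranges for buses 0x00-0x0f and bus 0x20, the gaps are
0x1000-0x1fff and 0x2100-0xffff; with no SMMU ranges at all, a single
gap covers the whole space.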

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 60fe2e65a7..661b84edec 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -44,6 +44,7 @@
 #include "hw/acpi/tpm.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
 #include "hw/pci-host/gpex.h"
 #include "hw/arm/virt.h"
 #include "hw/mem/nvdimm.h"
@@ -237,6 +238,41 @@ static void acpi_dsdt_add_tpm(Aml *scope, VirtMachineState *vms)
     aml_append(scope, dev);
 }
 
+/* Build the iort ID mapping to SMMUv3 for a given PCI host bridge */
+static int
+iort_host_bridges(Object *obj, void *opaque)
+{
+    GArray *idmap_blob = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PCI_HOST_BRIDGE)) {
+        PCIBus *bus = PCI_HOST_BRIDGE(obj)->bus;
+
+        if (bus && !pci_bus_bypass_iommu(bus)) {
+            int min_bus, max_bus;
+            pci_bus_range(bus, &min_bus, &max_bus);
+
+            AcpiIortIdMapping idmap = {
+                .input_base = cpu_to_le32(min_bus << 8),
+                .id_count = cpu_to_le32((max_bus - min_bus + 1) << 8),
+                .output_base = cpu_to_le32(min_bus << 8),
+                .output_reference = cpu_to_le32(0),
+                .flags = cpu_to_le32(0),
+            };
+            g_array_append_val(idmap_blob, idmap);
+        }
+    }
+
+    return 0;
+}
+
+static int smmu_idmap_sort_func(gconstpointer a, gconstpointer b)
+{
+    AcpiIortIdMapping *idmap_a = (AcpiIortIdMapping *)a;
+    AcpiIortIdMapping *idmap_b = (AcpiIortIdMapping *)b;
+
+    return idmap_a->input_base - idmap_b->input_base;
+}
+
 static void
 build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
@@ -247,6 +283,45 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     AcpiIortSmmu3 *smmu;
     size_t node_size, iort_node_offset, iort_length, smmu_offset = 0;
     AcpiIortRC *rc;
+    uint32_t base, i, rc_map_count;
+    GArray *smmu_idmap_blob =
+        g_array_new(false, true, sizeof(AcpiIortIdMapping));
+    GArray *its_idmap_blob =
+        g_array_new(false, true, sizeof(AcpiIortIdMapping));
+
+    object_child_foreach_recursive(object_get_root(),
+                                   iort_host_bridges, smmu_idmap_blob);
+
+    g_array_sort(smmu_idmap_blob, smmu_idmap_sort_func);
+
+    /* Build the iort ID mapping to ITS directly */
+    i = 0, base = 0;
+    while (base < 0xFFFF && i <= smmu_idmap_blob->len) {
+        AcpiIortIdMapping new_idmap = {
+            .input_base = cpu_to_le32(base),
+            .id_count = cpu_to_le32(0),
+            .output_base = cpu_to_le32(base),
+            .output_reference = cpu_to_le32(0),
+            .flags = cpu_to_le32(0),
+        };
+
+        if (i == smmu_idmap_blob->len) {
+            if (base < 0xFFFF) {
+                new_idmap.id_count = cpu_to_le32(0xFFFF - base);
+                g_array_append_val(its_idmap_blob, new_idmap);
+            }
+            break;
+        }
+
+        idmap = &g_array_index(smmu_idmap_blob, AcpiIortIdMapping, i);
+        if (base < idmap->input_base) {
+            new_idmap.id_count = cpu_to_le32(idmap->input_base - base);
+            g_array_append_val(its_idmap_blob, new_idmap);
+        }
+
+        i++;
+        base = idmap->input_base + idmap->id_count;
+    }
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
@@ -280,13 +355,13 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 
         /* SMMUv3 node */
         smmu_offset = iort_node_offset + node_size;
-        node_size = sizeof(*smmu) + sizeof(*idmap);
+        node_size = sizeof(*smmu) + sizeof(*idmap) * smmu_idmap_blob->len;
         iort_length += node_size;
         smmu = acpi_data_push(table_data, node_size);
 
         smmu->type = ACPI_IORT_NODE_SMMU_V3;
         smmu->length = cpu_to_le16(node_size);
-        smmu->mapping_count = cpu_to_le32(1);
+        smmu->mapping_count = cpu_to_le32(smmu_idmap_blob->len);
         smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
         smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
         smmu->flags = cpu_to_le32(ACPI_IORT_SMMU_V3_COHACC_OVERRIDE);
@@ -295,23 +370,24 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         smmu->sync_gsiv = cpu_to_le32(irq + 2);
         smmu->gerr_gsiv = cpu_to_le32(irq + 3);
 
-        /* Identity RID mapping covering the whole input RID range */
-        idmap = &smmu->id_mapping_array[0];
-        idmap->input_base = 0;
-        idmap->id_count = cpu_to_le32(0xFFFF);
-        idmap->output_base = 0;
-        /* output IORT node is the ITS group node (the first node) */
-        idmap->output_reference = cpu_to_le32(iort_node_offset);
+        for (i = 0; i < smmu_idmap_blob->len; i++) {
+            idmap = &smmu->id_mapping_array[i];
+            *idmap = g_array_index(smmu_idmap_blob, AcpiIortIdMapping, i);
+            /* output IORT node is the ITS group node (the first node) */
+            idmap->output_reference = cpu_to_le32(iort_node_offset);
+        }
     }
 
     /* Root Complex Node */
-    node_size = sizeof(*rc) + sizeof(*idmap);
+    rc_map_count = (vms->iommu == VIRT_IOMMU_SMMUV3) ?
+        smmu_idmap_blob->len + its_idmap_blob->len : 1;
+    node_size = sizeof(*rc) + sizeof(*idmap) * rc_map_count;
     iort_length += node_size;
     rc = acpi_data_push(table_data, node_size);
 
     rc->type = ACPI_IORT_NODE_PCI_ROOT_COMPLEX;
     rc->length = cpu_to_le16(node_size);
-    rc->mapping_count = cpu_to_le32(1);
+    rc->mapping_count = cpu_to_le32(rc_map_count);
     rc->mapping_offset = cpu_to_le32(sizeof(*rc));
 
     /* fully coherent device */
@@ -319,20 +395,34 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     rc->memory_properties.memory_flags = 0x3; /* CCA = CPM = DCAS = 1 */
     rc->pci_segment_number = 0; /* MCFG pci_segment */
 
-    /* Identity RID mapping covering the whole input RID range */
-    idmap = &rc->id_mapping_array[0];
-    idmap->input_base = 0;
-    idmap->id_count = cpu_to_le32(0xFFFF);
-    idmap->output_base = 0;
-
     if (vms->iommu == VIRT_IOMMU_SMMUV3) {
-        /* output IORT node is the smmuv3 node */
-        idmap->output_reference = cpu_to_le32(smmu_offset);
+        for (i = 0; i < rc_map_count; i++) {
+            idmap = &rc->id_mapping_array[i];
+
+            if (i < smmu_idmap_blob->len) {
+                *idmap = g_array_index(smmu_idmap_blob, AcpiIortIdMapping, i);
+                /* output IORT node is the smmuv3 node */
+                idmap->output_reference = cpu_to_le32(smmu_offset);
+            } else {
+                *idmap = g_array_index(its_idmap_blob,
+                         AcpiIortIdMapping, i - smmu_idmap_blob->len);
+                /* output IORT node is the ITS group node (the first node) */
+                idmap->output_reference = cpu_to_le32(iort_node_offset);
+            }
+        }
     } else {
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &rc->id_mapping_array[0];
+        idmap->input_base = cpu_to_le32(0);
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = cpu_to_le32(0);
         /* output IORT node is the ITS group node (the first node) */
         idmap->output_reference = cpu_to_le32(iort_node_offset);
     }
 
+    g_array_free(smmu_idmap_blob, true);
+    g_array_free(its_idmap_blob, true);
+
     /*
      * Update the pointer address in case table_data->data moves during above
      * acpi_data_push operations.
-- 
2.19.1




* [PATCH RFC v3 7/8] hw/i386/acpi-build: Add explicit scope in DMAR table
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
                   ` (5 preceding siblings ...)
  2021-04-21  8:05 ` [PATCH RFC v3 6/8] hw/arm/virt-acpi-build: Add explicit IORT idmap for smmuv3 node Wang Xingang
@ 2021-04-21  8:05 ` Wang Xingang
  2021-04-21  8:05 ` [PATCH RFC v3 8/8] hw/i386/acpi-build: Add bypass_iommu check when building IVRS table Wang Xingang
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:05 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

In the DMAR table, the DRHD is set to cover all PCI devices when
intel_iommu is on. This patch adds explicit scope data that includes
only the PCI devices that go through the IOMMU.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
 hw/i386/acpi-build.c | 68 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 66 insertions(+), 2 deletions(-)
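
For illustration only, the byte layout of one scope entry can be
sketched as below. The toy_* names are hypothetical; the 8-byte layout
mirrors the sequence of single-path fields appended in this patch
(type, length, reserved, enumeration ID, bus, device, function) and is
not a complete description of the VT-d device scope structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    TOY_SCOPE_ENDPOINT = 0x01,  /* PCI endpoint device */
    TOY_SCOPE_BRIDGE   = 0x02,  /* PCI bridge */
    TOY_SCOPE_LEN      = 8      /* header (6 bytes) plus one path entry (2) */
};

/* Encode one device scope entry with a single path element into 'buf'. */
size_t toy_put_scope(uint8_t *buf, size_t cap, int is_bridge,
                     uint8_t bus, uint8_t devfn)
{
    assert(buf != NULL && cap >= TOY_SCOPE_LEN);

    buf[0] = is_bridge ? TOY_SCOPE_BRIDGE : TOY_SCOPE_ENDPOINT;
    buf[1] = TOY_SCOPE_LEN;        /* length of this entry */
    buf[2] = 0;                    /* reserved */
    buf[3] = 0;                    /* reserved */
    buf[4] = 0;                    /* enumeration ID */
    buf[5] = bus;                  /* start bus number */
    buf[6] = (devfn >> 3) & 0x1f;  /* device number, as PCI_SLOT() */
    buf[7] = devfn & 0x07;         /* function number, as PCI_FUNC() */
    return TOY_SCOPE_LEN;
}
```

Each entry adds eight bytes, which is why the DRHD length in the patch
grows by the size of the collected scope blob.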

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index de98750aef..fdb26682cb 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1988,6 +1988,56 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
                  x86ms->oem_table_id);
 }
 
+/*
+ * Insert DMAR scope for PCI bridges and endpoint devices
+ */
+static void
+insert_scope(PCIBus *bus, PCIDevice *dev, void *opaque)
+{
+    GArray *scope_blob = opaque;
+    AcpiDmarDeviceScope *scope = NULL;
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
+        /* Dmar Scope Type: 0x02 for PCI Bridge */
+        build_append_int_noprefix(scope_blob, 0x02, 1);
+    } else {
+        /* Dmar Scope Type: 0x01 for PCI Endpoint Device */
+        build_append_int_noprefix(scope_blob, 0x01, 1);
+    }
+
+    /* length */
+    build_append_int_noprefix(scope_blob,
+                              sizeof(*scope) + sizeof(scope->path[0]), 1);
+    /* reserved */
+    build_append_int_noprefix(scope_blob, 0, 2);
+    /* enumeration_id */
+    build_append_int_noprefix(scope_blob, 0, 1);
+    /* bus */
+    build_append_int_noprefix(scope_blob, pci_bus_num(bus), 1);
+    /* device */
+    build_append_int_noprefix(scope_blob, PCI_SLOT(dev->devfn), 1);
+    /* function */
+    build_append_int_noprefix(scope_blob, PCI_FUNC(dev->devfn), 1);
+}
+
+/* For a given PCI host bridge, walk and insert DMAR scope */
+static int
+dmar_host_bridges(Object *obj, void *opaque)
+{
+    GArray *scope_blob = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PCI_HOST_BRIDGE)) {
+        PCIBus *bus = PCI_HOST_BRIDGE(obj)->bus;
+
+        if (bus && !pci_bus_bypass_iommu(bus)) {
+            pci_for_each_device(bus, pci_bus_num(bus), insert_scope,
+                                scope_blob);
+        }
+    }
+
+    return 0;
+}
+
 /*
  * VT-d spec 8.1 DMA Remapping Reporting Structure
  * (version Oct. 2014 or later)
@@ -2007,6 +2057,15 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker, const char *oem_id,
     /* Root complex IOAPIC use one path[0] only */
     size_t ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
     IntelIOMMUState *intel_iommu = INTEL_IOMMU_DEVICE(iommu);
+    GArray *scope_blob = g_array_new(false, true, 1);
+
+    /*
+     * A PCI bus walk, for each PCI host bridge.
+     * Insert scope for each PCI bridge and endpoint device which
+     * is attached to a bus with iommu enabled.
+     */
+    object_child_foreach_recursive(object_get_root(),
+                                   dmar_host_bridges, scope_blob);
 
     assert(iommu);
     if (x86_iommu_ir_supported(iommu)) {
@@ -2020,8 +2079,9 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker, const char *oem_id,
     /* DMAR Remapping Hardware Unit Definition structure */
     drhd = acpi_data_push(table_data, sizeof(*drhd) + ioapic_scope_size);
     drhd->type = cpu_to_le16(ACPI_DMAR_TYPE_HARDWARE_UNIT);
-    drhd->length = cpu_to_le16(sizeof(*drhd) + ioapic_scope_size);
-    drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
+    drhd->length =
+        cpu_to_le16(sizeof(*drhd) + ioapic_scope_size + scope_blob->len);
+    drhd->flags = 0;            /* INCLUDE_PCI_ALL clear: scopes are explicit */
     drhd->pci_segment = cpu_to_le16(0);
     drhd->address = cpu_to_le64(Q35_HOST_BRIDGE_IOMMU_ADDR);
 
@@ -2035,6 +2095,10 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker, const char *oem_id,
     scope->path[0].device = PCI_SLOT(Q35_PSEUDO_DEVFN_IOAPIC);
     scope->path[0].function = PCI_FUNC(Q35_PSEUDO_DEVFN_IOAPIC);
 
+    /* Add scope found above */
+    g_array_append_vals(table_data, scope_blob->data, scope_blob->len);
+    g_array_free(scope_blob, true);
+
     if (iommu->dt_supported) {
         atsr = acpi_data_push(table_data, sizeof(*atsr));
         atsr->type = cpu_to_le16(ACPI_DMAR_TYPE_ATSR);
-- 
2.19.1
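
Note on the scope entry layout: the tail of insert_scope() above emits
one fixed-size record per device. The field widths it writes (type 1,
length 1, reserved 2, enumeration_id 1, bus 1, then one device/function
pair) add up to 8 bytes. The sketch below packs the same fields into a
plain byte buffer. encode_scope() and both macros are hypothetical names
made up for illustration, not QEMU functions; only the 0x01 endpoint
type value is taken from the patch comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCOPE_TYPE_PCI_ENDPOINT 0x01 /* value quoted in the patch comment */
#define SCOPE_ENTRY_LEN 8            /* 6 header bytes + one 2-byte path hop */

/*
 * Hypothetical helper (not a QEMU function): encode one single-hop device
 * scope entry in the same field order as insert_scope() in the patch.
 * Returns the number of bytes written.
 */
static size_t encode_scope(uint8_t out[SCOPE_ENTRY_LEN], uint8_t type,
                           uint8_t bus, uint8_t devfn)
{
    size_t i = 0;

    assert(out != NULL);
    out[i++] = type;                 /* device scope type */
    out[i++] = SCOPE_ENTRY_LEN;      /* length of this entry */
    out[i++] = 0;                    /* reserved, 2 bytes */
    out[i++] = 0;
    out[i++] = 0;                    /* enumeration_id */
    out[i++] = bus;                  /* bus number */
    out[i++] = (devfn >> 3) & 0x1f;  /* device, same as PCI_SLOT(devfn) */
    out[i++] = devfn & 0x07;         /* function, same as PCI_FUNC(devfn) */
    return i;
}
```

For bus 0x10, slot 3, function 1, calling
encode_scope(buf, SCOPE_TYPE_PCI_ENDPOINT, 0x10, (3 << 3) | 1) fills buf
with the bytes 01 08 00 00 00 10 03 01.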




* [PATCH RFC v3 8/8] hw/i386/acpi-build: Add bypass_iommu check when building IVRS table
  2021-04-21  8:04 [PATCH RFC v3 0/8] Introduce Bypass IOMMU Feature Wang Xingang
                   ` (6 preceding siblings ...)
  2021-04-21  8:05 ` [PATCH RFC v3 7/8] hw/i386/acpi-build: Add explicit scope in DMAR table Wang Xingang
@ 2021-04-21  8:05 ` Wang Xingang
  7 siblings, 0 replies; 9+ messages in thread
From: Wang Xingang @ 2021-04-21  8:05 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, eric.auger, shannon.zhaosl, imammedo, mst,
	marcel.apfelbaum, peter.maydell, ehabkost, richard.henderson,
	pbonzini
  Cc: xieyingtai, cenjiahui, wangxingang5

From: Xingang Wang <wangxingang5@huawei.com>

When building the IVRS table, scan only the devices that go through
the IOMMU and insert an IVHD entry for each of them. Devices on a
root bus with bypass_iommu enabled are skipped.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
---
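
Note on the intended behaviour: the one-line change below makes the
host bridge walk skip any root bus that has bypass_iommu set, so none
of its devices gets an IVHD entry. The sketch models that filter with
plain structs. struct root_bus, struct ivhd_entry and
collect_translated() are hypothetical stand-ins invented for this note;
they are not QEMU's PCIBus, insert_ivhd() or pci_for_each_device().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI root bus; not QEMU's PCIBus. */
struct root_bus {
    uint8_t bus_num;
    bool bypass_iommu;      /* mirrors the bypass_iommu property */
    const uint8_t *devfns;  /* devfn of each device on the bus */
    size_t ndevs;
};

/* Hypothetical, heavily simplified IVHD device entry. */
struct ivhd_entry {
    uint8_t bus;
    uint8_t devfn;
};

/*
 * Record one entry per device on every root bus that does NOT bypass
 * the IOMMU. Returns the number of entries written, at most cap.
 */
static size_t collect_translated(const struct root_bus *buses, size_t nbuses,
                                 struct ivhd_entry *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < nbuses; i++) {
        if (buses[i].bypass_iommu) {
            continue;       /* same test as !pci_bus_bypass_iommu(bus) */
        }
        for (size_t j = 0; j < buses[i].ndevs && n < cap; j++) {
            out[n].bus = buses[i].bus_num;
            out[n].devfn = buses[i].devfns[j];
            n++;
        }
    }
    return n;
}
```

With a primary bus 0x00 that carries two devices and a pxb root bus
0x10 that has bypass_iommu set, only the two devices on bus 0x00 are
recorded. Setting bypass_iommu on both buses yields no entries at all.
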
 hw/i386/acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index fdb26682cb..71fb95737c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2229,7 +2229,7 @@ ivrs_host_bridges(Object *obj, void *opaque)
     if (object_dynamic_cast(obj, TYPE_PCI_HOST_BRIDGE)) {
         PCIBus *bus = PCI_HOST_BRIDGE(obj)->bus;
 
-        if (bus) {
+        if (bus && !pci_bus_bypass_iommu(bus)) {
             pci_for_each_device(bus, pci_bus_num(bus), insert_ivhd, ivhd_blob);
         }
     }
-- 
2.19.1




