* [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device
@ 2018-02-12 18:58 Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 01/22] machine: Add a get_primary_pci_bus callback Eric Auger
                   ` (21 more replies)
  0 siblings, 22 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This series implements the virtio-iommu device.

This v6 is an upgrade to v0.6 virtio-iommu spec [1].

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v2.11.0-virtio-iommu-v6

References:
[1] [RFC] virtio-iommu version 0.6
git://linux-arm.org/virtio-iommu.git viommu/v0.6

[2] guest branch featuring the virtio-iommu driver v0.6
git://linux-arm.org/linux-jpb.git virtio-iommu/devel

Testing:
- tested with guest using virtio-pci-net and virtio-blk-pci
  (,vhost=off,iommu_platform,disable-modern=off,disable-legacy=on)
- virtio-blk-pci uses irqfd: with GICv3, use
  "target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route"
  available on my branch
- VFIO support will be brought by a separate series
- on x86, testing requires a hacked guest kernel (available on demand)

History:

v5 -> v6:
- minor update against v0.6 spec
- fix g_hash_table_lookup in virtio_iommu_find_add_as
- replace some error_reports by qemu_log_mask(LOG_GUEST_ERROR, ...)

v4 -> v5:
- event queue and fault reporting
- we now return the IOAPIC MSI region if the virtio-iommu is instantiated
  in a PC machine.
- we bypass transactions on MSI HW region and fault on reserved ones.
- We support ACPI boot with mach-virt (based on IORT proposal)
- We moved to the new driver naming conventions
- simplified mach-virt instantiation
- worked around the disappearance of pci_find_primary_bus
- in virtio_iommu_translate, check that dev->as is not NULL
- initialize as->device_list in virtio_iommu_get_as
- initialize bufstate.error to false in virtio_iommu_probe

v3 -> v4:
- probe request support although no reserved region is returned at
  the moment
- unmap semantics less strict, as specified in v0.4
- device registration, attach/detach revisited
- split into smaller patches to ease review
- propose a way to inform the IOMMU mr about the page_size_mask
  of underlying HW IOMMU, if any
- remove warning associated with the translation of the MSI doorbell

v2 -> v3:
- rebase on top of 2.10-rc0 and especially
  [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
- add mutex init
- fix as->mappings deletion using g_tree_ref/unref
- when a dev is attached while it is already attached to
  another address space, detach it first
- fix some error values
- page_sizes = TARGET_PAGE_MASK;
- I haven't changed the unmap() semantics yet, waiting for the
  next virtio-iommu spec revision.

v1 -> v2:
- fix redefinition of viommu_as typedef

Eric Auger (22):
  machine: Add a get_primary_pci_bus callback
  hw/arm/virt: Implement get_primary_pci_bus
  pc: Implement get_primary_pci_bus
  update-linux-headers: Import virtio_iommu.h
  linux-headers: Partial update for virtio-iommu v0.6
  virtio-iommu: Add skeleton
  virtio-iommu: Decode the command payload
  virtio-iommu: Add the iommu regions
  virtio-iommu: Register attached endpoints
  virtio-iommu: Implement attach/detach command
  virtio-iommu: Implement map/unmap
  virtio-iommu: Implement translate
  virtio-iommu: Implement probe request
  virtio-iommu: Add an msi_bypass property
  virtio-iommu: Implement fault reporting
  virtio_iommu: Handle reserved regions in translation process
  hw/arm/virt: Add virtio-iommu to the virt board
  hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table
  memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
  hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
  virtio-iommu: Implement set_page_size_mask
  hw/vfio/common: Do not print error when viommu translates into an mmio
    region

 hw/arm/virt-acpi-build.c                      |   54 +-
 hw/arm/virt.c                                 |   99 ++-
 hw/i386/pc.c                                  |    8 +
 hw/vfio/common.c                              |    7 +-
 hw/virtio/Makefile.objs                       |    1 +
 hw/virtio/trace-events                        |   25 +
 hw/virtio/virtio-iommu.c                      | 1069 +++++++++++++++++++++++++
 include/exec/memory.h                         |    4 +
 include/hw/acpi/acpi-defs.h                   |   21 +-
 include/hw/arm/virt.h                         |   19 +
 include/hw/boards.h                           |    3 +
 include/hw/vfio/vfio-common.h                 |    1 +
 include/hw/virtio/virtio-iommu.h              |   63 ++
 include/standard-headers/linux/virtio_ids.h   |    1 +
 include/standard-headers/linux/virtio_iommu.h |  196 +++++
 linux-headers/linux/virtio_iommu.h            |    1 +
 scripts/update-linux-headers.sh               |    3 +
 17 files changed, 1558 insertions(+), 17 deletions(-)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h
 create mode 100644 include/standard-headers/linux/virtio_iommu.h
 create mode 100644 linux-headers/linux/virtio_iommu.h

-- 
1.9.1


* [Qemu-devel] [RFC v6 01/22] machine: Add a get_primary_pci_bus callback
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 02/22] hw/arm/virt: Implement get_primary_pci_bus Eric Auger
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

After e492dc5a267e "pci: Eliminate pci_find_primary_bus()" we no longer
have an easy means to retrieve the primary bus of a machine. This will be
needed by virtio-iommu-device, which is bound to be dynamically instantiated
in at least the ARM virt and Q35 machines.

Adding a get_primary_pci_bus() callback makes it possible to retrieve the
PCIBus the iommu is connected to.
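
As a rough illustration of the callback described above, the sketch below
shows how a caller could ask its machine for the primary bus and cope with
a NULL result. The types are simplified stand-ins, not QEMU's real
MachineClass, MachineState or PCIBus, and find_primary_bus() and
demo_get_primary_pci_bus() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the QEMU types; only the fields needed here. */
typedef struct PCIBus {
    const char *name;
} PCIBus;

typedef struct MachineState MachineState;

typedef struct MachineClass {
    /* Returns the primary PCI bus, or NULL if there are several root buses. */
    PCIBus *(*get_primary_pci_bus)(const MachineState *ms);
} MachineClass;

struct MachineState {
    const MachineClass *mc;
    PCIBus *pci_bus;    /* assumed to be cached by the board code */
};

/* Board-side implementation: hand back the cached root bus. */
static PCIBus *demo_get_primary_pci_bus(const MachineState *ms)
{
    return ms->pci_bus;
}

/*
 * Hypothetical caller-side helper. A machine that does not implement the
 * callback is treated like a machine without a single primary bus.
 */
static PCIBus *find_primary_bus(const MachineState *ms)
{
    assert(ms && ms->mc);
    if (!ms->mc->get_primary_pci_bus) {
        return NULL;
    }
    return ms->mc->get_primary_pci_bus(ms);
}
```

Checking the function pointer before calling it is a design choice made for
this sketch: machines that never set the callback then need no extra code.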

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

This is a temporary solution until we decide whether the
virtio-iommu-device should be instantiable through a -device command
line or through a machine command line, as already suggested by Peter
(for vsmmuv3 though).
---
 include/hw/boards.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index efb0a9e..0ed376a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -156,6 +156,8 @@ typedef struct {
  *    should instead use "unimplemented-device" for all memory ranges where
  *    the guest will attempt to probe for a device that QEMU doesn't
  *    implement and a stub device is required.
+ * @get_primary_pci_bus: return the primary PCI bus or NULL if there are
+ *    several root buses
  */
 struct MachineClass {
     /*< private >*/
@@ -212,6 +214,7 @@ struct MachineClass {
                                                          unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
     int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
+    PCIBus *(*get_primary_pci_bus)(const MachineState *ms);
 };
 
 /**
-- 
1.9.1


* [Qemu-devel] [RFC v6 02/22] hw/arm/virt: Implement get_primary_pci_bus
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 01/22] machine: Add a get_primary_pci_bus callback Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 03/22] pc: " Eric Auger
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

get_primary_pci_bus() returns the root bus. We also
add the PCIBus handle to VirtMachineState.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c         | 11 ++++++++++-
 include/hw/arm/virt.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index dbb3c80..08ac411 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -941,7 +941,7 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
-static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
+static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = vms->memmap[VIRT_PCIE_MMIO].size;
@@ -1004,6 +1004,7 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
 
     pci = PCI_HOST_BRIDGE(dev);
     if (pci->bus) {
+        vms->pci_bus = pci->bus;
         for (i = 0; i < nb_nics; i++) {
             NICInfo *nd = &nd_table[i];
 
@@ -1523,6 +1524,13 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static PCIBus *virt_get_primary_pci_bus(const MachineState *ms)
+{
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+
+    return vms->pci_bus;
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -1544,6 +1552,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
     mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
+    mc->get_primary_pci_bus = virt_get_primary_pci_bus;
 }
 
 static const TypeInfo virt_machine_info = {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..7e31e99 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -106,6 +106,7 @@ typedef struct {
     uint32_t gic_phandle;
     uint32_t msi_phandle;
     int psci_conduit;
+    PCIBus *pci_bus;
 } VirtMachineState;
 
 #define TYPE_VIRT_MACHINE   MACHINE_TYPE_NAME("virt")
-- 
1.9.1


* [Qemu-devel] [RFC v6 03/22] pc: Implement get_primary_pci_bus
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 01/22] machine: Add a get_primary_pci_bus callback Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 02/22] hw/arm/virt: Implement get_primary_pci_bus Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 04/22] update-linux-headers: Import virtio_iommu.h Eric Auger
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

Implement get_primary_pci_bus(), which returns the root
bus.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/i386/pc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 55e69d6..ac33ade 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2342,6 +2342,13 @@ static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+static PCIBus *pc_machine_get_primary_pci_bus(const MachineState *ms)
+{
+    PCMachineState *pcms = PC_MACHINE(ms);
+
+    return pcms->bus;
+}
+
 static void pc_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -2381,6 +2388,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug = pc_machine_device_unplug_cb;
     nc->nmi_monitor_handler = x86_nmi;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
+    mc->get_primary_pci_bus = pc_machine_get_primary_pci_bus;
 
     object_class_property_add(oc, PC_MACHINE_MEMHP_REGION_SIZE, "int",
         pc_machine_get_hotplug_memory_region_size, NULL,
-- 
1.9.1


* [Qemu-devel] [RFC v6 04/22] update-linux-headers: Import virtio_iommu.h
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (2 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 03/22] pc: " Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 05/22] linux-headers: Partial update for virtio-iommu v0.6 Eric Auger
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

Update the script so that it also generates the
linux-headers/linux/virtio_iommu.h wrapper header.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 scripts/update-linux-headers.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 135a10d..e5af161 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -135,6 +135,9 @@ EOF
 cat <<EOF >$output/linux-headers/linux/virtio_config.h
 #include "standard-headers/linux/virtio_config.h"
 EOF
+cat <<EOF >$output/linux-headers/linux/virtio_iommu.h
+#include "standard-headers/linux/virtio_iommu.h"
+EOF
 cat <<EOF >$output/linux-headers/linux/virtio_ring.h
 #include "standard-headers/linux/virtio_ring.h"
 EOF
-- 
1.9.1


* [Qemu-devel] [RFC v6 05/22] linux-headers: Partial update for virtio-iommu v0.6
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (3 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 04/22] update-linux-headers: Import virtio_iommu.h Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 06/22] virtio-iommu: Add skeleton Eric Auger
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

Partial sync against Jean-Philippe's branch:
git://linux-arm.org/linux-jpb.git virtio-iommu/devel
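
For orientation, the sketch below shows one way the MAP flag definitions
from the imported header could be used to validate a request. The constants
mirror the values in this patch; struct demo_map_req and demo_check_map()
are hypothetical names. Treating virt_end as inclusive and the choice of
status code for each failure are assumptions of this example, not
something the header itself states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the v0.6 header added by this patch. */
#define VIRTIO_IOMMU_S_OK           0x00
#define VIRTIO_IOMMU_S_INVAL        0x04
#define VIRTIO_IOMMU_S_RANGE        0x05

#define VIRTIO_IOMMU_MAP_F_READ     (1 << 0)
#define VIRTIO_IOMMU_MAP_F_WRITE    (1 << 1)
#define VIRTIO_IOMMU_MAP_F_EXEC     (1 << 2)
#define VIRTIO_IOMMU_MAP_F_MASK     (VIRTIO_IOMMU_MAP_F_READ |  \
                                     VIRTIO_IOMMU_MAP_F_WRITE | \
                                     VIRTIO_IOMMU_MAP_F_EXEC)

/* Hypothetical, already decoded (host-endian) view of a MAP request. */
struct demo_map_req {
    uint32_t domain;
    uint64_t virt_start;
    uint64_t virt_end;      /* assumed inclusive */
    uint64_t phys_start;
    uint32_t flags;
};

/*
 * Validate a MAP request against the advertised input range.
 * The mapping of failures to status codes is an assumption of this sketch.
 */
static uint8_t demo_check_map(const struct demo_map_req *req,
                              uint64_t input_start, uint64_t input_end)
{
    assert(req != NULL);

    /* Reject any flag bit outside the known READ/WRITE/EXEC set. */
    if (req->flags & ~(uint32_t)VIRTIO_IOMMU_MAP_F_MASK) {
        return VIRTIO_IOMMU_S_INVAL;
    }
    /* An interval whose end precedes its start is malformed. */
    if (req->virt_end < req->virt_start) {
        return VIRTIO_IOMMU_S_INVAL;
    }
    /* The IOVA interval must sit inside the supported input range. */
    if (req->virt_start < input_start || req->virt_end > input_end) {
        return VIRTIO_IOMMU_S_RANGE;
    }
    return VIRTIO_IOMMU_S_OK;
}
```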

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/standard-headers/linux/virtio_ids.h   |   1 +
 include/standard-headers/linux/virtio_iommu.h | 196 ++++++++++++++++++++++++++
 linux-headers/linux/virtio_iommu.h            |   1 +
 3 files changed, 198 insertions(+)
 create mode 100644 include/standard-headers/linux/virtio_iommu.h
 create mode 100644 linux-headers/linux/virtio_iommu.h

diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 6d5c3b2..cfe47c5 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_IOMMU        23 /* virtio IOMMU */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
new file mode 100644
index 0000000..fac4dae
--- /dev/null
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -0,0 +1,196 @@
+/*
+ * Virtio-iommu definition v0.6
+ *
+ * Copyright (C) 2017 ARM Ltd.
+ *
+ * This header is BSD licensed so anyone can use the definitions
+ * to implement compatible drivers/servers:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of ARM Ltd. nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#ifndef _LINUX_VIRTIO_IOMMU_H
+#define _LINUX_VIRTIO_IOMMU_H
+
+#include "standard-headers/linux/types.h"
+
+/* Feature bits */
+#define VIRTIO_IOMMU_F_INPUT_RANGE		0
+#define VIRTIO_IOMMU_F_DOMAIN_BITS		1
+#define VIRTIO_IOMMU_F_MAP_UNMAP		2
+#define VIRTIO_IOMMU_F_BYPASS			3
+#define VIRTIO_IOMMU_F_PROBE			4
+
+struct virtio_iommu_config {
+	/* Supported page sizes */
+	uint64_t					page_size_mask;
+	/* Supported IOVA range */
+	struct virtio_iommu_range {
+		uint64_t				start;
+		uint64_t				end;
+	} input_range;
+	/* Max domain ID size */
+	uint8_t 					domain_bits;
+	uint8_t					padding[3];
+	/* Probe buffer size */
+	uint32_t					probe_size;
+} QEMU_PACKED;
+
+/* Request types */
+#define VIRTIO_IOMMU_T_ATTACH			0x01
+#define VIRTIO_IOMMU_T_DETACH			0x02
+#define VIRTIO_IOMMU_T_MAP			0x03
+#define VIRTIO_IOMMU_T_UNMAP			0x04
+#define VIRTIO_IOMMU_T_PROBE			0x05
+
+/* Status types */
+#define VIRTIO_IOMMU_S_OK			0x00
+#define VIRTIO_IOMMU_S_IOERR			0x01
+#define VIRTIO_IOMMU_S_UNSUPP			0x02
+#define VIRTIO_IOMMU_S_DEVERR			0x03
+#define VIRTIO_IOMMU_S_INVAL			0x04
+#define VIRTIO_IOMMU_S_RANGE			0x05
+#define VIRTIO_IOMMU_S_NOENT			0x06
+#define VIRTIO_IOMMU_S_FAULT			0x07
+
+struct virtio_iommu_req_head {
+	uint8_t					type;
+	uint8_t					reserved[3];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_tail {
+	uint8_t					status;
+	uint8_t					reserved[3];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_attach {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					domain;
+	uint32_t					endpoint;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+struct virtio_iommu_req_detach {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					endpoint;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_MAP_F_READ			(1 << 0)
+#define VIRTIO_IOMMU_MAP_F_WRITE		(1 << 1)
+#define VIRTIO_IOMMU_MAP_F_EXEC			(1 << 2)
+
+#define VIRTIO_IOMMU_MAP_F_MASK			(VIRTIO_IOMMU_MAP_F_READ |	\
+						 VIRTIO_IOMMU_MAP_F_WRITE |	\
+						 VIRTIO_IOMMU_MAP_F_EXEC)
+
+struct virtio_iommu_req_map {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					domain;
+	uint64_t					virt_start;
+	uint64_t					virt_end;
+	uint64_t					phys_start;
+	uint32_t					flags;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+struct virtio_iommu_req_unmap {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					domain;
+	uint64_t					virt_start;
+	uint64_t					virt_end;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_RESV_MEM_T_RESERVED	0
+#define VIRTIO_IOMMU_RESV_MEM_T_MSI		1
+
+struct virtio_iommu_probe_resv_mem {
+	uint8_t					subtype;
+	uint8_t					reserved[3];
+	uint64_t					addr;
+	uint64_t					size;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_PROBE_T_NONE		0
+#define VIRTIO_IOMMU_PROBE_T_RESV_MEM		1
+
+#define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
+
+struct virtio_iommu_probe_property {
+	uint16_t					type;
+	uint16_t					length;
+	uint8_t					value[];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_probe {
+	struct virtio_iommu_req_head		head;
+	uint32_t					endpoint;
+	uint8_t					reserved[64];
+
+	uint8_t					properties[];
+
+	/* Tail follows the variable-length properties array (no padding) */
+} QEMU_PACKED;
+
+union virtio_iommu_req {
+	struct virtio_iommu_req_head		head;
+
+	struct virtio_iommu_req_attach		attach;
+	struct virtio_iommu_req_detach		detach;
+	struct virtio_iommu_req_map		map;
+	struct virtio_iommu_req_unmap		unmap;
+	struct virtio_iommu_req_probe		probe;
+};
+
+/* Fault types */
+#define VIRTIO_IOMMU_FAULT_R_UNKNOWN		0
+#define VIRTIO_IOMMU_FAULT_R_DOMAIN		1
+#define VIRTIO_IOMMU_FAULT_R_MAPPING		2
+
+#define VIRTIO_IOMMU_FAULT_F_READ		(1 << 0)
+#define VIRTIO_IOMMU_FAULT_F_WRITE		(1 << 1)
+#define VIRTIO_IOMMU_FAULT_F_EXEC		(1 << 2)
+#define VIRTIO_IOMMU_FAULT_F_ADDRESS		(1 << 8)
+
+struct virtio_iommu_fault {
+	uint8_t					reason;
+	uint8_t					padding[3];
+	uint32_t					flags;
+	uint32_t					endpoint;
+	uint64_t					address;
+} QEMU_PACKED;
+
+#endif
diff --git a/linux-headers/linux/virtio_iommu.h b/linux-headers/linux/virtio_iommu.h
new file mode 100644
index 0000000..2dc4609
--- /dev/null
+++ b/linux-headers/linux/virtio_iommu.h
@@ -0,0 +1 @@
+#include "standard-headers/linux/virtio_iommu.h"
-- 
1.9.1


* [Qemu-devel] [RFC v6 06/22] virtio-iommu: Add skeleton
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (4 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 05/22] linux-headers: Partial update for virtio-iommu v0.6 Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 07/22] virtio-iommu: Decode the command payload Eric Auger
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch adds the skeleton for the virtio-iommu device.
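
To give an idea of the request flow such a skeleton sets up, the sketch
below reads a request head from a flat buffer, dispatches on its type and
fills in a tail status. It is a standalone approximation, not the code in
this patch: the demo_* names are hypothetical, the buffer stands in for the
virtqueue scatter-gather lists, and returning S_INVAL for a truncated head
is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Request and status values copied from the v0.6 header. */
#define VIRTIO_IOMMU_T_ATTACH   0x01
#define VIRTIO_IOMMU_T_DETACH   0x02
#define VIRTIO_IOMMU_T_MAP      0x03
#define VIRTIO_IOMMU_T_UNMAP    0x04

#define VIRTIO_IOMMU_S_OK       0x00
#define VIRTIO_IOMMU_S_UNSUPP   0x02
#define VIRTIO_IOMMU_S_INVAL    0x04

/* Same layout as virtio_iommu_req_head / virtio_iommu_req_tail. */
struct demo_req_head {
    uint8_t type;
    uint8_t reserved[3];
};

struct demo_req_tail {
    uint8_t status;
    uint8_t reserved[3];
};

/* Placeholder handler: a real device would decode the payload here. */
static uint8_t demo_stub_handler(const uint8_t *payload, size_t len)
{
    (void)payload;
    (void)len;
    return VIRTIO_IOMMU_S_OK;
}

/* Decode the head, dispatch on the request type and build the tail. */
static struct demo_req_tail demo_dispatch(const uint8_t *out, size_t out_len)
{
    struct demo_req_head head;
    struct demo_req_tail tail;

    assert(out != NULL);
    memset(&tail, 0, sizeof(tail));     /* reserved bytes must be zero */

    if (out_len < sizeof(head)) {
        /* Too short to hold a head: report it and skip the dispatch. */
        tail.status = VIRTIO_IOMMU_S_INVAL;
        return tail;
    }
    memcpy(&head, out, sizeof(head));

    switch (head.type) {
    case VIRTIO_IOMMU_T_ATTACH:
    case VIRTIO_IOMMU_T_DETACH:
    case VIRTIO_IOMMU_T_MAP:
    case VIRTIO_IOMMU_T_UNMAP:
        tail.status = demo_stub_handler(out + sizeof(head),
                                        out_len - sizeof(head));
        break;
    default:
        tail.status = VIRTIO_IOMMU_S_UNSUPP;
    }
    return tail;
}
```

Returning early on a short head keeps the switch from running on an
uninitialized type, and zeroing the tail up front means the reserved bytes
written back to the driver are always defined.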

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v4 -> v5:
- use the new v0.5 terminology (domain, endpoint)
- add the event virtqueue

v3 -> v4:
- use page_size_mask instead of page_sizes
- added set_features()
- added some traces (reset, set_status, set_features)
- empty virtio_iommu_set_config() as the driver MUST NOT
  write to device configuration fields
- add get_config trace

v2 -> v3:
- rebase on 2.10-rc0, ie. use IOMMUMemoryRegion and remove
  iommu_ops.
- advertise VIRTIO_IOMMU_F_MAP_UNMAP feature
- page_sizes set to TARGET_PAGE_SIZE

Conflicts:
	hw/virtio/trace-events
---
 hw/virtio/Makefile.objs          |   1 +
 hw/virtio/trace-events           |   7 ++
 hw/virtio/virtio-iommu.c         | 256 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |  60 +++++++++
 4 files changed, 324 insertions(+)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 765d363..8967a4a 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o 
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+obj-$(CONFIG_LINUX) += virtio-iommu.o
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 obj-y += virtio-crypto.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio-crypto-pci.o
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 2b8f81e..0094703 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -31,3 +31,10 @@ vhost_region_add(void *p, const char *mr) "dev %p mr %s"
 vhost_region_del(void *p, const char *mr) "dev %p mr %s"
 vhost_iommu_region_add(void *p, const char *mr) "dev %p mr %s"
 vhost_iommu_region_del(void *p, const char *mr) "dev %p mr %s"
+
+# hw/virtio/virtio-iommu.c
+#
+virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0x%"PRIx64
+virtio_iommu_device_reset(void) "reset!"
+virtio_iommu_device_status(uint8_t status) "driver status = %d"
+virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint8_t ioasid_bits, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" ioasid_bits=%d probe_size=0x%x"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
new file mode 100644
index 0000000..52430dc
--- /dev/null
+++ b/hw/virtio/virtio-iommu.c
@@ -0,0 +1,256 @@
+/*
+ * virtio-iommu device
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/iov.h"
+#include "qemu-common.h"
+#include "hw/virtio/virtio.h"
+#include "sysemu/kvm.h"
+#include "qapi-event.h"
+#include "trace.h"
+
+#include "standard-headers/linux/virtio_ids.h"
+#include <linux/virtio_iommu.h>
+
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/virtio-iommu.h"
+
+/* Max size */
+#define VIOMMU_DEFAULT_QUEUE_SIZE 256
+
+static int virtio_iommu_handle_attach(VirtIOIOMMU *s,
+                                      struct iovec *iov,
+                                      unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_detach(VirtIOIOMMU *s,
+                                      struct iovec *iov,
+                                      unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_map(VirtIOIOMMU *s,
+                                   struct iovec *iov,
+                                   unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
+                                     struct iovec *iov,
+                                     unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+
+static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
+    VirtQueueElement *elem;
+    struct virtio_iommu_req_head head;
+    struct virtio_iommu_req_tail tail;
+    unsigned int iov_cnt;
+    struct iovec *iov;
+    size_t sz;
+
+    for (;;) {
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+        if (!elem) {
+            return;
+        }
+
+        if (iov_size(elem->in_sg, elem->in_num) < sizeof(tail) ||
+            iov_size(elem->out_sg, elem->out_num) < sizeof(head)) {
+            virtio_error(vdev, "virtio-iommu erroneous head or tail");
+            virtqueue_detach_element(vq, elem, 0);
+            g_free(elem);
+            break;
+        }
+
+        iov_cnt = elem->out_num;
+        iov = g_memdup(elem->out_sg, sizeof(struct iovec) * elem->out_num);
+        sz = iov_to_buf(iov, iov_cnt, 0, &head, sizeof(head));
+        if (sz != sizeof(head)) {
+            tail.status = VIRTIO_IOMMU_S_UNSUPP;
+        }
+        qemu_mutex_lock(&s->mutex);
+        switch (head.type) {
+        case VIRTIO_IOMMU_T_ATTACH:
+            tail.status = virtio_iommu_handle_attach(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_DETACH:
+            tail.status = virtio_iommu_handle_detach(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_MAP:
+            tail.status = virtio_iommu_handle_map(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_UNMAP:
+            tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
+            break;
+        default:
+            tail.status = VIRTIO_IOMMU_S_UNSUPP;
+        }
+        qemu_mutex_unlock(&s->mutex);
+
+        sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                          &tail, sizeof(tail));
+        assert(sz == sizeof(tail));
+
+        virtqueue_push(vq, elem, sizeof(tail));
+        virtio_notify(vdev, vq);
+        g_free(elem);
+    }
+}
+
+static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
+{
+    VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
+    struct virtio_iommu_config *config = &dev->config;
+
+    trace_virtio_iommu_get_config(config->page_size_mask,
+                                  config->input_range.start,
+                                  config->input_range.end,
+                                  config->domain_bits,
+                                  config->probe_size);
+    memcpy(config_data, &dev->config, sizeof(struct virtio_iommu_config));
+}
+
+static void virtio_iommu_set_config(VirtIODevice *vdev,
+                                      const uint8_t *config_data)
+{
+}
+
+static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
+                                            Error **errp)
+{
+    VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
+    f |= dev->host_features;
+    virtio_add_feature(&f, VIRTIO_RING_F_EVENT_IDX);
+    virtio_add_feature(&f, VIRTIO_RING_F_INDIRECT_DESC);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_INPUT_RANGE);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_MAP_UNMAP);
+    return f;
+}
+
+static void virtio_iommu_set_features(VirtIODevice *vdev, uint64_t val)
+{
+    trace_virtio_iommu_set_features(val);
+}
+
+static int virtio_iommu_post_load_device(void *opaque, int version_id)
+{
+    return 0;
+}
+
+static const VMStateDescription vmstate_virtio_iommu_device = {
+    .name = "virtio-iommu-device",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = virtio_iommu_post_load_device,
+    .fields = (VMStateField[]) {
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
+
+    virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
+                sizeof(struct virtio_iommu_config));
+
+    s->req_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE,
+                             virtio_iommu_handle_command);
+    s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);
+
+    s->config.page_size_mask = TARGET_PAGE_MASK;
+    s->config.input_range.end = -1UL;
+}
+
+static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+
+    virtio_cleanup(vdev);
+}
+
+static void virtio_iommu_device_reset(VirtIODevice *vdev)
+{
+    trace_virtio_iommu_device_reset();
+}
+
+static void virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
+{
+    trace_virtio_iommu_device_status(status);
+}
+
+static void virtio_iommu_instance_init(Object *obj)
+{
+}
+
+static const VMStateDescription vmstate_virtio_iommu = {
+    .name = "virtio-iommu",
+    .minimum_version_id = 1,
+    .version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_VIRTIO_DEVICE,
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static Property virtio_iommu_properties[] = {
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_iommu_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+    dc->props = virtio_iommu_properties;
+    dc->vmsd = &vmstate_virtio_iommu;
+
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    vdc->realize = virtio_iommu_device_realize;
+    vdc->unrealize = virtio_iommu_device_unrealize;
+    vdc->reset = virtio_iommu_device_reset;
+    vdc->get_config = virtio_iommu_get_config;
+    vdc->set_config = virtio_iommu_set_config;
+    vdc->get_features = virtio_iommu_get_features;
+    vdc->set_features = virtio_iommu_set_features;
+    vdc->set_status = virtio_iommu_set_status;
+    vdc->vmsd = &vmstate_virtio_iommu_device;
+}
+
+static const TypeInfo virtio_iommu_info = {
+    .name = TYPE_VIRTIO_IOMMU,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .instance_size = sizeof(VirtIOIOMMU),
+    .instance_init = virtio_iommu_instance_init,
+    .class_init = virtio_iommu_class_init,
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&virtio_iommu_info);
+}
+
+type_init(virtio_register_types)
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
new file mode 100644
index 0000000..d7cf73b
--- /dev/null
+++ b/include/hw/virtio/virtio-iommu.h
@@ -0,0 +1,60 @@
+/*
+ * virtio-iommu device
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef QEMU_VIRTIO_IOMMU_H
+#define QEMU_VIRTIO_IOMMU_H
+
+#include "standard-headers/linux/virtio_iommu.h"
+#include "hw/virtio/virtio.h"
+#include "hw/pci/pci.h"
+
+#define TYPE_VIRTIO_IOMMU "virtio-iommu-device"
+#define VIRTIO_IOMMU(obj) \
+        OBJECT_CHECK(VirtIOIOMMU, (obj), TYPE_VIRTIO_IOMMU)
+
+#define IOMMU_PCI_BUS_MAX      256
+#define IOMMU_PCI_DEVFN_MAX    256
+
+typedef struct IOMMUDevice {
+    void         *viommu;
+    PCIBus       *bus;
+    int           devfn;
+    IOMMUMemoryRegion  iommu_mr;
+    AddressSpace  as;
+} IOMMUDevice;
+
+typedef struct IOMMUPciBus {
+    PCIBus       *bus;
+    IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} IOMMUPciBus;
+
+typedef struct VirtIOIOMMU {
+    VirtIODevice parent_obj;
+    VirtQueue *req_vq;
+    VirtQueue *event_vq;
+    struct virtio_iommu_config config;
+    uint32_t host_features;
+    GHashTable *as_by_busptr;
+    IOMMUPciBus *as_by_bus_num[IOMMU_PCI_BUS_MAX];
+    GTree *domains;
+    QemuMutex mutex;
+    GTree *endpoints;
+} VirtIOIOMMU;
+
+#endif
-- 
1.9.1


* [Qemu-devel] [RFC v6 07/22] virtio-iommu: Decode the command payload
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch adds the command payload decoding and
introduces the functions that will do the actual
command handling. Those functions are not yet implemented.
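
As an illustration only, the decoding step can be sketched in standalone C.
This is a minimal sketch, not the code from this patch: every sketch_* name,
the simplified request layout and the two status values are hypothetical
stand-ins, and sketch_iov_to_buf only approximates what a scatter-gather copy
helper does. It shows the idea of reading just the payload (the request minus
its tail) and rejecting short requests or a non-zero reserved field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the spec-defined request layout. */
struct sketch_iovec {
    void *base;
    size_t len;
};

struct sketch_req_tail {
    uint8_t status;
    uint8_t reserved[3];
};

struct sketch_req_attach {
    uint32_t domain;    /* little-endian on the wire */
    uint32_t endpoint;  /* little-endian on the wire */
    uint32_t reserved;  /* must be zero */
    struct sketch_req_tail tail;
};

enum sketch_status { SKETCH_S_OK, SKETCH_S_INVAL };

/* Copy up to 'bytes' bytes out of a scattered buffer; return the count copied. */
static size_t sketch_iov_to_buf(const struct sketch_iovec *iov,
                                unsigned int cnt, void *buf, size_t bytes)
{
    size_t done = 0;
    unsigned int i;

    for (i = 0; i < cnt && done < bytes; i++) {
        size_t n = iov[i].len;

        if (n > bytes - done) {
            n = bytes - done;
        }
        memcpy((uint8_t *)buf + done, iov[i].base, n);
        done += n;
    }
    return done;
}

/* Read a little-endian 32-bit field regardless of host byte order. */
static uint32_t sketch_le32(const void *p)
{
    const uint8_t *b = p;

    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

int sketch_decode_attach(const struct sketch_iovec *iov, unsigned int cnt,
                         uint32_t *domain, uint32_t *endpoint)
{
    struct sketch_req_attach req;
    /* Only the payload is read; the tail carries the status sent back. */
    size_t payload_sz = sizeof(req) - sizeof(struct sketch_req_tail);

    if (sketch_iov_to_buf(iov, cnt, &req, payload_sz) != payload_sz) {
        return SKETCH_S_INVAL;   /* short request */
    }
    if (sketch_le32(&req.reserved)) {
        return SKETCH_S_INVAL;   /* reserved field must be zero */
    }
    *domain = sketch_le32(&req.domain);
    *endpoint = sketch_le32(&req.endpoint);
    return SKETCH_S_OK;
}
```

The payload may be split across several buffers, which is why the copy goes
through a gather helper rather than a single pointer cast.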

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- change map/unmap semantics (remove size)

v4 -> v5:
- adopt new v0.5 terminology

v3 -> v4:
- no flags field anymore in struct virtio_iommu_req_unmap
- test reserved on attach/detach, change trace proto
- rebase on v2.10.0.
---
 hw/virtio/trace-events   |   6 ++-
 hw/virtio/virtio-iommu.c | 104 +++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 105 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 0094703..15e876f 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -37,4 +37,8 @@ vhost_iommu_region_del(void *p, const char *mr) "dev %p mr %s"
 virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0x%"PRIx64
 virtio_iommu_device_reset(void) "reset!"
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
-virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint8_t ioasid_bits, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" ioasid_bits=%d probe_size=0x%x"
+virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint8_t domain_bits, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_bits=%d probe_size=0x%x"
+virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
+virtio_iommu_detach(uint32_t ep_id) "endpoint=%d"
+virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
+virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 52430dc..6cbf007 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -35,29 +35,125 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+static int virtio_iommu_attach(VirtIOIOMMU *s,
+                               struct virtio_iommu_req_attach *req)
+{
+    uint32_t domain_id = le32_to_cpu(req->domain);
+    uint32_t ep_id = le32_to_cpu(req->endpoint);
+    uint32_t reserved = le32_to_cpu(req->reserved);
+
+    trace_virtio_iommu_attach(domain_id, ep_id);
+
+    if (reserved) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_detach(VirtIOIOMMU *s,
+                               struct virtio_iommu_req_detach *req)
+{
+    uint32_t ep_id = le32_to_cpu(req->endpoint);
+    uint32_t reserved = le32_to_cpu(req->reserved);
+
+    trace_virtio_iommu_detach(ep_id);
+
+    if (reserved) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_map(VirtIOIOMMU *s,
+                            struct virtio_iommu_req_map *req)
+{
+    uint32_t domain_id = le32_to_cpu(req->domain);
+    uint64_t phys_start = le64_to_cpu(req->phys_start);
+    uint64_t virt_start = le64_to_cpu(req->virt_start);
+    uint64_t virt_end = le64_to_cpu(req->virt_end);
+    uint32_t flags = le32_to_cpu(req->flags);
+
+    trace_virtio_iommu_map(domain_id, virt_start, virt_end, phys_start, flags);
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_unmap(VirtIOIOMMU *s,
+                              struct virtio_iommu_req_unmap *req)
+{
+    uint32_t domain_id = le32_to_cpu(req->domain);
+    uint64_t virt_start = le64_to_cpu(req->virt_start);
+    uint64_t virt_end = le64_to_cpu(req->virt_end);
+
+    trace_virtio_iommu_unmap(domain_id, virt_start, virt_end);
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+#define get_payload_size(req) (\
+sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
+
 static int virtio_iommu_handle_attach(VirtIOIOMMU *s,
                                       struct iovec *iov,
                                       unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_attach req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_attach(s, &req);
 }
 static int virtio_iommu_handle_detach(VirtIOIOMMU *s,
                                       struct iovec *iov,
                                       unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_detach req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_detach(s, &req);
 }
 static int virtio_iommu_handle_map(VirtIOIOMMU *s,
                                    struct iovec *iov,
                                    unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_map req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_map(s, &req);
 }
 static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
                                      struct iovec *iov,
                                      unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_unmap req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_unmap(s, &req);
 }
 
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
-- 
1.9.1


* [Qemu-devel] [RFC v6 08/22] virtio-iommu: Add the iommu regions
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch initializes the IOMMU memory regions so that
PCIe endpoint transactions get translated. The translation
function itself is not implemented yet.
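
The lookup pattern used to hand out one region per device can be sketched in
standalone C. This is a minimal sketch under stated assumptions, not the code
from this patch: all sketch_* names are hypothetical, a linked list stands in
for the hash table keyed by the bus pointer, and the memory region and address
space setup is left out. It shows lazy creation of a per-bus table and of a
per-devfn entry on first use.

```c
#include <stdlib.h>

#define SKETCH_DEVFN_MAX 256

/* Hypothetical per-device state; the real one also holds a region. */
struct sketch_dev {
    const void *bus;
    int devfn;
};

/* One table per bus, allocated on first use and keyed by bus pointer. */
struct sketch_bus {
    const void *bus;
    struct sketch_bus *next;
    struct sketch_dev *devs[SKETCH_DEVFN_MAX];
};

struct sketch_iommu {
    struct sketch_bus *buses;
};

/* Return the entry for (bus, devfn), creating it if it does not exist. */
struct sketch_dev *sketch_find_add_dev(struct sketch_iommu *s,
                                       const void *bus, int devfn)
{
    struct sketch_bus *b;

    if (devfn < 0 || devfn >= SKETCH_DEVFN_MAX) {
        return NULL;
    }
    for (b = s->buses; b; b = b->next) {
        if (b->bus == bus) {
            break;
        }
    }
    if (!b) {
        b = calloc(1, sizeof(*b));
        if (!b) {
            return NULL;
        }
        b->bus = bus;
        b->next = s->buses;
        s->buses = b;
    }
    if (!b->devs[devfn]) {
        struct sketch_dev *d = calloc(1, sizeof(*d));

        if (!d) {
            return NULL;
        }
        d->bus = bus;
        d->devfn = devfn;
        b->devs[devfn] = d;
    }
    return b->devs[devfn];
}
```

Repeated calls with the same (bus, devfn) pair return the same entry, so a
device always sees the same address space.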

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- include qapi/error.h
- fix g_hash_table_lookup key in virtio_iommu_find_add_as

v4 -> v5:
- use PCI bus handle as a key
- use get_primary_pci_bus() callback

v3 -> v4:
- add trace_virtio_iommu_init_iommu_mr

v2 -> v3:
- use IOMMUMemoryRegion
- iommu mr name built with BDF
- rename smmu_get_sid into virtio_iommu_get_sid and use PCI_BUILD_BDF
---
 hw/virtio/trace-events           |   2 +
 hw/virtio/virtio-iommu.c         | 103 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |   2 +
 3 files changed, 107 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 15e876f..7123ab2 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -42,3 +42,5 @@ virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_detach(uint32_t ep_id) "endpoint=%d"
 virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
 virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
+virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
+virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 6cbf007..0840854 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -22,7 +22,11 @@
 #include "qemu-common.h"
 #include "hw/virtio/virtio.h"
 #include "sysemu/kvm.h"
+#include "qapi/error.h"
 #include "qapi-event.h"
+#include "qemu/error-report.h"
+#include "hw/i386/pc.h"
+#include "hw/arm/virt.h"
 #include "trace.h"
 
 #include "standard-headers/linux/virtio_ids.h"
@@ -35,6 +39,50 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
+{
+    return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
+}
+
+static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
+                                              int devfn)
+{
+    VirtIOIOMMU *s = opaque;
+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+    IOMMUDevice *sdev;
+
+    if (!sbus) {
+        sbus = g_malloc0(sizeof(IOMMUPciBus) +
+                         sizeof(IOMMUDevice *) * IOMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->as_by_busptr, bus, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        char *name = g_strdup_printf("%s-%d-%d",
+                                     TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+                                     pci_bus_num(bus), devfn);
+        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(IOMMUDevice));
+
+        sdev->viommu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        trace_virtio_iommu_init_iommu_mr(name);
+
+        memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
+                                 TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+                                 OBJECT(s), name,
+                                 UINT64_MAX);
+        address_space_init(&sdev->as,
+                           MEMORY_REGION(&sdev->iommu_mr), TYPE_VIRTIO_IOMMU);
+    }
+
+    return &sdev->as;
+
+}
+
 static int virtio_iommu_attach(VirtIOIOMMU *s,
                                struct virtio_iommu_req_attach *req)
 {
@@ -215,6 +263,26 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
+static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+                                            IOMMUAccessFlags flag)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    uint32_t sid;
+
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = IOMMU_NONE,
+    };
+
+    sid = virtio_iommu_get_sid(sdev);
+
+    trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+    return entry;
+}
+
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
@@ -269,6 +337,17 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
+    MachineState *ms = MACHINE(qdev_get_machine());
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    PCIBus *pcibus;
+
+    if (!mc->get_primary_pci_bus) {
+        goto err;
+    }
+    pcibus = mc->get_primary_pci_bus(ms);
+    if (!pcibus) {
+        goto err;
+    }
 
     virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
                 sizeof(struct virtio_iommu_config));
@@ -279,6 +358,14 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
+
+    memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
+    s->as_by_busptr = g_hash_table_new(NULL, NULL);
+
+    pci_setup_iommu(pcibus, virtio_iommu_find_add_as, s);
+    return;
+err:
+    error_setg(&error_fatal, "virtio-iommu: no pci bus identified");
 }
 
 static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
@@ -336,6 +423,14 @@ static void virtio_iommu_class_init(ObjectClass *klass, void *data)
     vdc->vmsd = &vmstate_virtio_iommu_device;
 }
 
+static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
+                                                  void *data)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+    imrc->translate = virtio_iommu_translate;
+}
+
 static const TypeInfo virtio_iommu_info = {
     .name = TYPE_VIRTIO_IOMMU,
     .parent = TYPE_VIRTIO_DEVICE,
@@ -344,9 +439,17 @@ static const TypeInfo virtio_iommu_info = {
     .class_init = virtio_iommu_class_init,
 };
 
+static const TypeInfo virtio_iommu_memory_region_info = {
+    .parent = TYPE_IOMMU_MEMORY_REGION,
+    .name = TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+    .class_init = virtio_iommu_memory_region_class_init,
+};
+
+
 static void virtio_register_types(void)
 {
     type_register_static(&virtio_iommu_info);
+    type_register_static(&virtio_iommu_memory_region_info);
 }
 
 type_init(virtio_register_types)
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index d7cf73b..0b6b3f2 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -28,6 +28,8 @@
 #define VIRTIO_IOMMU(obj) \
         OBJECT_CHECK(VirtIOIOMMU, (obj), TYPE_VIRTIO_IOMMU)
 
+#define TYPE_VIRTIO_IOMMU_MEMORY_REGION "virtio-iommu-memory-region"
+
 #define IOMMU_PCI_BUS_MAX      256
 #define IOMMU_PCI_DEVFN_MAX    256
 
-- 
1.9.1


* [Qemu-devel] [RFC v6 09/22] virtio-iommu: Register attached endpoints
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch introduces the domain and endpoint internal
data types. Both are stored in balanced binary trees (GTree).
Each domain owns a list of the endpoints attached to it.

It is assumed the endpoint ID corresponds to the PCI BDF.
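
To make the endpoint ID assumption concrete, here is a minimal standalone
sketch. The sketch_* helpers are hypothetical and only mirror the conventional
PCI encoding (bus number in the upper byte, device/function in the lower
byte); the three-way comparison has the same shape as the integer key
comparator used for the trees.

```c
#include <stdint.h>

/* devfn packs a 5-bit slot (device) number and a 3-bit function number. */
uint8_t sketch_devfn(uint8_t slot, uint8_t func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* BDF: bus number in bits 15:8, devfn in bits 7:0. */
uint16_t sketch_bdf(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* Three-way compare of integer keys; avoids the overflow of 'a - b'. */
int sketch_id_cmp(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}
```

For example, the device in slot 2, function 0 on bus 1 gets devfn 0x10 and
endpoint ID 0x0110 under this encoding.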

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v4 -> v5:
- initialize as->endpoint_list

v3 -> v4:
- new separate patch
---
 hw/virtio/trace-events   |   4 ++
 hw/virtio/virtio-iommu.c | 123 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 7123ab2..a7743d2 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -44,3 +44,7 @@ virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uin
 virtio_iommu_unmap(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64
 virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
 virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
+virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d"
+virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
+virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
+virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 0840854..207b17a 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -35,15 +35,118 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 #include "hw/virtio/virtio-iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci.h"
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+typedef struct viommu_domain {
+    uint32_t id;
+    GTree *mappings;
+    QLIST_HEAD(, viommu_endpoint) endpoint_list;
+} viommu_domain;
+
+typedef struct viommu_endpoint {
+    uint32_t id;
+    viommu_domain *domain;
+    QLIST_ENTRY(viommu_endpoint) next;
+    VirtIOIOMMU *viommu;
+} viommu_endpoint;
+
+typedef struct viommu_interval {
+    uint64_t low;
+    uint64_t high;
+} viommu_interval;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
 }
 
+static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
+{
+    viommu_interval *inta = (viommu_interval *)a;
+    viommu_interval *intb = (viommu_interval *)b;
+
+    if (inta->high < intb->low) {
+        return -1;
+    } else if (intb->high < inta->low) {
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
+{
+    QLIST_REMOVE(ep, next);
+    ep->domain = NULL;
+}
+
+static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
+                                                  uint32_t ep_id)
+{
+    viommu_endpoint *ep;
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id));
+    if (ep) {
+        return ep;
+    }
+    ep = g_malloc0(sizeof(*ep));
+    ep->id = ep_id;
+    ep->viommu = s;
+    trace_virtio_iommu_get_endpoint(ep_id);
+    g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
+    return ep;
+}
+
+static void virtio_iommu_put_endpoint(gpointer data)
+{
+    viommu_endpoint *ep = (viommu_endpoint *)data;
+
+    if (ep->domain) {
+        g_tree_unref(ep->domain->mappings);
+        virtio_iommu_detach_endpoint_from_domain(ep);
+    }
+
+    trace_virtio_iommu_put_endpoint(ep->id);
+    g_free(ep);
+}
+
+viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id);
+viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id)
+{
+    viommu_domain *domain;
+
+    domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+    if (domain) {
+        return domain;
+    }
+    domain = g_malloc0(sizeof(*domain));
+    domain->id = domain_id;
+    domain->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
+                                   NULL, (GDestroyNotify)g_free,
+                                   (GDestroyNotify)g_free);
+    g_tree_insert(s->domains, GUINT_TO_POINTER(domain_id), domain);
+    QLIST_INIT(&domain->endpoint_list);
+    trace_virtio_iommu_get_domain(domain_id);
+    return domain;
+}
+
+static void virtio_iommu_put_domain(gpointer data)
+{
+    viommu_domain *domain = (viommu_domain *)data;
+    viommu_endpoint *iter, *tmp;
+
+    QLIST_FOREACH_SAFE(iter, &domain->endpoint_list, next, tmp) {
+        virtio_iommu_detach_endpoint_from_domain(iter);
+    }
+    g_tree_destroy(domain->mappings);
+    trace_virtio_iommu_put_domain(domain->id);
+    g_free(domain);
+}
+
 static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
                                               int devfn)
 {
@@ -69,6 +172,8 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
         sdev->bus = bus;
         sdev->devfn = devfn;
 
+        virtio_iommu_get_endpoint(s, PCI_BUILD_BDF(pci_bus_num(bus), devfn));
+
         trace_virtio_iommu_init_iommu_mr(name);
 
         memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
@@ -333,6 +438,13 @@ static const VMStateDescription vmstate_virtio_iommu_device = {
     },
 };
 
+static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
+{
+    uint ua = GPOINTER_TO_UINT(a);
+    uint ub = GPOINTER_TO_UINT(b);
+    return (ua > ub) - (ua < ub);
+}
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -359,10 +471,17 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
 
+    qemu_mutex_init(&s->mutex);
+
     memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
     s->as_by_busptr = g_hash_table_new(NULL, NULL);
 
     pci_setup_iommu(pcibus, virtio_iommu_find_add_as, s);
+
+    s->domains = g_tree_new_full((GCompareDataFunc)int_cmp,
+                                 NULL, NULL, virtio_iommu_put_domain);
+    s->endpoints = g_tree_new_full((GCompareDataFunc)int_cmp,
+                                   NULL, NULL, virtio_iommu_put_endpoint);
     return;
 err:
     error_setg(&error_fatal, "virtio-iommu: no pci bus identified");
@@ -371,6 +490,10 @@ err:
 static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
+
+    g_tree_destroy(s->domains);
+    g_tree_destroy(s->endpoints);
 
     virtio_cleanup(vdev);
 }
-- 
1.9.1


* [Qemu-devel] [RFC v6 10/22] virtio-iommu: Implement attach/detach command
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements attaching an endpoint to a domain
and detaching it from that domain.
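
The attach/detach semantics can be sketched in standalone C. This is a minimal
sketch under stated assumptions: the sketch_* types and status values are
hypothetical, a singly linked list replaces QLIST, and the domain/endpoint
lookup and the reference counting on the mappings are left out. It shows that
attaching an already attached endpoint first removes it from its previous
domain, and that detaching an unattached endpoint is reported as an error.

```c
#include <stddef.h>

enum sketch_status { SKETCH_OK, SKETCH_INVAL };

struct sketch_domain;

struct sketch_endpoint {
    struct sketch_domain *domain;   /* NULL when not attached */
    struct sketch_endpoint *next;   /* link in the domain's list */
};

struct sketch_domain {
    struct sketch_endpoint *endpoints;
};

/* Remove 'ep' from the list of the domain it is attached to. */
static void sketch_unlink(struct sketch_endpoint *ep)
{
    struct sketch_endpoint **pp = &ep->domain->endpoints;

    while (*pp && *pp != ep) {
        pp = &(*pp)->next;
    }
    if (*pp) {
        *pp = ep->next;
    }
    ep->next = NULL;
    ep->domain = NULL;
}

int sketch_attach(struct sketch_domain *d, struct sketch_endpoint *ep)
{
    if (ep->domain) {
        /* already attached somewhere: detach first */
        sketch_unlink(ep);
    }
    ep->next = d->endpoints;
    d->endpoints = ep;
    ep->domain = d;
    return SKETCH_OK;
}

int sketch_detach(struct sketch_endpoint *ep)
{
    if (!ep->domain) {
        return SKETCH_INVAL;
    }
    sketch_unlink(ep);
    return SKETCH_OK;
}
```

An endpoint therefore belongs to at most one domain at any time.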

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/virtio-iommu.c | 39 +++++++++++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 207b17a..69ff516 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -114,8 +114,8 @@ static void virtio_iommu_put_endpoint(gpointer data)
     g_free(ep);
 }
 
-viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id);
-viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s, uint32_t domain_id)
+static viommu_domain *virtio_iommu_get_domain(VirtIOIOMMU *s,
+                                              uint32_t domain_id)
 {
     viommu_domain *domain;
 
@@ -194,6 +194,8 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t domain_id = le32_to_cpu(req->domain);
     uint32_t ep_id = le32_to_cpu(req->endpoint);
     uint32_t reserved = le32_to_cpu(req->reserved);
+    viommu_domain *domain;
+    viommu_endpoint *ep;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -201,7 +203,22 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
         return VIRTIO_IOMMU_S_INVAL;
     }
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    ep = virtio_iommu_get_endpoint(s, ep_id);
+    if (ep->domain) {
+        /*
+         * the device is already attached to a domain,
+         * detach it first
+         */
+        virtio_iommu_detach_endpoint_from_domain(ep);
+    }
+
+    domain = virtio_iommu_get_domain(s, domain_id);
+    QLIST_INSERT_HEAD(&domain->endpoint_list, ep, next);
+
+    ep->domain = domain;
+    g_tree_ref(domain->mappings);
+
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_detach(VirtIOIOMMU *s,
@@ -209,14 +226,24 @@ static int virtio_iommu_detach(VirtIOIOMMU *s,
 {
     uint32_t ep_id = le32_to_cpu(req->endpoint);
     uint32_t reserved = le32_to_cpu(req->reserved);
-
-    trace_virtio_iommu_detach(ep_id);
+    viommu_endpoint *ep;
 
     if (reserved) {
         return VIRTIO_IOMMU_S_INVAL;
     }
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id));
+    if (!ep) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
+    if (!ep->domain) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    virtio_iommu_detach_endpoint_from_domain(ep);
+    trace_virtio_iommu_detach(ep_id);
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_map(VirtIOIOMMU *s,
-- 
1.9.1


* [Qemu-devel] [RFC v6 11/22] virtio-iommu: Implement map/unmap
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements virtio_iommu_map/unmap.
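
Mappings are keyed by the interval they cover, and any two overlapping
intervals compare as equal, so a tree lookup finds a conflicting mapping. The
standalone sketch below illustrates that idea. The sketch_* names are
hypothetical, and treating both bounds as inclusive (hence the strict
comparisons) is an assumption made for this sketch based on the
virt_start/virt_end request fields.

```c
#include <stdint.h>

/* Hypothetical interval type; 'low' and 'high' are both inclusive. */
struct sketch_interval {
    uint64_t low;
    uint64_t high;
};

/*
 * Order intervals so that overlapping ones compare equal:
 * -1 if 'a' lies entirely below 'b', 1 if entirely above, 0 on overlap.
 */
int sketch_interval_cmp(const struct sketch_interval *a,
                        const struct sketch_interval *b)
{
    if (a->high < b->low) {
        return -1;
    }
    if (b->high < a->low) {
        return 1;
    }
    return 0;
}

/*
 * Size in bytes of an inclusive interval. A range covering the whole
 * 64-bit space wraps to 0, which a caller would have to special-case.
 */
uint64_t sketch_interval_size(const struct sketch_interval *i)
{
    return i->high - i->low + 1;
}
```

With this ordering, a map request for a range that touches an existing mapping
finds it on lookup and can be rejected before anything is inserted.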

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use new v0.6 fields
- replace error_report by qemu_log_mask

v3 -> v4:
- implement unmap semantics as specified in v0.4
---
 hw/virtio/trace-events   |  3 ++
 hw/virtio/virtio-iommu.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a7743d2..7f7ea50 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -48,3 +48,6 @@ virtio_iommu_get_endpoint(uint32_t ep_id) "Alloc endpoint=%d"
 virtio_iommu_put_endpoint(uint32_t ep_id) "Free endpoint=%d"
 virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
+virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 69ff516..1986565 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "qemu/iov.h"
 #include "qemu-common.h"
 #include "hw/virtio/virtio.h"
@@ -59,6 +60,13 @@ typedef struct viommu_interval {
     uint64_t high;
 } viommu_interval;
 
+typedef struct viommu_mapping {
+    uint64_t virt_addr;
+    uint64_t phys_addr;
+    uint64_t size;
+    uint32_t flags;
+} viommu_mapping;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -254,10 +262,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
     uint32_t flags = le32_to_cpu(req->flags);
+    viommu_domain *domain;
+    viommu_interval *interval;
+    viommu_mapping *mapping;
+
+    interval = g_malloc0(sizeof(*interval));
+
+    interval->low = virt_start;
+    interval->high = virt_end;
+
+    domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+    if (!domain) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
+    mapping = g_tree_lookup(domain->mappings, (gpointer)interval);
+    if (mapping) {
+        g_free(interval);
+        return VIRTIO_IOMMU_S_INVAL;
+    }
 
     trace_virtio_iommu_map(domain_id, virt_start, virt_end, phys_start, flags);
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    mapping = g_malloc0(sizeof(*mapping));
+    mapping->virt_addr = virt_start;
+    mapping->phys_addr = phys_start;
+    mapping->size = virt_end - virt_start + 1;
+    mapping->flags = flags;
+
+    g_tree_insert(domain->mappings, interval, mapping);
+
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
@@ -266,10 +301,65 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     uint32_t domain_id = le32_to_cpu(req->domain);
     uint64_t virt_start = le64_to_cpu(req->virt_start);
     uint64_t virt_end = le64_to_cpu(req->virt_end);
+    uint64_t size = virt_end - virt_start + 1;
+    viommu_mapping *mapping;
+    viommu_interval interval;
+    viommu_domain *domain;
 
     trace_virtio_iommu_unmap(domain_id, virt_start, virt_end);
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    domain = g_tree_lookup(s->domains, GUINT_TO_POINTER(domain_id));
+    if (!domain) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: no domain\n", __func__);
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+    interval.low = virt_start;
+    interval.high = virt_end;
+
+    mapping = g_tree_lookup(domain->mappings, (gpointer)(&interval));
+
+    while (mapping) {
+        viommu_interval current;
+        uint64_t low  = mapping->virt_addr;
+        uint64_t high = mapping->virt_addr + mapping->size - 1;
+
+        current.low = low;
+        current.high = high;
+
+        if (low == interval.low && size >= mapping->size) {
+            g_tree_remove(domain->mappings, (gpointer)(&current));
+            interval.low = high + 1;
+            trace_virtio_iommu_unmap_left_interval(current.low, current.high,
+                interval.low, interval.high);
+        } else if (high == interval.high && size >= mapping->size) {
+            trace_virtio_iommu_unmap_right_interval(current.low, current.high,
+                interval.low, interval.high);
+            g_tree_remove(domain->mappings, (gpointer)(&current));
+            interval.high = low - 1;
+        } else if (low > interval.low && high < interval.high) {
+            trace_virtio_iommu_unmap_inc_interval(current.low, current.high);
+            g_tree_remove(domain->mappings, (gpointer)(&current));
+        } else {
+            break;
+        }
+        if (interval.low >= interval.high) {
+            return VIRTIO_IOMMU_S_OK;
+        } else {
+            mapping = g_tree_lookup(domain->mappings, (gpointer)(&interval));
+        }
+    }
+
+    if (mapping) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "****** %s: Unmap 0x%"PRIx64" size=0x%"PRIx64
+                     " from 0x%"PRIx64" size=0x%"PRIx64" is not supported\n",
+                     __func__, interval.low, size,
+                     mapping->virt_addr, mapping->size);
+    } else {
+        return VIRTIO_IOMMU_S_OK;
+    }
+
+    return VIRTIO_IOMMU_S_INVAL;
 }
 
 #define get_payload_size(req) (\
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v6 12/22] virtio-iommu: Implement translate
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (10 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 11/22] virtio-iommu: Implement map/unmap Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 13/22] virtio-iommu: Implement probe request Eric Auger
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements the translate callback.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- replace error_report by qemu_log_mask

v4 -> v5:
- check the device domain is not NULL
- s/printf/error_report
- set flags to IOMMU_NONE in case of all translation faults
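
Illustration only, not part of the patch: on a hit, the callback rebases
the input address from the virtual start of the mapping onto its physical
start, and derives addr_mask from the smallest page size advertised in
page_size_mask. A minimal sketch with 64-bit arithmetic; struct mapping,
translate_addr() and granule_mask() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for viommu_mapping. */
struct mapping {
    uint64_t virt_addr;
    uint64_t phys_addr;
    uint64_t size;
};

/* Keep the offset inside the mapping, rebase it onto the physical start. */
static uint64_t translate_addr(const struct mapping *m, uint64_t addr)
{
    assert(addr >= m->virt_addr && addr - m->virt_addr < m->size);
    return addr - m->virt_addr + m->phys_addr;
}

/* Mask covering the smallest page size set in page_size_mask. */
static uint64_t granule_mask(uint64_t page_size_mask)
{
    assert(page_size_mask != 0);
    /* x & -x isolates the lowest set bit, i.e. the smallest page size. */
    return (page_size_mask & (~page_size_mask + 1)) - 1;
}
```

With a mapping of virt 0x10000 to phys 0x80000000, address 0x10abc
translates to 0x80000abc; a 4K-aligned page_size_mask yields a mask of
0xfff.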
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 7f7ea50..785d1fa 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -51,3 +51,4 @@ virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1986565..8c806dc 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -489,19 +489,62 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
                                             IOMMUAccessFlags flag)
 {
     IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
     uint32_t sid;
+    viommu_endpoint *ep;
+    viommu_mapping *mapping;
+    viommu_interval interval;
+
+    interval.low = addr;
+    interval.high = addr + 1;
 
     IOMMUTLBEntry entry = {
         .target_as = &address_space_memory,
         .iova = addr,
         .translated_addr = addr,
-        .addr_mask = ~(hwaddr)0,
+        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
         .perm = IOMMU_NONE,
     };
 
     sid = virtio_iommu_get_sid(sdev);
 
     trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+    qemu_mutex_lock(&s->mutex);
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep) {
+        error_report("%s sid=%d is not known!!", __func__, sid);
+        goto unlock;
+    }
+
+    if (!ep->domain) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s %02x:%02x.%01x not attached to any domain\n",
+                      __func__, PCI_BUS_NUM(sid), PCI_SLOT(sid), PCI_FUNC(sid));
+        goto unlock;
+    }
+
+    mapping = g_tree_lookup(ep->domain->mappings, (gpointer)(&interval));
+    if (!mapping) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s no mapping for 0x%"PRIx64" for sid=%d\n",
+                      __func__, addr, sid);
+        goto unlock;
+    }
+
+    if (((flag & IOMMU_RO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
+        ((flag & IOMMU_WO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "Permission error on 0x%"PRIx64"(%d): allowed=%d\n",
+                      addr, flag, mapping->flags);
+        goto unlock;
+    }
+    entry.translated_addr = addr - mapping->virt_addr + mapping->phys_addr;
+    entry.perm = flag;
+    trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
     return entry;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v6 13/22] virtio-iommu: Implement probe request
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (11 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 12/22] virtio-iommu: Implement translate Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 14/22] virtio-iommu: Add an msi_bypass property Eric Auger
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements the PROBE request. At the moment,
no reserved regions are returned as none are registered
per device. Only a NONE property is returned.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4 -> v5:
- initialize bufstate.error to false
- add cpu_to_le64(size)
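
Illustration only, not part of the patch: the probe buffer is filled with
a sequence of properties, each made of a 4-byte header (16-bit type,
16-bit length, both little-endian) followed by its value, and ends with a
NONE property. A minimal sketch of the append step with a bounds check;
struct prop_buf and prop_append() are hypothetical names, and this sketch
accepts a property that exactly fills the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROP_HDR_SIZE 4u /* le16 type + le16 length */

/* Hypothetical buffer state, loosely modelled on viommu_property_buffer. */
struct prop_buf {
    uint8_t *start;
    size_t capacity;
    size_t filled;
};

/*
 * Append one property (header + value). Returns 0 on success, or -1 if
 * the property does not fit, in which case the buffer is left unchanged.
 */
static int prop_append(struct prop_buf *b, uint16_t type,
                       const void *value, uint16_t length)
{
    uint8_t *p;

    assert(b->filled <= b->capacity);
    if (b->capacity - b->filled < PROP_HDR_SIZE + (size_t)length) {
        return -1;
    }
    p = b->start + b->filled;
    /* Store both header fields little-endian, byte by byte. */
    p[0] = (uint8_t)(type & 0xff);
    p[1] = (uint8_t)(type >> 8);
    p[2] = (uint8_t)(length & 0xff);
    p[3] = (uint8_t)(length >> 8);
    if (length) {
        memcpy(p + PROP_HDR_SIZE, value, length);
    }
    b->filled += PROP_HDR_SIZE + (size_t)length;
    return 0;
}
```

A caller would append each supported property in turn and finish with a
zero-length terminator, stopping at the first -1.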
---
 hw/virtio/trace-events   |   2 +
 hw/virtio/virtio-iommu.c | 190 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 190 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 785d1fa..e3a916c 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -52,3 +52,5 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
+virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx"
+virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 8c806dc..85c5b95 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -41,6 +41,11 @@
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
+#define VIOMMU_PROBE_SIZE 512
+
+#define SUPPORTED_PROBE_PROPERTIES (\
+    VIRTIO_IOMMU_PROBE_T_NONE | \
+    VIRTIO_IOMMU_PROBE_T_RESV_MEM)
 
 typedef struct viommu_domain {
     uint32_t id;
@@ -53,6 +58,7 @@ typedef struct viommu_endpoint {
     viommu_domain *domain;
     QLIST_ENTRY(viommu_endpoint) next;
     VirtIOIOMMU *viommu;
+    GTree *reserved_regions;
 } viommu_endpoint;
 
 typedef struct viommu_interval {
@@ -67,6 +73,13 @@ typedef struct viommu_mapping {
     uint32_t flags;
 } viommu_mapping;
 
+typedef struct viommu_property_buffer {
+    viommu_endpoint *endpoint;
+    size_t filled;
+    uint8_t *start;
+    bool error;
+} viommu_property_buffer;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -106,6 +119,9 @@ static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     ep->viommu = s;
     trace_virtio_iommu_get_endpoint(ep_id);
     g_tree_insert(s->endpoints, GUINT_TO_POINTER(ep_id), ep);
+    ep->reserved_regions = g_tree_new_full((GCompareDataFunc)interval_cmp,
+                                            NULL, (GDestroyNotify)g_free,
+                                            (GDestroyNotify)g_free);
     return ep;
 }
 
@@ -119,6 +135,7 @@ static void virtio_iommu_put_endpoint(gpointer data)
     }
 
     trace_virtio_iommu_put_endpoint(ep->id);
+    g_tree_destroy(ep->reserved_regions);
     g_free(ep);
 }
 
@@ -362,6 +379,139 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     return VIRTIO_IOMMU_S_INVAL;
 }
 
+/**
+ * virtio_iommu_fill_resv_mem_prop - Add a RESV_MEM probe
+ * property into the probe request buffer
+ *
+ * @key: interval handle
+ * @value: handle to the reserved memory region
+ * @data: handle to the probe request buffer state
+ */
+static gboolean virtio_iommu_fill_resv_mem_prop(gpointer key,
+                                                gpointer value,
+                                                gpointer data)
+{
+    struct virtio_iommu_probe_resv_mem *resv =
+        (struct virtio_iommu_probe_resv_mem *)value;
+    struct virtio_iommu_probe_property *prop;
+    struct virtio_iommu_probe_resv_mem *current;
+    viommu_property_buffer *bufstate = (viommu_property_buffer *)data;
+    size_t size = sizeof(*resv), total_size;
+
+    total_size = size + sizeof(*prop);
+
+    if (bufstate->filled + total_size >= VIOMMU_PROBE_SIZE) {
+        bufstate->error = true;
+        /* get the traversal stopped by returning true */
+        return true;
+    }
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_RESV_MEM) &
+                    VIRTIO_IOMMU_PROBE_T_MASK;
+    prop->length = cpu_to_le16(size);
+
+    current = (struct virtio_iommu_probe_resv_mem *)prop->value;
+    *current = *resv;
+    bufstate->filled += total_size;
+    trace_virtio_iommu_fill_resv_property(bufstate->endpoint->id,
+                                          resv->subtype, resv->addr,
+                                          resv->size, resv->subtype,
+                                          bufstate->filled);
+    return false;
+}
+
+static int virtio_iommu_fill_none_prop(viommu_property_buffer *bufstate)
+{
+    struct virtio_iommu_probe_property *prop;
+
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_NONE)
+                    & VIRTIO_IOMMU_PROBE_T_MASK;
+    prop->length = 0;
+    bufstate->filled += sizeof(*prop);
+    trace_virtio_iommu_fill_none_property(bufstate->endpoint->id);
+    return 0;
+}
+
+static int virtio_iommu_fill_property(int type,
+                                      viommu_property_buffer *bufstate)
+{
+    int ret = -ENOSPC;
+
+    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
+        /* Even the property header cannot be filled */
+        bufstate->error = true;
+        goto out;
+    }
+
+    switch (type) {
+    case VIRTIO_IOMMU_PROBE_T_NONE:
+        ret = virtio_iommu_fill_none_prop(bufstate);
+        break;
+    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
+    {
+        viommu_endpoint *ep = bufstate->endpoint;
+
+        g_tree_foreach(ep->reserved_regions,
+                       virtio_iommu_fill_resv_mem_prop,
+                       bufstate);
+        if (!bufstate->error) {
+            ret = 0;
+        }
+        break;
+    }
+    default:
+        ret = -ENOENT;
+        break;
+    }
+out:
+    if (ret) {
+        error_report("%s property of type=%d could not be filled (%d),"
+                     " remaining size = 0x%lx",
+                     __func__, type, ret, bufstate->filled);
+    }
+    return ret;
+}
+
+/**
+ * virtio_iommu_probe - Fill the probe request buffer with all
+ * the properties the device is able to return and add a NONE
+ * property at the end.
+ */
+static int virtio_iommu_probe(VirtIOIOMMU *s,
+                              struct virtio_iommu_req_probe *req,
+                              uint8_t *buf)
+{
+    uint32_t ep_id = le32_to_cpu(req->endpoint);
+    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
+    viommu_property_buffer bufstate;
+    viommu_endpoint *ep;
+    int ret;
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(ep_id));
+    if (!ep) {
+        return -EINVAL;
+    }
+
+    bufstate.start = buf;
+    bufstate.filled = 0;
+    bufstate.error = false;
+    bufstate.endpoint = ep;
+
+    while ((type = ctz32(prop_types)) != 32) {
+        ret = virtio_iommu_fill_property(1 << type, &bufstate);
+        if (ret) {
+            break;
+        }
+        prop_types &= ~(1 << type);
+    }
+    virtio_iommu_fill_property(VIRTIO_IOMMU_PROBE_T_NONE, &bufstate);
+
+    return VIRTIO_IOMMU_S_OK;
+}
+
 #define get_payload_size(req) (\
 sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
 
@@ -426,6 +576,24 @@ static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
     return virtio_iommu_unmap(s, &req);
 }
 
+static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
+                                     struct iovec *iov,
+                                     unsigned int iov_cnt,
+                                     uint8_t *buf)
+{
+    struct virtio_iommu_req_probe req;
+    size_t sz, payload_sz;
+
+    payload_sz = sizeof(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return virtio_iommu_probe(s, &req, buf);
+}
+
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
@@ -470,16 +638,32 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
         case VIRTIO_IOMMU_T_UNMAP:
             tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
             break;
+        case VIRTIO_IOMMU_T_PROBE:
+        {
+            struct virtio_iommu_req_tail *ptail;
+            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
+
+            ptail = (struct virtio_iommu_req_tail *)
+                        (buf + s->config.probe_size);
+            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
+
+            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                              buf, s->config.probe_size + sizeof(tail));
+            g_free(buf);
+            assert(sz == s->config.probe_size + sizeof(tail));
+            goto push;
+        }
         default:
             tail.status = VIRTIO_IOMMU_S_UNSUPP;
         }
-        qemu_mutex_unlock(&s->mutex);
 
         sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
                           &tail, sizeof(tail));
         assert(sz == sizeof(tail));
 
-        virtqueue_push(vq, elem, sizeof(tail));
+push:
+        qemu_mutex_unlock(&s->mutex);
+        virtqueue_push(vq, elem, sz);
         virtio_notify(vdev, vq);
         g_free(elem);
     }
@@ -575,6 +759,7 @@ static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
     virtio_add_feature(&f, VIRTIO_RING_F_INDIRECT_DESC);
     virtio_add_feature(&f, VIRTIO_IOMMU_F_INPUT_RANGE);
     virtio_add_feature(&f, VIRTIO_IOMMU_F_MAP_UNMAP);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_PROBE);
     return f;
 }
 
@@ -630,6 +815,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
+    s->config.probe_size = VIOMMU_PROBE_SIZE;
 
     qemu_mutex_init(&s->mutex);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v6 14/22] virtio-iommu: Add an msi_bypass property
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (12 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 13/22] virtio-iommu: Implement probe request Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting Eric Auger
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

When the msi_bypass property is set, we register the IOAPIC
MSI window as a reserved region of type MSI for each endpoint:
0xFEE00000 - 0xFEEFFFFF.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
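
Illustration only, not part of the patch: the region is registered as a
(start, size) pair and stored as an inclusive interval, so the upper bound
is start + size - 1. A minimal sketch of that conversion; struct range and
range_from_size() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive address range [low, high], mirroring viommu_interval. */
struct range {
    uint64_t low;
    uint64_t high;
};

/* Convert a (start, size) pair into an inclusive [low, high] range. */
static struct range range_from_size(uint64_t start, uint64_t size)
{
    struct range r;

    /* Reject empty regions and regions that wrap around 2^64. */
    assert(size > 0 && start <= UINT64_MAX - (size - 1));
    r.low = start;
    r.high = start + size - 1;
    return r;
}
```

For start 0xfee00000 and size 0x100000 this gives the window quoted in the
commit message.
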
---
 hw/virtio/virtio-iommu.c         | 52 ++++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |  1 +
 2 files changed, 53 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 85c5b95..c81a24b 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -43,6 +43,9 @@
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 #define VIOMMU_PROBE_SIZE 512
 
+#define IOAPIC_RANGE_START      (0xfee00000)
+#define IOAPIC_RANGE_SIZE       (0x100000)
+
 #define SUPPORTED_PROBE_PROPERTIES (\
     VIRTIO_IOMMU_PROBE_T_NONE | \
     VIRTIO_IOMMU_PROBE_T_RESV_MEM)
@@ -105,6 +108,25 @@ static void virtio_iommu_detach_endpoint_from_domain(viommu_endpoint *ep)
     ep->domain = NULL;
 }
 
+static void virtio_iommu_register_resv_region(viommu_endpoint *ep,
+                                              uint8_t subtype,
+                                              uint64_t addr, uint64_t size)
+{
+    viommu_interval *interval;
+    struct virtio_iommu_probe_resv_mem *reg;
+
+    interval = g_malloc0(sizeof(*interval));
+    interval->low = addr;
+    interval->high = addr + size - 1;
+
+    reg = g_malloc0(sizeof(*reg));
+    reg->subtype = subtype;
+    reg->addr = cpu_to_le64(addr);
+    reg->size = cpu_to_le64(size);
+
+    g_tree_insert(ep->reserved_regions, interval, reg);
+}
+
 static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
                                                   uint32_t ep_id)
 {
@@ -122,6 +144,12 @@ static viommu_endpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
     ep->reserved_regions = g_tree_new_full((GCompareDataFunc)interval_cmp,
                                             NULL, (GDestroyNotify)g_free,
                                             (GDestroyNotify)g_free);
+    if (s->msi_bypass) {
+        virtio_iommu_register_resv_region(ep, VIRTIO_IOMMU_RESV_MEM_T_MSI,
+                                          IOAPIC_RANGE_START,
+                                          IOAPIC_RANGE_SIZE);
+    }
+
     return ep;
 }
 
@@ -854,8 +882,32 @@ static void virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
     trace_virtio_iommu_device_status(status);
 }
 
+static bool virtio_iommu_get_msi_bypass(Object *obj, Error **errp)
+{
+    VirtIOIOMMU *s = VIRTIO_IOMMU(obj);
+
+    return s->msi_bypass;
+}
+
+static void virtio_iommu_set_msi_bypass(Object *obj, bool value, Error **errp)
+{
+    VirtIOIOMMU *s = VIRTIO_IOMMU(obj);
+
+    s->msi_bypass = value;
+}
+
 static void virtio_iommu_instance_init(Object *obj)
 {
+    VirtIOIOMMU *s = VIRTIO_IOMMU(obj);
+
+    object_property_add_bool(obj, "msi_bypass", virtio_iommu_get_msi_bypass,
+                             virtio_iommu_set_msi_bypass, NULL);
+    object_property_set_description(obj, "msi_bypass",
+                                    "Indicates whether msis are bypassed by "
+                                    "the IOMMU. Default is YES",
+                                    NULL);
+
+    s->msi_bypass = true;
 }
 
 static const VMStateDescription vmstate_virtio_iommu = {
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 0b6b3f2..458b2a0 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -57,6 +57,7 @@ typedef struct VirtIOIOMMU {
     GTree *domains;
     QemuMutex mutex;
     GTree *endpoints;
+    bool msi_bypass;
 } VirtIOIOMMU;
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (13 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 14/22] virtio-iommu: Add an msi_bypass property Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-03-21 13:15   ` Jean-Philippe Brucker
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 16/22] virtio_iommu: Handle reserved regions in translation process Eric Auger
                   ` (6 subsequent siblings)
  21 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

The event queue allows asynchronous errors to be reported.
The translate function now injects faults when relevant.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
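
Illustration only, not part of the patch: for a permission error, the
fault flags record which of the requested accesses the mapping does not
allow, plus a flag saying the address field is valid. A minimal sketch of
that computation; the bit values and fault_flags() are placeholders for
illustration, the real definitions come from the virtio-iommu header.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values, chosen for illustration only. */
#define ACCESS_READ     (1u << 0)
#define ACCESS_WRITE    (1u << 1)
#define FAULT_F_READ    (1u << 0)
#define FAULT_F_WRITE   (1u << 1)
#define FAULT_F_ADDRESS (1u << 8)

/*
 * Return 0 when every requested access is allowed by the mapping,
 * otherwise the flags to report in the fault event.
 */
static uint32_t fault_flags(uint32_t requested, uint32_t allowed)
{
    uint32_t flags = 0;

    if ((requested & ACCESS_READ) && !(allowed & ACCESS_READ)) {
        flags |= FAULT_F_READ;
    }
    if ((requested & ACCESS_WRITE) && !(allowed & ACCESS_WRITE)) {
        flags |= FAULT_F_WRITE;
    }
    if (flags) {
        /* The faulting address is known, so mark it as valid. */
        flags |= FAULT_F_ADDRESS;
    }
    return flags;
}
```

A write to a read-only mapping yields FAULT_F_WRITE | FAULT_F_ADDRESS; a
read from a read/write mapping yields 0 and no fault is reported.
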
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 67 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index e3a916c..ce718e8 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -54,3 +54,4 @@ virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRI
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx"
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
+virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index c81a24b..010e8e0 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -697,15 +697,61 @@ push:
     }
 }
 
+static void virtio_iommu_report_fault(VirtIOIOMMU *viommu, uint8_t reason,
+                                      uint32_t flags, uint32_t endpoint,
+                                      uint64_t address)
+{
+    VirtIODevice *vdev = &viommu->parent_obj;
+    VirtQueue *vq = viommu->event_vq;
+    struct virtio_iommu_fault fault;
+    VirtQueueElement *elem;
+    size_t sz;
+
+    memset(&fault, 0, sizeof(fault));
+    fault.reason = reason;
+    fault.flags = flags;
+    fault.endpoint = endpoint;
+    fault.address = address;
+
+    for (;;) {
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+
+        if (!elem) {
+            virtio_error(vdev,
+                         "no buffer available in event queue to report event");
+            return;
+        }
+
+        if (iov_size(elem->in_sg, elem->in_num) < sizeof(fault)) {
+            virtio_error(vdev, "error buffer of wrong size");
+            virtqueue_detach_element(vq, elem, 0);
+            g_free(elem);
+            continue;
+        }
+        break;
+    }
+    /* we have a buffer to fill in */
+    sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                      &fault, sizeof(fault));
+    assert(sz == sizeof(fault));
+
+    trace_virtio_iommu_report_fault(reason, flags, endpoint, address);
+    virtqueue_push(vq, elem, sz);
+    virtio_notify(vdev, vq);
+    g_free(elem);
+
+}
+
 static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
                                             IOMMUAccessFlags flag)
 {
     IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
     VirtIOIOMMU *s = sdev->viommu;
-    uint32_t sid;
+    uint32_t sid, flags;
     viommu_endpoint *ep;
     viommu_mapping *mapping;
     viommu_interval interval;
+    bool read_fault, write_fault;
 
     interval.low = addr;
     interval.high = addr + 1;
@@ -726,6 +772,8 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
     if (!ep) {
         error_report("%s sid=%d is not known!!", __func__, sid);
+        virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_UNKNOWN,
+                                  0, sid, 0);
         goto unlock;
     }
 
@@ -733,6 +781,8 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s %02x:%02x.%01x not attached to any domain\n",
                       __func__, PCI_BUS_NUM(sid), PCI_SLOT(sid), PCI_FUNC(sid));
+        virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_DOMAIN,
+                                  0, sid, 0);
         goto unlock;
     }
 
@@ -741,14 +791,25 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s no mapping for 0x%"PRIx64" for sid=%d\n",
                       __func__, addr, sid);
+        virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
+                                  0, sid, addr);
         goto unlock;
     }
 
-    if (((flag & IOMMU_RO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
-        ((flag & IOMMU_WO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
+    read_fault = (flag & IOMMU_RO) &&
+                    !(mapping->flags & VIRTIO_IOMMU_MAP_F_READ);
+    write_fault = (flag & IOMMU_WO) &&
+                    !(mapping->flags & VIRTIO_IOMMU_MAP_F_WRITE);
+
+    flags = read_fault ? VIRTIO_IOMMU_FAULT_F_READ : 0;
+    flags |= write_fault ? VIRTIO_IOMMU_FAULT_F_WRITE : 0;
+    if (flags) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "Permission error on 0x%"PRIx64"(%d): allowed=%d\n",
                       addr, flag, mapping->flags);
+        flags |= VIRTIO_IOMMU_FAULT_F_ADDRESS;
+        virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
+                                  flags, sid, addr);
         goto unlock;
     }
     entry.translated_addr = addr - mapping->virt_addr + mapping->phys_addr;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [RFC v6 16/22] virtio_iommu: Handle reserved regions in translation process
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (14 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 17/22] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

When translating an address we need to check whether it belongs to
a reserved virtual address range. If it does, there are 2 cases:

- It belongs to a RESERVED region: the guest should neither use
  this address in a MAP request nor instruct the endpoint to DMA
  to it. We report a fault.

- It belongs to an MSI region: we bypass the translation.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
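
Illustration only, not part of the patch: the check runs before the
domain lookup and has three outcomes. A minimal sketch of the decision,
using a linear scan over an array instead of the GTree used by the device;
all names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum resv_subtype { RESV_RESERVED, RESV_MSI };

enum xlate_outcome {
    XLATE_BYPASS, /* MSI region: identity translation, access granted */
    XLATE_FAULT,  /* RESERVED region: report a fault, no access */
    XLATE_LOOKUP  /* not reserved: continue with the normal mapping lookup */
};

/* Reserved region with inclusive bounds. */
struct resv_region {
    uint64_t low;
    uint64_t high;
    enum resv_subtype subtype;
};

static enum xlate_outcome check_reserved(const struct resv_region *regs,
                                         size_t n, uint64_t addr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (addr < regs[i].low || addr > regs[i].high) {
            continue;
        }
        return regs[i].subtype == RESV_MSI ? XLATE_BYPASS : XLATE_FAULT;
    }
    return XLATE_LOOKUP;
}
```

With the MSI window [0xfee00000, 0xfeefffff] registered, an access to
0xfee01000 bypasses translation, while an address outside every reserved
region falls through to the mapping lookup.
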
 hw/virtio/virtio-iommu.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 010e8e0..a8fabef 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -752,6 +752,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     viommu_mapping *mapping;
     viommu_interval interval;
     bool read_fault, write_fault;
+    struct virtio_iommu_probe_resv_mem *reg;
 
     interval.low = addr;
     interval.high = addr + 1;
@@ -777,6 +778,21 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
         goto unlock;
     }
 
+    reg = g_tree_lookup(ep->reserved_regions, (gpointer)(&interval));
+    if (reg) {
+        switch (reg->subtype) {
+        case VIRTIO_IOMMU_RESV_MEM_T_MSI:
+            entry.perm = flag;
+            break;
+        case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
+        default:
+            virtio_iommu_report_fault(s, VIRTIO_IOMMU_FAULT_R_MAPPING,
+                                      0, sid, addr);
+            break;
+        }
+        goto unlock;
+    }
+
     if (!ep->domain) {
         qemu_log_mask(LOG_GUEST_ERROR,
                       "%s %02x:%02x.%01x not attached to any domain\n",
-- 
1.9.1


* [Qemu-devel] [RFC v6 17/22] hw/arm/virt: Add virtio-iommu to the virt board
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (15 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 16/22] virtio_iommu: Handle reserved regions in translation process Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table Eric Auger
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

The virtio-mmio node for the virtio-iommu is unconditionally added
at machine init, while the binding between this node and the
PCIe host bridge is done in a machine-init-done notifier, and only
if -device virtio-iommu-device was added to the QEMU command
line.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v4 -> v5:
- VirtMachineClass no_iommu added in this patch
- Use object_resolve_path_type
---
 hw/arm/virt.c         | 83 +++++++++++++++++++++++++++++++++++++++++++++++----
 include/hw/arm/virt.h |  5 ++++
 2 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 08ac411..cf81716 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -54,6 +54,7 @@
 #include "hw/arm/fdt.h"
 #include "hw/intc/arm_gic.h"
 #include "hw/intc/arm_gicv3_common.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "kvm_arm.h"
 #include "hw/smbios/smbios.h"
 #include "qapi/visitor.h"
@@ -141,6 +142,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
+    [VIRT_IOMMU] =              { 0x09050000, 0x00000200 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -161,6 +163,7 @@ static const int a15irqmap[] = {
     [VIRT_SECURE_UART] = 8,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
+    [VIRT_IOMMU] = 74,
     [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
@@ -941,6 +944,69 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
+static void virtio_iommu_notifier(Notifier *notifier, void *data)
+{
+    VirtMachineState *vms = container_of(notifier, VirtMachineState,
+                                         virtio_iommu_done);
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    struct arm_boot_info *info = &vms->bootinfo;
+    bool ambiguous;
+    Object *obj = object_resolve_path_type("", TYPE_VIRTIO_IOMMU, &ambiguous);
+    int dtb_size;
+    void *fdt = info->get_dtb(info, &dtb_size);
+
+    if (!obj) {
+        return;
+    }
+
+    if (vmc->no_iommu) {
+        error_setg(&error_fatal, "this machine version does not support iommu");
+    }
+
+    if (ambiguous) {
+        error_setg(&error_fatal, "a single virtio-iommu device is supported!");
+    }
+
+    object_property_set_bool(obj, false, "msi_bypass", &error_fatal);
+
+    qemu_fdt_setprop_cells(fdt, vms->pcie_host_nodename, "iommu-map",
+                           0x0, vms->iommu_phandle, 0x0, 0x10000);
+}
+
+static void create_virtio_iommu(VirtMachineState *vms, qemu_irq *pic)
+{
+    char *node;
+    const char compat[] = "virtio,mmio";
+    int irq =  vms->irqmap[VIRT_IOMMU];
+    hwaddr base = vms->memmap[VIRT_IOMMU].base;
+    hwaddr size = vms->memmap[VIRT_IOMMU].size;
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+
+    if (vmc->no_iommu) {
+        return;
+    }
+
+    vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
+    sysbus_create_simple("virtio-mmio", base, pic[irq]);
+
+    node = g_strdup_printf("/virtio_mmio@%" PRIx64, base);
+    qemu_fdt_add_subnode(vms->fdt, node);
+    qemu_fdt_setprop(vms->fdt, node, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vms->fdt, node, "reg", 2, base, 2, size);
+
+    qemu_fdt_setprop_cells(vms->fdt, node, "interrupts",
+            GIC_FDT_IRQ_TYPE_SPI, irq, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+
+    qemu_fdt_setprop(vms->fdt, node, "dma-coherent", NULL, 0);
+    qemu_fdt_setprop_cell(vms->fdt, node, "#iommu-cells", 1);
+    qemu_fdt_setprop_cell(vms->fdt, node, "phandle", vms->iommu_phandle);
+    g_free(node);
+
+    vms->virtio_iommu_done.notify = virtio_iommu_notifier;
+    qemu_add_machine_init_done_notifier(&vms->virtio_iommu_done);
+}
+
 static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
@@ -1016,7 +1082,8 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
         }
     }
 
-    nodename = g_strdup_printf("/pcie@%" PRIx64, base);
+    vms->pcie_host_nodename = g_strdup_printf("/pcie@%" PRIx64, base);
+    nodename = vms->pcie_host_nodename;
     qemu_fdt_add_subnode(vms->fdt, nodename);
     qemu_fdt_setprop_string(vms->fdt, nodename,
                             "compatible", "pci-host-ecam-generic");
@@ -1055,7 +1122,6 @@ static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
 
-    g_free(nodename);
 }
 
 static void create_platform_bus(VirtMachineState *vms, qemu_irq *pic)
@@ -1370,16 +1436,16 @@ static void machvirt_init(MachineState *machine)
 
     create_rtc(vms, pic);
 
-    create_pcie(vms, pic);
-
-    create_gpio(vms, pic);
-
     /* Create mmio transports, so the user can create virtio backends
      * (which will be automatically plugged in to the transports). If
      * no backend is created the transport will just sit harmlessly idle.
      */
     create_virtio_devices(vms, pic);
 
+    create_pcie(vms, pic);
+
+    create_gpio(vms, pic);
+
     vms->fw_cfg = create_fw_cfg(vms, &address_space_memory);
     rom_set_fw(vms->fw_cfg);
 
@@ -1404,6 +1470,7 @@ static void machvirt_init(MachineState *machine)
      * Notifiers are executed in registration reverse order.
      */
     create_platform_bus(vms, pic);
+    create_virtio_iommu(vms, pic);
 }
 
 static bool virt_get_secure(Object *obj, Error **errp)
@@ -1645,8 +1712,12 @@ static void virt_2_11_instance_init(Object *obj)
 
 static void virt_machine_2_11_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
     virt_machine_2_12_options(mc);
     SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_11);
+
+    vmc->no_iommu = true;
 }
 DEFINE_VIRT_MACHINE(2, 11)
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 7e31e99..a13b895 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -59,6 +59,7 @@ enum {
     VIRT_GIC_V2M,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
+    VIRT_IOMMU,
     VIRT_UART,
     VIRT_MMIO,
     VIRT_RTC,
@@ -84,12 +85,14 @@ typedef struct {
     bool disallow_affinity_adjustment;
     bool no_its;
     bool no_pmu;
+    bool no_iommu;
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
 typedef struct {
     MachineState parent;
     Notifier machine_done;
+    Notifier virtio_iommu_done;
     FWCfgState *fw_cfg;
     bool secure;
     bool highmem;
@@ -105,6 +108,8 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t iommu_phandle;
+    char *pcie_host_nodename;
     int psci_conduit;
     PCIBus *pci_bus;
 } VirtMachineState;
-- 
1.9.1


* [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (16 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 17/22] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-13 12:24   ` Andrew Jones
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 19/22] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
                   ` (3 subsequent siblings)
  21 siblings, 1 reply; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch builds the virtio-iommu node in the ACPI IORT table.
The dt node creation function fills the information used by
the IORT table generation function (base address, base irq,
type of the smmu).

The RID space of the root complex, which spans 0x0-0x10000
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
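
As a rough sketch of how such an identity ID mapping resolves, assuming
the IORT convention that the count field holds the number of IDs minus
one (so 0xFFFF covers input IDs 0x0 through 0xFFFF inclusive). The
struct and function names below are hypothetical, not the QEMU or
Linux ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one IORT ID mapping entry. */
struct id_mapping {
    uint32_t input_base;   /* lowest input ID covered */
    uint32_t id_count;     /* number of IDs in the range, minus one */
    uint32_t output_base;  /* output ID matching input_base */
};

/*
 * Translate @in through @m. Returns false when @in lies outside the
 * range, otherwise stores the translated ID in @out.
 */
static bool map_id(const struct id_mapping *m, uint32_t in, uint32_t *out)
{
    if (in < m->input_base || in - m->input_base > m->id_count) {
        return false;
    }
    *out = m->output_base + (in - m->input_base);
    return true;
}
```

With { 0, 0xFFFF, 0 }, every 16-bit RID maps to the same value on the
output side, which is the identity mapping the table describes.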

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use type=128
- new gsiv and reserved2 fields
---
 hw/arm/virt-acpi-build.c    | 54 +++++++++++++++++++++++++++++++++++++++------
 hw/arm/virt.c               |  5 +++++
 include/hw/acpi/acpi-defs.h | 21 +++++++++++++++++-
 include/hw/arm/virt.h       | 13 +++++++++++
 4 files changed, 85 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f7fa795..24efb69 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -42,6 +42,7 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "hw/arm/virt.h"
 #include "sysemu/numa.h"
 #include "kvm_arm.h"
@@ -393,18 +394,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
 }
 
 static void
-build_iort(GArray *table_data, BIOSLinker *linker)
+build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-    int iort_start = table_data->len;
+    int nb_nodes, iort_start = table_data->len;
     AcpiIortIdMapping *idmap;
     AcpiIortItsGroup *its;
     AcpiIortTable *iort;
-    size_t node_size, iort_length;
+    AcpiIortPVIommu *iommu;
+    size_t node_size, iort_length, iommu_offset = 0;
     AcpiIortRC *rc;
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
+    if (vms->iommu_info.type) {
+        nb_nodes = 3; /* RC, ITS, IOMMU */
+    } else {
+        nb_nodes = 2; /* RC, ITS */
+    }
+
     iort_length = sizeof(*iort);
+    iort->node_count = cpu_to_le32(nb_nodes);
     iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
     iort->node_offset = cpu_to_le32(sizeof(*iort));
 
@@ -418,6 +427,31 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
+    if (vms->iommu_info.type == VIRT_IOMMU_VIRTIO) {
+
+        /* Para-virtualized IOMMU node */
+        iommu_offset = cpu_to_le32(iort->node_offset + node_size);
+        node_size = sizeof(*iommu) + sizeof(*idmap);
+        iort_length += node_size;
+        iommu = acpi_data_push(table_data, node_size);
+
+        iommu->type = ACPI_IORT_NODE_PARAVIRT;
+        iommu->length = cpu_to_le16(node_size);
+        iommu->base_address = cpu_to_le64(vms->iommu_info.reg.base);
+        iommu->span = cpu_to_le64(vms->iommu_info.reg.size);
+        iommu->model = ACPI_IORT_NODE_PV_VIRTIO_IOMMU;
+        iommu->flags = ACPI_IORT_NODE_PV_CACHE_COHERENT;
+        iommu->gsiv = cpu_to_le64(vms->iommu_info.irq_base);
+
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &iommu->id_mapping_array[0];
+        idmap->input_base = 0;
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = 0;
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
     /* Root Complex Node */
     node_size = sizeof(*rc) + sizeof(*idmap);
     iort_length += node_size;
@@ -438,10 +472,16 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     idmap->input_base = 0;
     idmap->id_count = cpu_to_le32(0xFFFF);
     idmap->output_base = 0;
-    /* output IORT node is the ITS group node (the first node) */
-    idmap->output_reference = cpu_to_le32(iort->node_offset);
 
-    iort->length = cpu_to_le32(iort_length);
+    if (vms->iommu_info.type) {
+        /* output IORT node is the smmuv3 node */
+        idmap->output_reference = cpu_to_le32(iommu_offset);
+    } else {
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
+   iort->length = cpu_to_le32(iort_length);
 
     build_header(linker, table_data, (void *)(table_data->data + iort_start),
                  "IORT", table_data->len - iort_start, 0, NULL, NULL);
@@ -786,7 +826,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
-        build_iort(tables_blob, tables->linker);
+        build_iort(tables_blob, tables->linker, vms);
     }
 
     /* XSDT is pointed to by RSDP */
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index cf81716..80740ac 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -969,6 +969,11 @@ static void virtio_iommu_notifier(Notifier *notifier, void *data)
 
     object_property_set_bool(obj, false, "msi_bypass", &error_fatal);
 
+    vms->iommu_info.type = VIRT_IOMMU_VIRTIO;
+    vms->iommu_info.reg.base = vms->memmap[VIRT_IOMMU].base;
+    vms->iommu_info.reg.size = vms->memmap[VIRT_IOMMU].size;
+    vms->iommu_info.irq_base = vms->irqmap[VIRT_IOMMU];
+
     qemu_fdt_setprop_cells(fdt, vms->pcie_host_nodename, "iommu-map",
                            0x0, vms->iommu_phandle, 0x0, 0x10000);
 }
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 80c8099..57b9cf9 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -673,7 +673,8 @@ enum {
         ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
         ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
         ACPI_IORT_NODE_SMMU = 0x03,
-        ACPI_IORT_NODE_SMMU_V3 = 0x04
+        ACPI_IORT_NODE_SMMU_V3 = 0x04,
+        ACPI_IORT_NODE_PARAVIRT = 0x80
 };
 
 struct AcpiIortIdMapping {
@@ -700,6 +701,24 @@ struct AcpiIortItsGroup {
 } QEMU_PACKED;
 typedef struct AcpiIortItsGroup AcpiIortItsGroup;
 
+struct AcpiIortPVIommu {
+    ACPI_IORT_NODE_HEADER_DEF
+    uint64_t base_address;
+    uint64_t span;
+    uint32_t model;
+    uint32_t flags;
+    uint64_t gsiv;
+    uint64_t reserved2;
+    AcpiIortIdMapping id_mapping_array[0];
+} QEMU_PACKED;
+typedef struct AcpiIortPVIommu AcpiIortPVIommu;
+
+enum {
+    ACPI_IORT_NODE_PV_VIRTIO_IOMMU = 0x00,
+};
+
+#define ACPI_IORT_NODE_PV_CACHE_COHERENT    (1 << 0)
+
 struct AcpiIortRC {
     ACPI_IORT_NODE_HEADER_DEF
     AcpiIortMemoryAccess memory_properties;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index a13b895..2e1e907 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -89,6 +89,18 @@ typedef struct {
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
+typedef enum VirtIOMMUType {
+    VIRT_IOMMU_NONE,
+    VIRT_IOMMU_SMMUV3,
+    VIRT_IOMMU_VIRTIO,
+} VirtIOMMUType;
+
+typedef struct VirtIOMMUInfo {
+    VirtIOMMUType type;
+    MemMapEntry reg;
+    int irq_base;
+} VirtIOMMUInfo;
+
 typedef struct {
     MachineState parent;
     Notifier machine_done;
@@ -98,6 +110,7 @@ typedef struct {
     bool highmem;
     bool its;
     bool virt;
+    VirtIOMMUInfo iommu_info;
     int32_t gic_version;
     struct arm_boot_info bootinfo;
     const MemMapEntry *memmap;
-- 
1.9.1


* [Qemu-devel] [RFC v6 19/22] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (17 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 20/22] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This callback allows informing the IOMMU memory region about
restrictions on the supported page sizes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 783ef64..1c0374f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -208,6 +208,10 @@ typedef struct IOMMUMemoryRegionClass {
                                IOMMUAccessFlags flag);
     /* Returns minimum supported page size */
     uint64_t (*get_min_page_size)(IOMMUMemoryRegion *iommu);
+
+    /* Limits the supported page sizes to @pgsizes */
+    void (*set_page_size_mask)(IOMMUMemoryRegion *iommu, uint64_t pgsizes);
+
     /* Called when IOMMU Notifier flag changed */
     void (*notify_flag_changed)(IOMMUMemoryRegion *iommu,
                                 IOMMUNotifierFlag old_flags,
-- 
1.9.1


* [Qemu-devel] [RFC v6 20/22] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (18 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 19/22] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 21/22] virtio-iommu: Implement set_page_size_mask Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 22/22] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

We store the page_size_mask in the container; in
vfio_listener_region_add(), this information is conveyed
to the IOMMUMemoryRegion.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c              | 5 +++++
 include/hw/vfio/vfio-common.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f895e3c..8f3fa0c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -506,6 +506,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
         IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+        IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
 
         trace_vfio_listener_region_add_iommu(iova, end);
         /*
@@ -526,6 +527,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                             IOMMU_NOTIFIER_ALL,
                             section->offset_within_region,
                             int128_get64(llend));
+        if (imrc->set_page_size_mask) {
+            imrc->set_page_size_mask(iommu_mr, container->page_size_mask);
+        }
         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
         memory_region_register_iommu_notifier(section->mr, &giommu->n);
@@ -1053,6 +1057,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
             /* Assume 4k IOVA page size */
             info.iova_pgsizes = 4096;
         }
+        container->page_size_mask = info.iova_pgsizes;
         vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
     } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU) ||
                ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f3a2ac9..221cc30 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -79,6 +79,7 @@ typedef struct VFIOContainer {
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     MemoryListener listener;
     MemoryListener prereg_listener;
+    uint64_t page_size_mask; /* page sizes supported for this container */
     unsigned iommu_type;
     int error;
     bool initialized;
-- 
1.9.1


* [Qemu-devel] [RFC v6 21/22] virtio-iommu: Implement set_page_size_mask
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (19 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 20/22] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 22/22] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

We implement the set_page_size_mask callback to allow the
virtio-iommu to be aware of any restrictions on the page size
mask due to an underlying HW IOMMU.
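
A minimal standalone sketch of the restriction step, assuming each set
bit in a mask denotes one supported page size. The function name is
hypothetical. Unlike the patch, which ANDs in place and then exits with
a fatal error, this variant leaves the guest mask untouched on failure
so the caller can decide how to report it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Restrict *guest_mask to the page sizes also present in host_mask.
 * Returns false, leaving *guest_mask unchanged, when the two masks
 * have no page size in common.
 */
static bool restrict_page_size_mask(uint64_t *guest_mask, uint64_t host_mask)
{
    uint64_t common = *guest_mask & host_mask;

    if (!common) {
        return false;
    }
    *guest_mask = common;
    return true;
}
```

For example, a guest mask advertising 4K, 64K and 2M pages combined
with a host mask of 4K and 2M leaves only 4K and 2M.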

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index ce718e8..1145995 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -55,3 +55,4 @@ virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid)
 virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx"
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+virtio_iommu_set_page_size_mask(const char *iommu_mr, uint64_t mask) "mr=%s page_size_mask=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index a8fabef..1a15ccc 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -837,6 +837,21 @@ unlock:
     return entry;
 }
 
+static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    s->config.page_size_mask &= page_size_mask;
+    if (!s->config.page_size_mask) {
+        error_setg(&error_fatal,
+                   "No compatible page size between guest and host iommus");
+    }
+
+    trace_virtio_iommu_set_page_size_mask(mr->parent_obj.name, page_size_mask);
+}
+
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
@@ -1027,6 +1042,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
1.9.1


* [Qemu-devel] [RFC v6 22/22] hw/vfio/common: Do not print error when viommu translates into an mmio region
  2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
                   ` (20 preceding siblings ...)
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 21/22] virtio-iommu: Implement set_page_size_mask Eric Auger
@ 2018-02-12 18:58 ` Eric Auger
  21 siblings, 0 replies; 26+ messages in thread
From: Eric Auger @ 2018-02-12 18:58 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

On ARM, the MSI doorbell is translated by the virtual IOMMU.
As such, address_space_translate() returns the MSI controller
MMIO region and we get an "iommu map to non memory area"
message. Let's remove this message.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8f3fa0c..1fc8e28 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -326,8 +326,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  iotlb->translated_addr,
                                  &xlat, &len, writable);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
 
-- 
1.9.1


* Re: [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table Eric Auger
@ 2018-02-13 12:24   ` Andrew Jones
  2018-02-13 13:22     ` Auger Eric
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Jones @ 2018-02-13 12:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, alex.williamson, mst, qemu-arm,
	qemu-devel, jean-philippe.brucker, wei, kevin.tian, marc.zyngier,
	tn, will.deacon, peterx, linuc.decode, bharat.bhushan,
	christoffer.dall

On Mon, Feb 12, 2018 at 06:58:20PM +0000, Eric Auger wrote:
> This patch builds the virtio-iommu node in the ACPI IORT table.
> The dt node creation function fills the information used by
> the IORT table generation function (base address, base irq,
> type of the smmu).
> 
> The RID space of the root complex, which spans 0x0-0x10000
> maps to streamid space 0x0-0x10000 in smmuv3, which in turn
> maps to deviceid space 0x0-0x10000 in the ITS group.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v5 -> v6:
> - use type=128
> - new gsiv and reserved2 fields
> ---
>  hw/arm/virt-acpi-build.c    | 54 +++++++++++++++++++++++++++++++++++++++------
>  hw/arm/virt.c               |  5 +++++
>  include/hw/acpi/acpi-defs.h | 21 +++++++++++++++++-
>  include/hw/arm/virt.h       | 13 +++++++++++
>  4 files changed, 85 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index f7fa795..24efb69 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -42,6 +42,7 @@
>  #include "hw/acpi/aml-build.h"
>  #include "hw/pci/pcie_host.h"
>  #include "hw/pci/pci.h"
> +#include "hw/virtio/virtio-iommu.h"
>  #include "hw/arm/virt.h"
>  #include "sysemu/numa.h"
>  #include "kvm_arm.h"
> @@ -393,18 +394,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
>  }
>  
>  static void
> -build_iort(GArray *table_data, BIOSLinker *linker)
> +build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  {
> -    int iort_start = table_data->len;
> +    int nb_nodes, iort_start = table_data->len;
>      AcpiIortIdMapping *idmap;
>      AcpiIortItsGroup *its;
>      AcpiIortTable *iort;
> -    size_t node_size, iort_length;
> +    AcpiIortPVIommu *iommu;
> +    size_t node_size, iort_length, iommu_offset = 0;
>      AcpiIortRC *rc;
>  
>      iort = acpi_data_push(table_data, sizeof(*iort));
>  
> +    if (vms->iommu_info.type) {
> +        nb_nodes = 3; /* RC, ITS, IOMMU */
> +    } else {
> +        nb_nodes = 2; /* RC, ITS */
> +    }
> +
>      iort_length = sizeof(*iort);
> +    iort->node_count = cpu_to_le32(nb_nodes);
>      iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */

The above line should be removed, as you're replacing it with the previous
one. I wonder how the guest was able to parse this table without seeing
the appropriate node count?

>      iort->node_offset = cpu_to_le32(sizeof(*iort));
>  
> @@ -418,6 +427,31 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>      its->its_count = cpu_to_le32(1);
>      its->identifiers[0] = 0; /* MADT translation_id */
>  
> +    if (vms->iommu_info.type == VIRT_IOMMU_VIRTIO) {
> +

extra blank line

> +        /* Para-virtualized IOMMU node */
> +        iommu_offset = cpu_to_le32(iort->node_offset + node_size);
> +        node_size = sizeof(*iommu) + sizeof(*idmap);
> +        iort_length += node_size;
> +        iommu = acpi_data_push(table_data, node_size);
> +
> +        iommu->type = ACPI_IORT_NODE_PARAVIRT;
> +        iommu->length = cpu_to_le16(node_size);
> +        iommu->base_address = cpu_to_le64(vms->iommu_info.reg.base);
> +        iommu->span = cpu_to_le64(vms->iommu_info.reg.size);
> +        iommu->model = ACPI_IORT_NODE_PV_VIRTIO_IOMMU;
> +        iommu->flags = ACPI_IORT_NODE_PV_CACHE_COHERENT;

model and flags are both larger than a byte, so they need cpu_to_le*'s
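
To illustrate why multi-byte table fields need the conversion, here is
a standalone sketch of what a 32-bit host-to-little-endian helper does.
The name to_le32 is hypothetical and only stands in for the real
cpu_to_le32; on a little-endian host the result equals the input, on a
big-endian host the bytes are swapped.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return a value whose in-memory bytes are the little-endian encoding
 * of @v, whatever the host byte order is.
 */
static uint32_t to_le32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v & 0xff),          /* least significant byte first */
        (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff),
        (uint8_t)((v >> 24) & 0xff),
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}
```

A field stored without such a conversion would only come out right on
little-endian hosts.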

> +        iommu->gsiv = cpu_to_le64(vms->iommu_info.irq_base);
> +
> +        /* Identity RID mapping covering the whole input RID range */
> +        idmap = &iommu->id_mapping_array[0];
> +        idmap->input_base = 0;
> +        idmap->id_count = cpu_to_le32(0xFFFF);
> +        idmap->output_base = 0;
> +        /* output IORT node is the ITS group node (the first node) */
> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
> +    }
> +
>      /* Root Complex Node */
>      node_size = sizeof(*rc) + sizeof(*idmap);
>      iort_length += node_size;
> @@ -438,10 +472,16 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>      idmap->input_base = 0;
>      idmap->id_count = cpu_to_le32(0xFFFF);
>      idmap->output_base = 0;
> -    /* output IORT node is the ITS group node (the first node) */
> -    idmap->output_reference = cpu_to_le32(iort->node_offset);
>  
> -    iort->length = cpu_to_le32(iort_length);
> +    if (vms->iommu_info.type) {
> +        /* output IORT node is the smmuv3 node */
> +        idmap->output_reference = cpu_to_le32(iommu_offset);
> +    } else {
> +        /* output IORT node is the ITS group node (the first node) */
> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
> +    }
> +
> +   iort->length = cpu_to_le32(iort_length);

This line appears to have moved down for no reason, but what actually
happened is that its indentation got broken (it is missing a space).

>  
>      build_header(linker, table_data, (void *)(table_data->data + iort_start),
>                   "IORT", table_data->len - iort_start, 0, NULL, NULL);
> @@ -786,7 +826,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>  
>      if (its_class_name() && !vmc->no_its) {
>          acpi_add_table(table_offsets, tables_blob);
> -        build_iort(tables_blob, tables->linker);
> +        build_iort(tables_blob, tables->linker, vms);
>      }
>  
>      /* XSDT is pointed to by RSDP */
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index cf81716..80740ac 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -969,6 +969,11 @@ static void virtio_iommu_notifier(Notifier *notifier, void *data)
>  
>      object_property_set_bool(obj, false, "msi_bypass", &error_fatal);
>  
> +    vms->iommu_info.type = VIRT_IOMMU_VIRTIO;
> +    vms->iommu_info.reg.base = vms->memmap[VIRT_IOMMU].base;
> +    vms->iommu_info.reg.size = vms->memmap[VIRT_IOMMU].size;
> +    vms->iommu_info.irq_base = vms->irqmap[VIRT_IOMMU];
> +
>      qemu_fdt_setprop_cells(fdt, vms->pcie_host_nodename, "iommu-map",
>                             0x0, vms->iommu_phandle, 0x0, 0x10000);
>  }
> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
> index 80c8099..57b9cf9 100644
> --- a/include/hw/acpi/acpi-defs.h
> +++ b/include/hw/acpi/acpi-defs.h
> @@ -673,7 +673,8 @@ enum {
>          ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
>          ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
>          ACPI_IORT_NODE_SMMU = 0x03,
> -        ACPI_IORT_NODE_SMMU_V3 = 0x04
> +        ACPI_IORT_NODE_SMMU_V3 = 0x04,
> +        ACPI_IORT_NODE_PARAVIRT = 0x80

I recommend putting a comma on the last line too, so that the line does not
have to be touched when new enums are added later (as had to be done in
this patch). It's nicer for git-blame.

>  };
>  
>  struct AcpiIortIdMapping {
> @@ -700,6 +701,24 @@ struct AcpiIortItsGroup {
>  } QEMU_PACKED;
>  typedef struct AcpiIortItsGroup AcpiIortItsGroup;
>  
> +struct AcpiIortPVIommu {
> +    ACPI_IORT_NODE_HEADER_DEF
> +    uint64_t base_address;
> +    uint64_t span;
> +    uint32_t model;
> +    uint32_t flags;
> +    uint64_t gsiv;
> +    uint64_t reserved2;
> +    AcpiIortIdMapping id_mapping_array[0];
> +} QEMU_PACKED;
> +typedef struct AcpiIortPVIommu AcpiIortPVIommu;
> +
> +enum {
> +    ACPI_IORT_NODE_PV_VIRTIO_IOMMU = 0x00,
> +};
> +
> +#define ACPI_IORT_NODE_PV_CACHE_COHERENT    (1 << 0)
> +
>  struct AcpiIortRC {
>      ACPI_IORT_NODE_HEADER_DEF
>      AcpiIortMemoryAccess memory_properties;
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index a13b895..2e1e907 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -89,6 +89,18 @@ typedef struct {
>      bool claim_edge_triggered_timers;
>  } VirtMachineClass;
>  
> +typedef enum VirtIOMMUType {
> +    VIRT_IOMMU_NONE,
> +    VIRT_IOMMU_SMMUV3,
> +    VIRT_IOMMU_VIRTIO,
> +} VirtIOMMUType;
> +
> +typedef struct VirtIOMMUInfo {
> +    VirtIOMMUType type;
> +    MemMapEntry reg;
> +    int irq_base;
> +} VirtIOMMUInfo;
> +
>  typedef struct {
>      MachineState parent;
>      Notifier machine_done;
> @@ -98,6 +110,7 @@ typedef struct {
>      bool highmem;
>      bool its;
>      bool virt;
> +    VirtIOMMUInfo iommu_info;
>      int32_t gic_version;
>      struct arm_boot_info bootinfo;
>      const MemMapEntry *memmap;
> -- 
> 1.9.1
> 
>

I just caught the above comments while skimming. I'm afraid I haven't had
time to review this against the spec yet.

Thanks,
drew

* Re: [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table
  2018-02-13 12:24   ` Andrew Jones
@ 2018-02-13 13:22     ` Auger Eric
  0 siblings, 0 replies; 26+ messages in thread
From: Auger Eric @ 2018-02-13 13:22 UTC (permalink / raw)
  To: Andrew Jones
  Cc: wei, peter.maydell, kevin.tian, mst, jean-philippe.brucker, tn,
	will.deacon, qemu-devel, peterx, marc.zyngier, alex.williamson,
	qemu-arm, linuc.decode, bharat.bhushan, christoffer.dall,
	eric.auger.pro

Hi Drew,
On 13/02/18 13:24, Andrew Jones wrote:
> On Mon, Feb 12, 2018 at 06:58:20PM +0000, Eric Auger wrote:
>> This patch builds the virtio-iommu node in the ACPI IORT table.
>> The dt node creation function fills the information used by
>> the IORT table generation function (base address, base irq,
>> type of the smmu).
>>
>> The RID space of the root complex, which spans 0x0-0x10000
>> maps to streamid space 0x0-0x10000 in smmuv3, which in turn
>> maps to deviceid space 0x0-0x10000 in the ITS group.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v5 -> v6:
>> - use type=128
>> - new gsiv and reserved2 fields
>> ---
>>  hw/arm/virt-acpi-build.c    | 54 +++++++++++++++++++++++++++++++++++++++------
>>  hw/arm/virt.c               |  5 +++++
>>  include/hw/acpi/acpi-defs.h | 21 +++++++++++++++++-
>>  include/hw/arm/virt.h       | 13 +++++++++++
>>  4 files changed, 85 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index f7fa795..24efb69 100644
>> --- a/hw/arm/virt-acpi-build.c
>> +++ b/hw/arm/virt-acpi-build.c
>> @@ -42,6 +42,7 @@
>>  #include "hw/acpi/aml-build.h"
>>  #include "hw/pci/pcie_host.h"
>>  #include "hw/pci/pci.h"
>> +#include "hw/virtio/virtio-iommu.h"
>>  #include "hw/arm/virt.h"
>>  #include "sysemu/numa.h"
>>  #include "kvm_arm.h"
>> @@ -393,18 +394,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
>>  }
>>  
>>  static void
>> -build_iort(GArray *table_data, BIOSLinker *linker)
>> +build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>>  {
>> -    int iort_start = table_data->len;
>> +    int nb_nodes, iort_start = table_data->len;
>>      AcpiIortIdMapping *idmap;
>>      AcpiIortItsGroup *its;
>>      AcpiIortTable *iort;
>> -    size_t node_size, iort_length;
>> +    AcpiIortPVIommu *iommu;
>> +    size_t node_size, iort_length, iommu_offset = 0;
>>      AcpiIortRC *rc;
>>  
>>      iort = acpi_data_push(table_data, sizeof(*iort));
>>  
>> +    if (vms->iommu_info.type) {
>> +        nb_nodes = 3; /* RC, ITS, IOMMU */
>> +    } else {
>> +        nb_nodes = 2; /* RC, ITS */
>> +    }
>> +
>>      iort_length = sizeof(*iort);
>> +    iort->node_count = cpu_to_le32(nb_nodes);
>>      iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
> 
> The above line should be removed, as you're replacing it with the previous
> one. I wonder how the guest was able to parse this table without seeing
> the appropriate node count?
Thanks for spotting this.

Hmm, I wonder too. I need to double check the ACPI test.
> 
>>      iort->node_offset = cpu_to_le32(sizeof(*iort));
>>  
>> @@ -418,6 +427,31 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>      its->its_count = cpu_to_le32(1);
>>      its->identifiers[0] = 0; /* MADT translation_id */
>>  
>> +    if (vms->iommu_info.type == VIRT_IOMMU_VIRTIO) {
>> +
> 
> extra blank line
> 
>> +        /* Para-virtualized IOMMU node */
>> +        iommu_offset = cpu_to_le32(iort->node_offset + node_size);
>> +        node_size = sizeof(*iommu) + sizeof(*idmap);
>> +        iort_length += node_size;
>> +        iommu = acpi_data_push(table_data, node_size);
>> +
>> +        iommu->type = ACPI_IORT_NODE_PARAVIRT;
>> +        iommu->length = cpu_to_le16(node_size);
>> +        iommu->base_address = cpu_to_le64(vms->iommu_info.reg.base);
>> +        iommu->span = cpu_to_le64(vms->iommu_info.reg.size);
>> +        iommu->model = ACPI_IORT_NODE_PV_VIRTIO_IOMMU;
>> +        iommu->flags = ACPI_IORT_NODE_PV_CACHE_COHERENT;
> 
> model and flags are both larger than a byte, so they need cpu_to_le*'s
ok
> 
>> +        iommu->gsiv = cpu_to_le64(vms->iommu_info.irq_base);
>> +
>> +        /* Identity RID mapping covering the whole input RID range */
>> +        idmap = &iommu->id_mapping_array[0];
>> +        idmap->input_base = 0;
>> +        idmap->id_count = cpu_to_le32(0xFFFF);
>> +        idmap->output_base = 0;
>> +        /* output IORT node is the ITS group node (the first node) */
>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>> +    }
>> +
>>      /* Root Complex Node */
>>      node_size = sizeof(*rc) + sizeof(*idmap);
>>      iort_length += node_size;
>> @@ -438,10 +472,16 @@ build_iort(GArray *table_data, BIOSLinker *linker)
>>      idmap->input_base = 0;
>>      idmap->id_count = cpu_to_le32(0xFFFF);
>>      idmap->output_base = 0;
>> -    /* output IORT node is the ITS group node (the first node) */
>> -    idmap->output_reference = cpu_to_le32(iort->node_offset);
>>  
>> -    iort->length = cpu_to_le32(iort_length);
>> +    if (vms->iommu_info.type) {
>> +        /* output IORT node is the smmuv3 node */
>> +        idmap->output_reference = cpu_to_le32(iommu_offset);
>> +    } else {
>> +        /* output IORT node is the ITS group node (the first node) */
>> +        idmap->output_reference = cpu_to_le32(iort->node_offset);
>> +    }
>> +
>> +   iort->length = cpu_to_le32(iort_length);
> 
> This line appears to have moved down for no reason, but what actually
> happened is that its indentation got broken (it is missing a space).
ok
> 
>>  
>>      build_header(linker, table_data, (void *)(table_data->data + iort_start),
>>                   "IORT", table_data->len - iort_start, 0, NULL, NULL);
>> @@ -786,7 +826,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>>  
>>      if (its_class_name() && !vmc->no_its) {
>>          acpi_add_table(table_offsets, tables_blob);
>> -        build_iort(tables_blob, tables->linker);
>> +        build_iort(tables_blob, tables->linker, vms);
>>      }
>>  
>>      /* XSDT is pointed to by RSDP */
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index cf81716..80740ac 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -969,6 +969,11 @@ static void virtio_iommu_notifier(Notifier *notifier, void *data)
>>  
>>      object_property_set_bool(obj, false, "msi_bypass", &error_fatal);
>>  
>> +    vms->iommu_info.type = VIRT_IOMMU_VIRTIO;
>> +    vms->iommu_info.reg.base = vms->memmap[VIRT_IOMMU].base;
>> +    vms->iommu_info.reg.size = vms->memmap[VIRT_IOMMU].size;
>> +    vms->iommu_info.irq_base = vms->irqmap[VIRT_IOMMU];
>> +
>>      qemu_fdt_setprop_cells(fdt, vms->pcie_host_nodename, "iommu-map",
>>                             0x0, vms->iommu_phandle, 0x0, 0x10000);
>>  }
>> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
>> index 80c8099..57b9cf9 100644
>> --- a/include/hw/acpi/acpi-defs.h
>> +++ b/include/hw/acpi/acpi-defs.h
>> @@ -673,7 +673,8 @@ enum {
>>          ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
>>          ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
>>          ACPI_IORT_NODE_SMMU = 0x03,
>> -        ACPI_IORT_NODE_SMMU_V3 = 0x04
>> +        ACPI_IORT_NODE_SMMU_V3 = 0x04,
>> +        ACPI_IORT_NODE_PARAVIRT = 0x80
> 
> I recommend putting a comma on the last line too, so that the line does not
> have to be touched when new enums are added later (as had to be done in
> this patch). It's nicer for git-blame.
ok
> 
>>  };
>>  
>>  struct AcpiIortIdMapping {
>> @@ -700,6 +701,24 @@ struct AcpiIortItsGroup {
>>  } QEMU_PACKED;
>>  typedef struct AcpiIortItsGroup AcpiIortItsGroup;
>>  
>> +struct AcpiIortPVIommu {
>> +    ACPI_IORT_NODE_HEADER_DEF
>> +    uint64_t base_address;
>> +    uint64_t span;
>> +    uint32_t model;
>> +    uint32_t flags;
>> +    uint64_t gsiv;
>> +    uint64_t reserved2;
>> +    AcpiIortIdMapping id_mapping_array[0];
>> +} QEMU_PACKED;
>> +typedef struct AcpiIortPVIommu AcpiIortPVIommu;
>> +
>> +enum {
>> +    ACPI_IORT_NODE_PV_VIRTIO_IOMMU = 0x00,
>> +};
>> +
>> +#define ACPI_IORT_NODE_PV_CACHE_COHERENT    (1 << 0)
>> +
>>  struct AcpiIortRC {
>>      ACPI_IORT_NODE_HEADER_DEF
>>      AcpiIortMemoryAccess memory_properties;
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index a13b895..2e1e907 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -89,6 +89,18 @@ typedef struct {
>>      bool claim_edge_triggered_timers;
>>  } VirtMachineClass;
>>  
>> +typedef enum VirtIOMMUType {
>> +    VIRT_IOMMU_NONE,
>> +    VIRT_IOMMU_SMMUV3,
>> +    VIRT_IOMMU_VIRTIO,
>> +} VirtIOMMUType;
>> +
>> +typedef struct VirtIOMMUInfo {
>> +    VirtIOMMUType type;
>> +    MemMapEntry reg;
>> +    int irq_base;
>> +} VirtIOMMUInfo;
>> +
>>  typedef struct {
>>      MachineState parent;
>>      Notifier machine_done;
>> @@ -98,6 +110,7 @@ typedef struct {
>>      bool highmem;
>>      bool its;
>>      bool virt;
>> +    VirtIOMMUInfo iommu_info;
>>      int32_t gic_version;
>>      struct arm_boot_info bootinfo;
>>      const MemMapEntry *memmap;
>> -- 
>> 1.9.1
>>
>>
> 
> I just caught the above comments while skimming. I'm afraid I haven't had
> time to review this against the spec yet.

Thank you for the review!

Eric
> 
> Thanks,
> drew
> 

* Re: [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting
  2018-02-12 18:58 ` [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting Eric Auger
@ 2018-03-21 13:15   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 26+ messages in thread
From: Jean-Philippe Brucker @ 2018-03-21 13:15 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel
  Cc: Will Deacon, kevin.tian, Marc Zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

Hi Eric,

On 12/02/18 18:58, Eric Auger wrote:
[...]
> +    for (;;) {
> +        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +
> +        if (!elem) {
> +            virtio_error(vdev,
> +                         "no buffer available in event queue to report event");
> +            return;
> +        }

When $user attempts something silly like trying VFIO without the VFIO
patches, this outputs thousands of messages per second... I guess it would
also happen if the driver isn't consuming error reports fast enough. Is
there a simple way of ratelimiting printfs so the context of the error
doesn't get washed away?
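
One possible shape for such a limiter, as a hedged sketch only: the names `msg_ratelimit`, `msg_ratelimit_init` and `msg_ratelimit_allow` are hypothetical and do not refer to an existing QEMU helper. It uses a fixed window and allows at most `burst` messages per `interval_ns`, with the caller supplying the current time from a monotonic clock:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter: at most `burst` messages per window. */
struct msg_ratelimit {
    int64_t window_start_ns; /* start of the current window */
    int64_t interval_ns;     /* window length */
    unsigned burst;          /* messages allowed per window */
    unsigned emitted;        /* messages allowed so far in this window */
    unsigned suppressed;     /* messages dropped since init */
};

static void msg_ratelimit_init(struct msg_ratelimit *rl,
                               int64_t interval_ns, unsigned burst)
{
    rl->window_start_ns = 0;
    rl->interval_ns = interval_ns;
    rl->burst = burst;
    rl->emitted = 0;
    rl->suppressed = 0;
}

/* Returns true if the caller may print now; now_ns must be monotonic. */
static bool msg_ratelimit_allow(struct msg_ratelimit *rl, int64_t now_ns)
{
    if (now_ns - rl->window_start_ns >= rl->interval_ns) {
        /* A new window starts: reset the per-window budget. */
        rl->window_start_ns = now_ns;
        rl->emitted = 0;
    }
    if (rl->emitted < rl->burst) {
        rl->emitted++;
        return true;
    }
    rl->suppressed++;
    return false;
}
```

The error path would then only report when `msg_ratelimit_allow()` returns true, and could print the `suppressed` count now and then so the reader knows messages were dropped.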

Thanks,
Jean

Thread overview: 26+ messages
2018-02-12 18:58 [Qemu-devel] [RFC v6 00/22] VIRTIO-IOMMU device Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 01/22] machine: Add a get_primary_pci_bus callback Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 02/22] hw/arm/virt: Implement get_primary_pci_bus Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 03/22] pc: " Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 04/22] update-linux-headers: Import virtio_iommu.h Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 05/22] linux-headers: Partial update for virtio-iommu v0.6 Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 06/22] virtio-iommu: Add skeleton Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 07/22] virtio-iommu: Decode the command payload Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 08/22] virtio-iommu: Add the iommu regions Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 09/22] virtio-iommu: Register attached endpoints Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 10/22] virtio-iommu: Implement attach/detach command Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 11/22] virtio-iommu: Implement map/unmap Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 12/22] virtio-iommu: Implement translate Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 13/22] virtio-iommu: Implement probe request Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 14/22] virtio-iommu: Add an msi_bypass property Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 15/22] virtio-iommu: Implement fault reporting Eric Auger
2018-03-21 13:15   ` Jean-Philippe Brucker
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 16/22] virtio_iommu: Handle reserved regions in translation process Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 17/22] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 18/22] hw/arm/virt-acpi-build: Add virtio-iommu node in IORT table Eric Auger
2018-02-13 12:24   ` Andrew Jones
2018-02-13 13:22     ` Auger Eric
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 19/22] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 20/22] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 21/22] virtio-iommu: Implement set_page_size_mask Eric Auger
2018-02-12 18:58 ` [Qemu-devel] [RFC v6 22/22] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
