All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
@ 2017-09-19  7:46 Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 01/16] update-linux-headers: import virtio_iommu.h Eric Auger
                   ` (17 more replies)
  0 siblings, 18 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This series implements the virtio-iommu device.

This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
- probe request support although no reserved region is returned at
  the moment
- unmap semantics less strict, as specified in v0.4
- device registration, attach/detach revisited
- split into smaller patches to ease review
- propose a way to inform the IOMMU mr about the page_size_mask
  of underlying HW IOMMU, if any
- remove warning associated with the translation of the MSI doorbell

The device gets instantiated using the "-device virtio-iommu-device"
option. It currently works with ARM virt machine only, as the machine
must handle the dt binding between the virtio-mmio "iommu" node and
the PCI host bridge node.

The associated VHOST/VFIO adaptation is available on the branch
below but is not officially delivered as part of this series.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4

References:
[1] [RFC] virtio-iommu version 0.4
    git://linux-arm.org/virtio-iommu.git branch viommu/v0.4

Testing:
- guest kernel with v0.4 virtio-iommu driver (4kB page)
- tested with guest using virtio-pci-net and vhost-net
  (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
- tested with VFIO and guest assigned with 2 VFs
- tested with DPDK on guest with 2 assigned VFs

Not tested:
- hot-plug/hot-unplug of EP: not implemented

History:
v2-> v4:
- see above

v2 -> v3:
- rebase on top of 2.10-rc0 and especially
  [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
- add mutex init
- fix as->mappings deletion using g_tree_ref/unref
- when a dev is attached whereas it is already attached to
  another address space, first detach it
- fix some error values
- page_sizes = TARGET_PAGE_MASK;
- I haven't changed the unmap() semantics yet, waiting for the
  next virtio-iommu spec revision.

v1 -> v2:
- fix redifinition of viommu_as typedef

Eric Auger (16):
  update-linux-headers: import virtio_iommu.h
  linux-headers: Update for virtio-iommu
  virtio-iommu: add skeleton
  virtio-iommu: Decode the command payload
  virtio-iommu: Add the iommu regions
  virtio-iommu: Register attached devices
  virtio-iommu: Implement attach/detach command
  virtio-iommu: Implement map/unmap
  virtio-iommu: Implement translate
  virtio-iommu: Implement probe request
  hw/arm/virt: Add 2.11 machine type
  hw/arm/virt: Add virtio-iommu to the virt board
  memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
  hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
  virtio-iommu: Implement set_page_size_mask
  hw/vfio/common: Do not print error when viommu translates into an mmio
    region

 hw/arm/virt.c                                 | 115 +++-
 hw/vfio/common.c                              |   7 +-
 hw/virtio/Makefile.objs                       |   1 +
 hw/virtio/trace-events                        |  24 +
 hw/virtio/virtio-iommu.c                      | 923 ++++++++++++++++++++++++++
 include/exec/memory.h                         |   4 +
 include/hw/arm/virt.h                         |   5 +
 include/hw/vfio/vfio-common.h                 |   1 +
 include/hw/virtio/virtio-iommu.h              |  61 ++
 include/standard-headers/linux/virtio_ids.h   |   1 +
 include/standard-headers/linux/virtio_iommu.h | 177 +++++
 linux-headers/linux/virtio_iommu.h            |   1 +
 scripts/update-linux-headers.sh               |   3 +
 13 files changed, 1312 insertions(+), 11 deletions(-)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h
 create mode 100644 include/standard-headers/linux/virtio_iommu.h
 create mode 100644 linux-headers/linux/virtio_iommu.h

-- 
2.5.5

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 01/16] update-linux-headers: import virtio_iommu.h
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 02/16] linux-headers: Update for virtio-iommu Eric Auger
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

Update the script to update the virtio_iommu.h header.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 scripts/update-linux-headers.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 2f906c4..03f6712 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -134,6 +134,9 @@ EOF
 cat <<EOF >$output/linux-headers/linux/virtio_config.h
 #include "standard-headers/linux/virtio_config.h"
 EOF
+cat <<EOF >$output/linux-headers/linux/virtio_iommu.h
+#include "standard-headers/linux/virtio_iommu.h"
+EOF
 cat <<EOF >$output/linux-headers/linux/virtio_ring.h
 #include "standard-headers/linux/virtio_ring.h"
 EOF
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 02/16] linux-headers: Update for virtio-iommu
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 01/16] update-linux-headers: import virtio_iommu.h Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 03/16] virtio-iommu: add skeleton Eric Auger
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This is a partial linux header update against Jean-Philippe's branch:
git://linux-arm.org/linux-jpb.git virtio-iommu/v0.4 (unstable)

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3- >v3:
- upgrade the API from v0.1 to v0.4
---
 include/standard-headers/linux/virtio_ids.h   |   1 +
 include/standard-headers/linux/virtio_iommu.h | 177 ++++++++++++++++++++++++++
 linux-headers/linux/virtio_iommu.h            |   1 +
 3 files changed, 179 insertions(+)
 create mode 100644 include/standard-headers/linux/virtio_iommu.h
 create mode 100644 linux-headers/linux/virtio_iommu.h

diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 6d5c3b2..934ed3d 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_IOMMU	    61216 /* virtio IOMMU (temporary) */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
new file mode 100644
index 0000000..7f486cf
--- /dev/null
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -0,0 +1,177 @@
+/*
+ * Virtio-iommu definition v0.4
+ *
+ * Copyright (C) 2017 ARM Ltd.
+ *
+ * This header is BSD licensed so anyone can use the definitions
+ * to implement compatible drivers/servers:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of ARM Ltd. nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#ifndef _LINUX_VIRTIO_IOMMU_H
+#define _LINUX_VIRTIO_IOMMU_H
+
+/* Feature bits */
+#define VIRTIO_IOMMU_F_INPUT_RANGE		0
+#define VIRTIO_IOMMU_F_IOASID_BITS		1
+#define VIRTIO_IOMMU_F_MAP_UNMAP		2
+#define VIRTIO_IOMMU_F_BYPASS			3
+#define VIRTIO_IOMMU_F_PROBE			4
+
+struct virtio_iommu_config {
+	/* Supported page sizes */
+	uint64_t					page_size_mask;
+	struct virtio_iommu_range {
+		uint64_t				start;
+		uint64_t				end;
+	} input_range;
+	uint8_t 					ioasid_bits;
+	uint8_t					padding[3];
+	uint32_t					probe_size;
+} QEMU_PACKED;
+
+/* Request types */
+#define VIRTIO_IOMMU_T_ATTACH			0x01
+#define VIRTIO_IOMMU_T_DETACH			0x02
+#define VIRTIO_IOMMU_T_MAP			0x03
+#define VIRTIO_IOMMU_T_UNMAP			0x04
+#define VIRTIO_IOMMU_T_PROBE			0x05
+
+/* Status types */
+#define VIRTIO_IOMMU_S_OK			0x00
+#define VIRTIO_IOMMU_S_IOERR			0x01
+#define VIRTIO_IOMMU_S_UNSUPP			0x02
+#define VIRTIO_IOMMU_S_DEVERR			0x03
+#define VIRTIO_IOMMU_S_INVAL			0x04
+#define VIRTIO_IOMMU_S_RANGE			0x05
+#define VIRTIO_IOMMU_S_NOENT			0x06
+#define VIRTIO_IOMMU_S_FAULT			0x07
+
+struct virtio_iommu_req_head {
+	uint8_t					type;
+	uint8_t					reserved[3];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_tail {
+	uint8_t					status;
+	uint8_t					reserved[3];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_attach {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					address_space;
+	uint32_t					device;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+struct virtio_iommu_req_detach {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					device;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_MAP_F_READ			(1 << 0)
+#define VIRTIO_IOMMU_MAP_F_WRITE		(1 << 1)
+#define VIRTIO_IOMMU_MAP_F_EXEC			(1 << 2)
+
+#define VIRTIO_IOMMU_MAP_F_MASK			(VIRTIO_IOMMU_MAP_F_READ |	\
+						 VIRTIO_IOMMU_MAP_F_WRITE |	\
+						 VIRTIO_IOMMU_MAP_F_EXEC)
+
+struct virtio_iommu_req_map {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					address_space;
+	uint64_t					virt_addr;
+	uint64_t					phys_addr;
+	uint64_t					size;
+	uint32_t					flags;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+QEMU_PACKED
+struct virtio_iommu_req_unmap {
+	struct virtio_iommu_req_head		head;
+
+	uint32_t					address_space;
+	uint64_t					virt_addr;
+	uint64_t					size;
+	uint32_t					reserved;
+
+	struct virtio_iommu_req_tail		tail;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_PROBE_RESV_MEM_T_ABORT	0
+#define VIRTIO_IOMMU_PROBE_RESV_MEM_T_BYPASS	1
+
+#define VIRTIO_IOMMU_PROBE_RESV_MEM_F_MSI	(1 << 0)
+
+struct virtio_iommu_probe_resv_mem {
+	uint8_t					subtype;
+	uint8_t					reserved[3];
+	uint64_t					addr;
+	uint64_t					size;
+	uint32_t					flags;
+} QEMU_PACKED;
+
+#define VIRTIO_IOMMU_PROBE_T_NONE		0
+#define VIRTIO_IOMMU_PROBE_T_RESV_MEM		2
+
+#define VIRTIO_IOMMU_PROBE_T_MASK		0xfff
+
+struct virtio_iommu_probe_property {
+	uint16_t					type;
+	uint16_t					length;
+	uint8_t					value[];
+} QEMU_PACKED;
+
+struct virtio_iommu_req_probe {
+	struct virtio_iommu_req_head		head;
+	uint32_t					device;
+	uint8_t					reserved[64];
+
+	uint8_t					properties[];
+
+	/* Tail follows the variable-length properties array (no padding) */
+} QEMU_PACKED;
+
+union virtio_iommu_req {
+	struct virtio_iommu_req_head		head;
+
+	struct virtio_iommu_req_attach		attach;
+	struct virtio_iommu_req_detach		detach;
+	struct virtio_iommu_req_map		map;
+	struct virtio_iommu_req_unmap		unmap;
+	struct virtio_iommu_req_probe		probe;
+};
+
+#endif
diff --git a/linux-headers/linux/virtio_iommu.h b/linux-headers/linux/virtio_iommu.h
new file mode 100644
index 0000000..2dc4609
--- /dev/null
+++ b/linux-headers/linux/virtio_iommu.h
@@ -0,0 +1 @@
+#include "standard-headers/linux/virtio_iommu.h"
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 03/16] virtio-iommu: add skeleton
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 01/16] update-linux-headers: import virtio_iommu.h Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 02/16] linux-headers: Update for virtio-iommu Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 04/16] virtio-iommu: Decode the command payload Eric Auger
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patchs adds the skeleton for the virtio-iommu device.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- use page_size_mask instead of page_sizes
- added set_features()
- added some traces (reset, set_status, set_features)
- empty virtio_iommu_set_config() as the driver MUST NOT
  write to device configuration fields
- add get_config trace

v2 -> v3:
- rebase on 2.10-rc0, ie. use IOMMUMemoryRegion and remove
  iommu_ops.
- advertise VIRTIO_IOMMU_F_MAP_UNMAP feature
- page_sizes set to TARGET_PAGE_SIZE
---
 hw/virtio/Makefile.objs          |   1 +
 hw/virtio/trace-events           |   7 ++
 hw/virtio/virtio-iommu.c         | 255 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |  59 +++++++++
 4 files changed, 322 insertions(+)
 create mode 100644 hw/virtio/virtio-iommu.c
 create mode 100644 include/hw/virtio/virtio-iommu.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 765d363..8967a4a 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o 
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+obj-$(CONFIG_LINUX) += virtio-iommu.o
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 obj-y += virtio-crypto.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio-crypto-pci.o
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 775461a..98cf9a1 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -25,3 +25,10 @@ virtio_balloon_handle_output(const char *name, uint64_t gpa) "section name: %s g
 virtio_balloon_get_config(uint32_t num_pages, uint32_t actual) "num_pages: %d actual: %d"
 virtio_balloon_set_config(uint32_t actual, uint32_t oldactual) "actual: %d oldactual: %d"
 virtio_balloon_to_target(uint64_t target, uint32_t num_pages) "balloon target: 0x%"PRIx64" num_pages: %d"
+
+# hw/virtio/virtio-iommu.c
+#
+virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0x%"PRIx64
+virtio_iommu_device_reset(void) "reset!"
+virtio_iommu_device_status(uint8_t status) "driver status = %d"
+virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint8_t ioasid_bits, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" ioasid_bits=%d probe_size=0x%x"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
new file mode 100644
index 0000000..36e5b5f
--- /dev/null
+++ b/hw/virtio/virtio-iommu.c
@@ -0,0 +1,255 @@
+/*
+ * virtio-iommu device
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/iov.h"
+#include "qemu-common.h"
+#include "hw/virtio/virtio.h"
+#include "sysemu/kvm.h"
+#include "qapi-event.h"
+#include "trace.h"
+
+#include "standard-headers/linux/virtio_ids.h"
+#include <linux/virtio_iommu.h>
+
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/virtio-iommu.h"
+
+/* Max size */
+#define VIOMMU_DEFAULT_QUEUE_SIZE 256
+
+static int virtio_iommu_handle_attach(VirtIOIOMMU *s,
+                                      struct iovec *iov,
+                                      unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_detach(VirtIOIOMMU *s,
+                                      struct iovec *iov,
+                                      unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_map(VirtIOIOMMU *s,
+                                   struct iovec *iov,
+                                   unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
+                                     struct iovec *iov,
+                                     unsigned int iov_cnt)
+{
+    return -ENOENT;
+}
+
+static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
+    VirtQueueElement *elem;
+    struct virtio_iommu_req_head head;
+    struct virtio_iommu_req_tail tail;
+    unsigned int iov_cnt;
+    struct iovec *iov;
+    size_t sz;
+
+    for (;;) {
+        elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+        if (!elem) {
+            return;
+        }
+
+        if (iov_size(elem->in_sg, elem->in_num) < sizeof(tail) ||
+            iov_size(elem->out_sg, elem->out_num) < sizeof(head)) {
+            virtio_error(vdev, "virtio-iommu erroneous head or tail");
+            virtqueue_detach_element(vq, elem, 0);
+            g_free(elem);
+            break;
+        }
+
+        iov_cnt = elem->out_num;
+        iov = g_memdup(elem->out_sg, sizeof(struct iovec) * elem->out_num);
+        sz = iov_to_buf(iov, iov_cnt, 0, &head, sizeof(head));
+        if (sz != sizeof(head)) {
+            tail.status = VIRTIO_IOMMU_S_UNSUPP;
+        }
+        qemu_mutex_lock(&s->mutex);
+        switch (head.type) {
+        case VIRTIO_IOMMU_T_ATTACH:
+            tail.status = virtio_iommu_handle_attach(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_DETACH:
+            tail.status = virtio_iommu_handle_detach(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_MAP:
+            tail.status = virtio_iommu_handle_map(s, iov, iov_cnt);
+            break;
+        case VIRTIO_IOMMU_T_UNMAP:
+            tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
+            break;
+        default:
+            tail.status = VIRTIO_IOMMU_S_UNSUPP;
+        }
+        qemu_mutex_unlock(&s->mutex);
+
+        sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                          &tail, sizeof(tail));
+        assert(sz == sizeof(tail));
+
+        virtqueue_push(vq, elem, sizeof(tail));
+        virtio_notify(vdev, vq);
+        g_free(elem);
+    }
+}
+
+static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
+{
+    VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
+    struct virtio_iommu_config *config = &dev->config;
+
+    trace_virtio_iommu_get_config(config->page_size_mask,
+                                  config->input_range.start,
+                                  config->input_range.end,
+                                  config->ioasid_bits,
+                                  config->probe_size);
+    memcpy(config_data, &dev->config, sizeof(struct virtio_iommu_config));
+}
+
+static void virtio_iommu_set_config(VirtIODevice *vdev,
+                                      const uint8_t *config_data)
+{
+}
+
+static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
+                                            Error **errp)
+{
+    VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
+    f |= dev->host_features;
+    virtio_add_feature(&f, VIRTIO_RING_F_EVENT_IDX);
+    virtio_add_feature(&f, VIRTIO_RING_F_INDIRECT_DESC);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_INPUT_RANGE);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_MAP_UNMAP);
+    return f;
+}
+
+static void virtio_iommu_set_features(VirtIODevice *vdev, uint64_t val)
+{
+    trace_virtio_iommu_set_features(val);
+}
+
+static int virtio_iommu_post_load_device(void *opaque, int version_id)
+{
+    return 0;
+}
+
+static const VMStateDescription vmstate_virtio_iommu_device = {
+    .name = "virtio-iommu-device",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = virtio_iommu_post_load_device,
+    .fields = (VMStateField[]) {
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
+
+    virtio_init(vdev, "virtio-iommu", VIRTIO_ID_IOMMU,
+                sizeof(struct virtio_iommu_config));
+
+    s->vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE,
+                             virtio_iommu_handle_command);
+
+    s->config.page_size_mask = TARGET_PAGE_MASK;
+    s->config.input_range.end = -1UL;
+}
+
+static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+
+    virtio_cleanup(vdev);
+}
+
+static void virtio_iommu_device_reset(VirtIODevice *vdev)
+{
+    trace_virtio_iommu_device_reset();
+}
+
+static void virtio_iommu_set_status(VirtIODevice *vdev, uint8_t status)
+{
+    trace_virtio_iommu_device_status(status);
+}
+
+static void virtio_iommu_instance_init(Object *obj)
+{
+}
+
+static const VMStateDescription vmstate_virtio_iommu = {
+    .name = "virtio-iommu",
+    .minimum_version_id = 1,
+    .version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_VIRTIO_DEVICE,
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static Property virtio_iommu_properties[] = {
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_iommu_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+    dc->props = virtio_iommu_properties;
+    dc->vmsd = &vmstate_virtio_iommu;
+
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    vdc->realize = virtio_iommu_device_realize;
+    vdc->unrealize = virtio_iommu_device_unrealize;
+    vdc->reset = virtio_iommu_device_reset;
+    vdc->get_config = virtio_iommu_get_config;
+    vdc->set_config = virtio_iommu_set_config;
+    vdc->get_features = virtio_iommu_get_features;
+    vdc->set_features = virtio_iommu_set_features;
+    vdc->set_status = virtio_iommu_set_status;
+    vdc->vmsd = &vmstate_virtio_iommu_device;
+}
+
+static const TypeInfo virtio_iommu_info = {
+    .name = TYPE_VIRTIO_IOMMU,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .instance_size = sizeof(VirtIOIOMMU),
+    .instance_init = virtio_iommu_instance_init,
+    .class_init = virtio_iommu_class_init,
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&virtio_iommu_info);
+}
+
+type_init(virtio_register_types)
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
new file mode 100644
index 0000000..6716cdb
--- /dev/null
+++ b/include/hw/virtio/virtio-iommu.h
@@ -0,0 +1,59 @@
+/*
+ * virtio-iommu device
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef QEMU_VIRTIO_IOMMU_H
+#define QEMU_VIRTIO_IOMMU_H
+
+#include "standard-headers/linux/virtio_iommu.h"
+#include "hw/virtio/virtio.h"
+#include "hw/pci/pci.h"
+
+#define TYPE_VIRTIO_IOMMU "virtio-iommu-device"
+#define VIRTIO_IOMMU(obj) \
+        OBJECT_CHECK(VirtIOIOMMU, (obj), TYPE_VIRTIO_IOMMU)
+
+#define IOMMU_PCI_BUS_MAX      256
+#define IOMMU_PCI_DEVFN_MAX    256
+
+typedef struct IOMMUDevice {
+    void         *viommu;
+    PCIBus       *bus;
+    int           devfn;
+    IOMMUMemoryRegion  iommu_mr;
+    AddressSpace  as;
+} IOMMUDevice;
+
+typedef struct IOMMUPciBus {
+    PCIBus       *bus;
+    IOMMUDevice  *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} IOMMUPciBus;
+
+typedef struct VirtIOIOMMU {
+    VirtIODevice parent_obj;
+    VirtQueue *vq;
+    struct virtio_iommu_config config;
+    uint32_t host_features;
+    GHashTable *as_by_busptr;
+    IOMMUPciBus *as_by_bus_num[IOMMU_PCI_BUS_MAX];
+    GTree *address_spaces;
+    QemuMutex mutex;
+    GTree *devices;
+} VirtIOIOMMU;
+
+#endif
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 04/16] virtio-iommu: Decode the command payload
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (2 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 03/16] virtio-iommu: add skeleton Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 05/16] virtio-iommu: Add the iommu regions Eric Auger
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch adds the command payload decoding and
introduces the functions that will do the actual
command handling. Those functions are not yet implemented.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- no flags field anymore in struct virtio_iommu_req_unmap
- test reserved on attach/detach, change trace proto

v3 -> v4:
- rebase on v2.10.0.
---
 hw/virtio/trace-events   |   4 ++
 hw/virtio/virtio-iommu.c | 104 +++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 98cf9a1..a35dc62 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -32,3 +32,7 @@ virtio_iommu_set_features(uint64_t features) "features accepted by the driver =0
 virtio_iommu_device_reset(void) "reset!"
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
 virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint8_t ioasid_bits, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" ioasid_bits=%d probe_size=0x%x"
+virtio_iommu_attach(uint32_t as, uint32_t dev) "as=%d dev=%d"
+virtio_iommu_detach(uint32_t dev) "dev=%d"
+virtio_iommu_map(uint32_t as, uint64_t phys_addr, uint64_t virt_addr, uint64_t size, uint32_t flags) "as= %d phys_addr=0x%"PRIx64" virt_addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d"
+virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size) "as= %d virt_addr=0x%"PRIx64" size=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 36e5b5f..51f9003 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -35,29 +35,125 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+static int virtio_iommu_attach(VirtIOIOMMU *s,
+                               struct virtio_iommu_req_attach *req)
+{
+    uint32_t asid = le32_to_cpu(req->address_space);
+    uint32_t devid = le32_to_cpu(req->device);
+    uint32_t reserved = le32_to_cpu(req->reserved);
+
+    trace_virtio_iommu_attach(asid, devid);
+
+    if (reserved) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_detach(VirtIOIOMMU *s,
+                               struct virtio_iommu_req_detach *req)
+{
+    uint32_t devid = le32_to_cpu(req->device);
+    uint32_t reserved = le32_to_cpu(req->reserved);
+
+    trace_virtio_iommu_detach(devid);
+
+    if (reserved) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_map(VirtIOIOMMU *s,
+                            struct virtio_iommu_req_map *req)
+{
+    uint32_t asid = le32_to_cpu(req->address_space);
+    uint64_t phys_addr = le64_to_cpu(req->phys_addr);
+    uint64_t virt_addr = le64_to_cpu(req->virt_addr);
+    uint64_t size = le64_to_cpu(req->size);
+    uint32_t flags = le32_to_cpu(req->flags);
+
+    trace_virtio_iommu_map(asid, phys_addr, virt_addr, size, flags);
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+static int virtio_iommu_unmap(VirtIOIOMMU *s,
+                              struct virtio_iommu_req_unmap *req)
+{
+    uint32_t asid = le32_to_cpu(req->address_space);
+    uint64_t virt_addr = le64_to_cpu(req->virt_addr);
+    uint64_t size = le64_to_cpu(req->size);
+
+    trace_virtio_iommu_unmap(asid, virt_addr, size);
+
+    return VIRTIO_IOMMU_S_UNSUPP;
+}
+
+#define get_payload_size(req) (\
+sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
+
 static int virtio_iommu_handle_attach(VirtIOIOMMU *s,
                                       struct iovec *iov,
                                       unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_attach req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_attach(s, &req);
 }
 static int virtio_iommu_handle_detach(VirtIOIOMMU *s,
                                       struct iovec *iov,
                                       unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_detach req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_detach(s, &req);
 }
 static int virtio_iommu_handle_map(VirtIOIOMMU *s,
                                    struct iovec *iov,
                                    unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_map req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_map(s, &req);
 }
 static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
                                      struct iovec *iov,
                                      unsigned int iov_cnt)
 {
-    return -ENOENT;
+    struct virtio_iommu_req_unmap req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+    return virtio_iommu_unmap(s, &req);
 }
 
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 05/16] virtio-iommu: Add the iommu regions
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (3 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 04/16] virtio-iommu: Decode the command payload Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices Eric Auger
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch initializes the iommu memory regions so that
PCIe end point transactions get translated. The translation function
is not yet implemented at that stage.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- add trace_virtio_iommu_init_iommu_mr

v2 -> v3:
- use IOMMUMemoryRegion
- iommu mr name built with BDF
- rename smmu_get_sid into virtio_iommu_get_sid and use PCI_BUILD_BDF
---
 hw/virtio/trace-events           |   2 +
 hw/virtio/virtio-iommu.c         | 119 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-iommu.h |   2 +
 3 files changed, 123 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index a35dc62..bc65356 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -36,3 +36,5 @@ virtio_iommu_attach(uint32_t as, uint32_t dev) "as=%d dev=%d"
 virtio_iommu_detach(uint32_t dev) "dev=%d"
 virtio_iommu_map(uint32_t as, uint64_t phys_addr, uint64_t virt_addr, uint64_t size, uint32_t flags) "as= %d phys_addr=0x%"PRIx64" virt_addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d"
 virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size) "as= %d virt_addr=0x%"PRIx64" size=0x%"PRIx64
+virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
+virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 51f9003..f4cb76f 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -23,6 +23,7 @@
 #include "hw/virtio/virtio.h"
 #include "sysemu/kvm.h"
 #include "qapi-event.h"
+#include "qemu/error-report.h"
 #include "trace.h"
 
 #include "standard-headers/linux/virtio_ids.h"
@@ -35,6 +36,66 @@
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
+{
+    return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
+}
+
+static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
+                                              int devfn)
+{
+    VirtIOIOMMU *s = opaque;
+    uintptr_t key = (uintptr_t)bus;
+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, &key);
+    IOMMUDevice *sdev;
+
+    if (!sbus) {
+        uintptr_t *new_key = g_malloc(sizeof(*new_key));
+
+        *new_key = (uintptr_t)bus;
+        sbus = g_malloc0(sizeof(IOMMUPciBus) +
+                         sizeof(IOMMUDevice *) * IOMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->as_by_busptr, new_key, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        char *name = g_strdup_printf("%s-%d-%d",
+                                     TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+                                     pci_bus_num(bus), devfn);
+        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(IOMMUDevice));
+
+        sdev->viommu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        trace_virtio_iommu_init_iommu_mr(name);
+
+        memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
+                                 TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+                                 OBJECT(s), name,
+                                 UINT64_MAX);
+        address_space_init(&sdev->as,
+                           MEMORY_REGION(&sdev->iommu_mr), TYPE_VIRTIO_IOMMU);
+    }
+
+    return &sdev->as;
+
+}
+
+static void virtio_iommu_init_as(VirtIOIOMMU *s)
+{
+    PCIBus *pcibus = pci_find_primary_bus();
+
+    if (pcibus) {
+        pci_setup_iommu(pcibus, virtio_iommu_find_add_as, s);
+    } else {
+        error_report("No PCI bus, virtio-iommu is not registered");
+    }
+}
+
+
 static int virtio_iommu_attach(VirtIOIOMMU *s,
                                struct virtio_iommu_req_attach *req)
 {
@@ -215,6 +276,26 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
     }
 }
 
+static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+                                            IOMMUAccessFlags flag)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    uint32_t sid;
+
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = IOMMU_NONE,
+    };
+
+    sid = virtio_iommu_get_sid(sdev);
+
+    trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+    return entry;
+}
+
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
@@ -265,6 +346,21 @@ static const VMStateDescription vmstate_virtio_iommu_device = {
     },
 };
 
+/*****************************
+ * Hash Table
+ *****************************/
+
+static inline gboolean as_uint64_equal(gconstpointer v1, gconstpointer v2)
+{
+    return *((const uint64_t *)v1) == *((const uint64_t *)v2);
+}
+
+static inline guint as_uint64_hash(gconstpointer v)
+{
+    return (guint)*(const uint64_t *)v;
+}
+
+
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -278,6 +374,13 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
+
+    memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
+    s->as_by_busptr = g_hash_table_new_full(as_uint64_hash,
+                                            as_uint64_equal,
+                                            g_free, g_free);
+
+    virtio_iommu_init_as(s);
 }
 
 static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
@@ -335,6 +438,14 @@ static void virtio_iommu_class_init(ObjectClass *klass, void *data)
     vdc->vmsd = &vmstate_virtio_iommu_device;
 }
 
+static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
+                                                  void *data)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+    imrc->translate = virtio_iommu_translate;
+}
+
 static const TypeInfo virtio_iommu_info = {
     .name = TYPE_VIRTIO_IOMMU,
     .parent = TYPE_VIRTIO_DEVICE,
@@ -343,9 +454,17 @@ static const TypeInfo virtio_iommu_info = {
     .class_init = virtio_iommu_class_init,
 };
 
+static const TypeInfo virtio_iommu_memory_region_info = {
+    .parent = TYPE_IOMMU_MEMORY_REGION,
+    .name = TYPE_VIRTIO_IOMMU_MEMORY_REGION,
+    .class_init = virtio_iommu_memory_region_class_init,
+};
+
+
 static void virtio_register_types(void)
 {
     type_register_static(&virtio_iommu_info);
+    type_register_static(&virtio_iommu_memory_region_info);
 }
 
 type_init(virtio_register_types)
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 6716cdb..f9c988f 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -28,6 +28,8 @@
 #define VIRTIO_IOMMU(obj) \
         OBJECT_CHECK(VirtIOIOMMU, (obj), TYPE_VIRTIO_IOMMU)
 
+#define TYPE_VIRTIO_IOMMU_MEMORY_REGION "virtio-iommu-memory-region"
+
 #define IOMMU_PCI_BUS_MAX      256
 #define IOMMU_PCI_DEVFN_MAX    256
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (4 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 05/16] virtio-iommu: Add the iommu regions Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-22  7:29   ` Bharat Bhushan
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 07/16] virtio-iommu: Implement attach/detach command Eric Auger
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch introduce address space and device internal
datatypes. Both are stored in RB trees. The address space
owns a list of devices attached to it.

It is assumed the devid corresponds to the PCI BDF.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v3 -> v4:
- new separate patch
---
 hw/virtio/trace-events   |   4 ++
 hw/virtio/virtio-iommu.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 124 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index bc65356..74b92d3 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -38,3 +38,7 @@ virtio_iommu_map(uint32_t as, uint64_t phys_addr, uint64_t virt_addr, uint64_t s
 virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size) "as= %d virt_addr=0x%"PRIx64" size=0x%"PRIx64
 virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
 virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
+virtio_iommu_get_dev(uint32_t devid) "Alloc devid=%d"
+virtio_iommu_put_dev(uint32_t devid) "Free devid=%d"
+virtio_iommu_get_as(uint32_t asid) "Alloc asid=%d"
+virtio_iommu_put_as(uint32_t asid) "Free asid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index f4cb76f..41a4bbc 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -32,15 +32,116 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 #include "hw/virtio/virtio-iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci.h"
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
 
+typedef struct viommu_as {
+    uint32_t id;
+    GTree *mappings;
+    QLIST_HEAD(, viommu_dev) device_list;
+} viommu_as;
+
+typedef struct viommu_dev {
+    uint32_t id;
+    viommu_as *as;
+    QLIST_ENTRY(viommu_dev) next;
+    VirtIOIOMMU *viommu;
+} viommu_dev;
+
+typedef struct viommu_interval {
+    uint64_t low;
+    uint64_t high;
+} viommu_interval;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
 }
 
+static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
+{
+    viommu_interval *inta = (viommu_interval *)a;
+    viommu_interval *intb = (viommu_interval *)b;
+
+    if (inta->high <= intb->low) {
+        return -1;
+    } else if (intb->high <= inta->low) {
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
+static void virtio_iommu_detach_dev_from_as(viommu_dev *dev)
+{
+    QLIST_REMOVE(dev, next);
+    dev->as = NULL;
+}
+
+static viommu_dev *virtio_iommu_get_dev(VirtIOIOMMU *s, uint32_t devid)
+{
+    viommu_dev *dev;
+
+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
+    if (dev) {
+        return dev;
+    }
+    dev = g_malloc0(sizeof(*dev));
+    dev->id = devid;
+    dev->viommu = s;
+    trace_virtio_iommu_get_dev(devid);
+    g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
+    return dev;
+}
+
+static void virtio_iommu_put_dev(gpointer data)
+{
+    viommu_dev *dev = (viommu_dev *)data;
+
+    if (dev->as) {
+        virtio_iommu_detach_dev_from_as(dev);
+        g_tree_unref(dev->as->mappings);
+    }
+
+    trace_virtio_iommu_put_dev(dev->id);
+    g_free(dev);
+}
+
+viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid);
+viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid)
+{
+    viommu_as *as;
+
+    as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid));
+    if (as) {
+        return as;
+    }
+    as = g_malloc0(sizeof(*as));
+    as->id = asid;
+    as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
+                                   NULL, (GDestroyNotify)g_free,
+                                   (GDestroyNotify)g_free);
+    g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
+    trace_virtio_iommu_get_as(asid);
+    return as;
+}
+
+static void virtio_iommu_put_as(gpointer data)
+{
+    viommu_as *as = (viommu_as *)data;
+    viommu_dev *iter, *tmp;
+
+    QLIST_FOREACH_SAFE(iter, &as->device_list, next, tmp) {
+        virtio_iommu_detach_dev_from_as(iter);
+    }
+    g_tree_destroy(as->mappings);
+    trace_virtio_iommu_put_as(as->id);
+    g_free(as);
+}
+
 static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
                                               int devfn)
 {
@@ -70,6 +171,8 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
         sdev->bus = bus;
         sdev->devfn = devfn;
 
+        virtio_iommu_get_dev(s, PCI_BUILD_BDF(pci_bus_num(bus), devfn));
+
         trace_virtio_iommu_init_iommu_mr(name);
 
         memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
@@ -360,6 +463,12 @@ static inline guint as_uint64_hash(gconstpointer v)
     return (guint)*(const uint64_t *)v;
 }
 
+static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
+{
+    uint ua = GPOINTER_TO_UINT(a);
+    uint ub = GPOINTER_TO_UINT(b);
+    return (ua > ub) - (ua < ub);
+}
 
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 {
@@ -375,17 +484,28 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
 
+    qemu_mutex_init(&s->mutex);
+
     memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
     s->as_by_busptr = g_hash_table_new_full(as_uint64_hash,
                                             as_uint64_equal,
                                             g_free, g_free);
 
+    s->address_spaces = g_tree_new_full((GCompareDataFunc)int_cmp,
+                                         NULL, NULL, virtio_iommu_put_as);
+    s->devices = g_tree_new_full((GCompareDataFunc)int_cmp,
+                                 NULL, NULL, virtio_iommu_put_dev);
+
     virtio_iommu_init_as(s);
 }
 
 static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
+
+    g_tree_destroy(s->address_spaces);
+    g_tree_destroy(s->devices);
 
     virtio_cleanup(vdev);
 }
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 07/16] virtio-iommu: Implement attach/detach command
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (5 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 08/16] virtio-iommu: Implement map/unmap Eric Auger
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements the device attach/detach to an
address space.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/virtio-iommu.c | 38 ++++++++++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 41a4bbc..bd859fb 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -110,8 +110,7 @@ static void virtio_iommu_put_dev(gpointer data)
     g_free(dev);
 }
 
-viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid);
-viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid)
+static viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid)
 {
     viommu_as *as;
 
@@ -205,6 +204,8 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t asid = le32_to_cpu(req->address_space);
     uint32_t devid = le32_to_cpu(req->device);
     uint32_t reserved = le32_to_cpu(req->reserved);
+    viommu_as *as;
+    viommu_dev *dev;
 
     trace_virtio_iommu_attach(asid, devid);
 
@@ -212,7 +213,22 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
         return VIRTIO_IOMMU_S_INVAL;
     }
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    dev = virtio_iommu_get_dev(s, devid);
+    if (dev->as) {
+        /*
+         * the device is already attached to an address space,
+         * detach it first
+         */
+        virtio_iommu_detach_dev_from_as(dev);
+    }
+
+    as = virtio_iommu_get_as(s, asid);
+    QLIST_INSERT_HEAD(&as->device_list, dev, next);
+
+    dev->as = as;
+    g_tree_ref(as->mappings);
+
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_detach(VirtIOIOMMU *s,
@@ -220,14 +236,24 @@ static int virtio_iommu_detach(VirtIOIOMMU *s,
 {
     uint32_t devid = le32_to_cpu(req->device);
     uint32_t reserved = le32_to_cpu(req->reserved);
-
-    trace_virtio_iommu_detach(devid);
+    viommu_dev *dev;
 
     if (reserved) {
         return VIRTIO_IOMMU_S_INVAL;
     }
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
+    if (!dev) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
+    if (!dev->as) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    virtio_iommu_detach_dev_from_as(dev);
+    trace_virtio_iommu_detach(devid);
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_map(VirtIOIOMMU *s,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 08/16] virtio-iommu: Implement map/unmap
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (6 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 07/16] virtio-iommu: Implement attach/detach command Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate Eric Auger
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements virtio_iommu_map/unmap.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v3 -> v4:
- implement unmap semantics as specified in v0.4
---
 hw/virtio/trace-events   |  3 ++
 hw/virtio/virtio-iommu.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 74b92d3..da298c1 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -42,3 +42,6 @@ virtio_iommu_get_dev(uint32_t devid) "Alloc devid=%d"
 virtio_iommu_put_dev(uint32_t devid) "Free devid=%d"
 virtio_iommu_get_as(uint32_t asid) "Alloc asid=%d"
 virtio_iommu_put_as(uint32_t asid) "Free asid=%d"
+virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index bd859fb..6f1a7d1 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -56,6 +56,13 @@ typedef struct viommu_interval {
     uint64_t high;
 } viommu_interval;
 
+typedef struct viommu_mapping {
+    uint64_t virt_addr;
+    uint64_t phys_addr;
+    uint64_t size;
+    uint32_t flags;
+} viommu_mapping;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -264,10 +271,37 @@ static int virtio_iommu_map(VirtIOIOMMU *s,
     uint64_t virt_addr = le64_to_cpu(req->virt_addr);
     uint64_t size = le64_to_cpu(req->size);
     uint32_t flags = le32_to_cpu(req->flags);
+    viommu_as *as;
+    viommu_interval *interval;
+    viommu_mapping *mapping;
+
+    interval = g_malloc0(sizeof(*interval));
+
+    interval->low = virt_addr;
+    interval->high = virt_addr + size - 1;
+
+    as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid));
+    if (!as) {
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+
+    mapping = g_tree_lookup(as->mappings, (gpointer)interval);
+    if (mapping) {
+        g_free(interval);
+        return VIRTIO_IOMMU_S_INVAL;
+    }
 
     trace_virtio_iommu_map(asid, phys_addr, virt_addr, size, flags);
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    mapping = g_malloc0(sizeof(*mapping));
+    mapping->virt_addr = virt_addr;
+    mapping->phys_addr = phys_addr;
+    mapping->size = size;
+    mapping->flags = flags;
+
+    g_tree_insert(as->mappings, interval, mapping);
+
+    return VIRTIO_IOMMU_S_OK;
 }
 
 static int virtio_iommu_unmap(VirtIOIOMMU *s,
@@ -276,10 +310,63 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     uint32_t asid = le32_to_cpu(req->address_space);
     uint64_t virt_addr = le64_to_cpu(req->virt_addr);
     uint64_t size = le64_to_cpu(req->size);
+    viommu_mapping *mapping;
+    viommu_interval interval;
+    viommu_as *as;
 
     trace_virtio_iommu_unmap(asid, virt_addr, size);
 
-    return VIRTIO_IOMMU_S_UNSUPP;
+    as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid));
+    if (!as) {
+        error_report("%s: no as", __func__);
+        return VIRTIO_IOMMU_S_NOENT;
+    }
+    interval.low = virt_addr;
+    interval.high = virt_addr + size - 1;
+
+    mapping = g_tree_lookup(as->mappings, (gpointer)(&interval));
+
+    while (mapping) {
+        viommu_interval current;
+        uint64_t low  = mapping->virt_addr;
+        uint64_t high = mapping->virt_addr + mapping->size - 1;
+
+        current.low = low;
+        current.high = high;
+
+        if (low == interval.low && size >= mapping->size) {
+            g_tree_remove(as->mappings, (gpointer)(&current));
+            interval.low = high + 1;
+            trace_virtio_iommu_unmap_left_interval(current.low, current.high,
+                interval.low, interval.high);
+        } else if (high == interval.high && size >= mapping->size) {
+            trace_virtio_iommu_unmap_right_interval(current.low, current.high,
+                interval.low, interval.high);
+            g_tree_remove(as->mappings, (gpointer)(&current));
+            interval.high = low - 1;
+        } else if (low > interval.low && high < interval.high) {
+            trace_virtio_iommu_unmap_inc_interval(current.low, current.high);
+            g_tree_remove(as->mappings, (gpointer)(&current));
+        } else {
+            break;
+        }
+        if (interval.low >= interval.high) {
+            return VIRTIO_IOMMU_S_OK;
+        } else {
+            mapping = g_tree_lookup(as->mappings, (gpointer)(&interval));
+        }
+    }
+
+    if (mapping) {
+        error_report("****** %s: Unmap 0x%"PRIx64" size=0x%"PRIx64
+                     " from 0x%"PRIx64" size=0x%"PRIx64" is not supported",
+                     __func__, interval.low, size,
+                     mapping->virt_addr, mapping->size);
+    } else {
+        return VIRTIO_IOMMU_S_OK;
+    }
+
+    return VIRTIO_IOMMU_S_INVAL;
 }
 
 #define get_payload_size(req) (\
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (7 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 08/16] virtio-iommu: Implement map/unmap Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-22  6:52   ` Bharat Bhushan
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request Eric Auger
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements the translate callback

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index da298c1..9010fbd 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -45,3 +45,4 @@ virtio_iommu_put_as(uint32_t asid) "Free asid=%d"
 virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
+virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 6f1a7d1..db46a91 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -496,19 +496,54 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
                                             IOMMUAccessFlags flag)
 {
     IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
     uint32_t sid;
+    viommu_dev *dev;
+    viommu_mapping *mapping;
+    viommu_interval interval;
+
+    interval.low = addr;
+    interval.high = addr + 1;
 
     IOMMUTLBEntry entry = {
         .target_as = &address_space_memory,
         .iova = addr,
         .translated_addr = addr,
-        .addr_mask = ~(hwaddr)0,
-        .perm = IOMMU_NONE,
+        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
+        .perm = flag,
     };
 
     sid = virtio_iommu_get_sid(sdev);
 
     trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
+    qemu_mutex_lock(&s->mutex);
+
+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
+    if (!dev) {
+        /* device cannot be attached to another as */
+        printf("%s sid=%d is not known!!\n", __func__, sid);
+        goto unlock;
+    }
+
+    mapping = g_tree_lookup(dev->as->mappings, (gpointer)(&interval));
+    if (!mapping) {
+        printf("%s no mapping for 0x%"PRIx64" for sid=%d\n", __func__,
+               addr, sid);
+        goto unlock;
+    }
+
+    if (((flag & IOMMU_RO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_READ)) ||
+        ((flag & IOMMU_WO) && !(mapping->flags & VIRTIO_IOMMU_MAP_F_WRITE))) {
+        error_report("Permission error on 0x%"PRIx64"(%d): allowed=%d",
+                     addr, flag, mapping->flags);
+        entry.perm = IOMMU_NONE;
+        goto unlock;
+    }
+    entry.translated_addr = addr - mapping->virt_addr + mapping->phys_addr,
+    trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
     return entry;
 }
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (8 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-27 10:53   ` Tomasz Nowicki
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 11/16] hw/arm/virt: Add 2.11 machine type Eric Auger
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This patch implements the PROBE request. At the moment,
no reserved regions are returned.

At the moment reserved regions are stored per device.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

Waiting for clarifications on v0.4 spec
---
 hw/virtio/trace-events   |   2 +
 hw/virtio/virtio-iommu.c | 173 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 173 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 9010fbd..9ccfad1 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -46,3 +46,5 @@ virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t next_low,
 virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"], new interval=[0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRIx64",0x%"PRIx64"]"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
+virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx"
+virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index db46a91..281b0f8 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -37,6 +37,11 @@
 
 /* Max size */
 #define VIOMMU_DEFAULT_QUEUE_SIZE 256
+#define VIOMMU_PROBE_SIZE 512
+
+#define SUPPORTED_PROBE_PROPERTIES (\
+    VIRTIO_IOMMU_PROBE_T_NONE | \
+    VIRTIO_IOMMU_PROBE_T_RESV_MEM)
 
 typedef struct viommu_as {
     uint32_t id;
@@ -49,6 +54,7 @@ typedef struct viommu_dev {
     viommu_as *as;
     QLIST_ENTRY(viommu_dev) next;
     VirtIOIOMMU *viommu;
+    GTree *reserved_regions;
 } viommu_dev;
 
 typedef struct viommu_interval {
@@ -63,6 +69,13 @@ typedef struct viommu_mapping {
     uint32_t flags;
 } viommu_mapping;
 
+typedef struct viommu_property_buffer {
+    viommu_dev *dev;
+    size_t filled;
+    uint8_t *start;
+    bool error;
+} viommu_property_buffer;
+
 static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
 {
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
@@ -101,6 +114,9 @@ static viommu_dev *virtio_iommu_get_dev(VirtIOIOMMU *s, uint32_t devid)
     dev->viommu = s;
     trace_virtio_iommu_get_dev(devid);
     g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
+    dev->reserved_regions = g_tree_new_full((GCompareDataFunc)interval_cmp,
+                                            NULL, (GDestroyNotify)g_free,
+                                            (GDestroyNotify)g_free);
     return dev;
 }
 
@@ -114,6 +130,7 @@ static void virtio_iommu_put_dev(gpointer data)
     }
 
     trace_virtio_iommu_put_dev(dev->id);
+    g_tree_destroy(dev->reserved_regions);
     g_free(dev);
 }
 
@@ -369,6 +386,123 @@ static int virtio_iommu_unmap(VirtIOIOMMU *s,
     return VIRTIO_IOMMU_S_INVAL;
 }
 
+static gboolean virtio_iommu_fill_resv_mem_prop(gpointer key,
+                                                gpointer value,
+                                                gpointer data)
+{
+    struct virtio_iommu_probe_resv_mem *resv =
+        (struct virtio_iommu_probe_resv_mem *)value;
+    struct virtio_iommu_probe_property *prop;
+    struct virtio_iommu_probe_resv_mem *current;
+    viommu_property_buffer *bufstate = (viommu_property_buffer *)data;
+    size_t size = sizeof(*resv), total_size;
+
+    total_size = size + 4;
+
+    if (bufstate->filled >= VIOMMU_PROBE_SIZE) {
+        bufstate->error = true;
+        return true;
+    }
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_RESV_MEM) &
+                    VIRTIO_IOMMU_PROBE_T_MASK;
+    prop->length = size;
+
+    current = (struct virtio_iommu_probe_resv_mem *)prop->value;
+    *current = *resv;
+    bufstate->filled += total_size;
+    trace_virtio_iommu_fill_resv_property(bufstate->dev->id,
+                                          resv->subtype, resv->addr,
+                                          resv->size, resv->flags,
+                                          bufstate->filled);
+    return false;
+}
+
+static int virtio_iommu_fill_none_prop(viommu_property_buffer *bufstate)
+{
+    struct virtio_iommu_probe_property *prop;
+
+    prop = (struct virtio_iommu_probe_property *)
+                (bufstate->start + bufstate->filled);
+    prop->type = cpu_to_le16(VIRTIO_IOMMU_PROBE_T_NONE)
+                    & VIRTIO_IOMMU_PROBE_T_MASK;
+    prop->length = 0;
+    bufstate->filled += 4;
+    trace_virtio_iommu_fill_none_property(bufstate->dev->id);
+    return 0;
+}
+
+static int virtio_iommu_fill_property(int devid, int type,
+                                      viommu_property_buffer *bufstate)
+{
+    int ret = -ENOSPC;
+
+    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
+        bufstate->error = true;
+        goto out;
+    }
+
+    switch (type) {
+    case VIRTIO_IOMMU_PROBE_T_NONE:
+        ret = virtio_iommu_fill_none_prop(bufstate);
+        break;
+    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
+    {
+        viommu_dev *dev = bufstate->dev;
+
+        g_tree_foreach(dev->reserved_regions,
+                       virtio_iommu_fill_resv_mem_prop,
+                       bufstate);
+        if (!bufstate->error) {
+            ret = 0;
+        }
+        break;
+    }
+    default:
+        ret = -ENOENT;
+        break;
+    }
+out:
+    if (ret) {
+        error_report("%s property of type=%d could not be filled (%d),"
+                     " remaining size = 0x%lx",
+                     __func__, type, ret, bufstate->filled);
+    }
+    return ret;
+}
+
+static int virtio_iommu_probe(VirtIOIOMMU *s,
+                              struct virtio_iommu_req_probe *req,
+                              uint8_t *buf)
+{
+    uint32_t devid = le32_to_cpu(req->device);
+    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
+    viommu_property_buffer bufstate;
+    viommu_dev *dev;
+    int ret;
+
+    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
+    if (!dev) {
+        return -EINVAL;
+    }
+
+    bufstate.start = buf;
+    bufstate.filled = 0;
+    bufstate.dev = dev;
+
+    while ((type = ctz32(prop_types)) != 32) {
+        ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate);
+        if (ret) {
+            break;
+        }
+        prop_types &= ~(1 << type);
+    }
+    virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE, &bufstate);
+
+    return VIRTIO_IOMMU_S_OK;
+}
+
 #define get_payload_size(req) (\
 sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
 
@@ -433,6 +567,24 @@ static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
     return virtio_iommu_unmap(s, &req);
 }
 
+static int virtio_iommu_handle_probe(VirtIOIOMMU *s,
+                                     struct iovec *iov,
+                                     unsigned int iov_cnt,
+                                     uint8_t *buf)
+{
+    struct virtio_iommu_req_probe req;
+    size_t sz, payload_sz;
+
+    payload_sz = get_payload_size(req);
+
+    sz = iov_to_buf(iov, iov_cnt, 0, &req, payload_sz);
+    if (sz != payload_sz) {
+        return VIRTIO_IOMMU_S_INVAL;
+    }
+
+    return virtio_iommu_probe(s, &req, buf);
+}
+
 static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
@@ -477,16 +629,31 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
         case VIRTIO_IOMMU_T_UNMAP:
             tail.status = virtio_iommu_handle_unmap(s, iov, iov_cnt);
             break;
+        case VIRTIO_IOMMU_T_PROBE:
+        {
+            struct virtio_iommu_req_tail *ptail;
+            uint8_t *buf = g_malloc0(s->config.probe_size + sizeof(tail));
+
+            ptail = (struct virtio_iommu_req_tail *)buf + s->config.probe_size;
+            ptail->status = virtio_iommu_handle_probe(s, iov, iov_cnt, buf);
+
+            sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
+                              buf, s->config.probe_size + sizeof(tail));
+            g_free(buf);
+            assert(sz == s->config.probe_size + sizeof(tail));
+            goto push;
+        }
         default:
             tail.status = VIRTIO_IOMMU_S_UNSUPP;
         }
-        qemu_mutex_unlock(&s->mutex);
 
         sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
                           &tail, sizeof(tail));
         assert(sz == sizeof(tail));
 
-        virtqueue_push(vq, elem, sizeof(tail));
+push:
+        qemu_mutex_unlock(&s->mutex);
+        virtqueue_push(vq, elem, sz);
         virtio_notify(vdev, vq);
         g_free(elem);
     }
@@ -574,6 +741,7 @@ static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
     virtio_add_feature(&f, VIRTIO_RING_F_INDIRECT_DESC);
     virtio_add_feature(&f, VIRTIO_IOMMU_F_INPUT_RANGE);
     virtio_add_feature(&f, VIRTIO_IOMMU_F_MAP_UNMAP);
+    virtio_add_feature(&f, VIRTIO_IOMMU_F_PROBE);
     return f;
 }
 
@@ -631,6 +799,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
 
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = -1UL;
+    s->config.probe_size = VIOMMU_PROBE_SIZE;
 
     qemu_mutex_init(&s->mutex);
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 11/16] hw/arm/virt: Add 2.11 machine type
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (9 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

The new machine type allows virtio-iommu instantiation.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 hw/arm/virt.c         | 23 +++++++++++++++++++++--
 include/hw/arm/virt.h |  1 +
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index cfd834d..9737a5f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1648,7 +1648,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_10_instance_init(Object *obj)
+static void virt_2_11_instance_init(Object *obj)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1708,10 +1708,29 @@ static void virt_2_10_instance_init(Object *obj)
     vms->irqmap = a15irqmap;
 }
 
+static void virt_machine_2_11_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(2, 11)
+
+#define VIRT_COMPAT_2_10 \
+    HW_COMPAT_2_10
+
+static void virt_2_10_instance_init(Object *obj)
+{
+    virt_2_11_instance_init(obj);
+}
+
 static void virt_machine_2_10_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
+    virt_machine_2_11_options(mc);
+    SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);
+
+    vmc->no_iommu = true;
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 10)
+DEFINE_VIRT_MACHINE(2, 10)
 
 #define VIRT_COMPAT_2_9 \
     HW_COMPAT_2_9
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..ff27551 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -84,6 +84,7 @@ typedef struct {
     bool disallow_affinity_adjustment;
     bool no_its;
     bool no_pmu;
+    bool no_iommu;
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 12/16] hw/arm/virt: Add virtio-iommu to the virt board
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (10 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 11/16] hw/arm/virt: Add 2.11 machine type Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 13/16] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

The specific virtio-mmio node is inconditionally added on
machine init while the binding between this latter and the
PCIe host bridge is done on machine init done notifier, only
if -device virtio-iommu-device was added to the qemu command
line.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
---
 hw/arm/virt.c         | 92 +++++++++++++++++++++++++++++++++++++++++++++++----
 include/hw/arm/virt.h |  4 +++
 2 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9737a5f..f057d7a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -52,6 +52,7 @@
 #include "hw/arm/fdt.h"
 #include "hw/intc/arm_gic.h"
 #include "hw/intc/arm_gicv3_common.h"
+#include "hw/virtio/virtio-iommu.h"
 #include "kvm_arm.h"
 #include "hw/smbios/smbios.h"
 #include "qapi/visitor.h"
@@ -139,6 +140,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
+    [VIRT_IOMMU] =              { 0x09050000, 0x00000200 },
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -159,6 +161,7 @@ static const int a15irqmap[] = {
     [VIRT_SECURE_UART] = 8,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
+    [VIRT_IOMMU] = 74,
     [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
@@ -999,7 +1002,81 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
-static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
+static int bind_virtio_iommu_device(Object *obj, void *opaque)
+{
+    VirtMachineState *vms = (VirtMachineState *)opaque;
+    struct arm_boot_info *info = &vms->bootinfo;
+    int dtb_size;
+    void *fdt = info->get_dtb(info, &dtb_size);
+    Object *dev;
+
+    dev = object_dynamic_cast(obj, TYPE_VIRTIO_IOMMU);
+
+    if (!dev) {
+        /* Container, traverse it for children */
+        return object_child_foreach(obj, bind_virtio_iommu_device, opaque);
+    }
+
+    qemu_fdt_setprop_cells(fdt, vms->pcie_host_nodename, "iommu-map",
+                           0x0, vms->iommu_phandle, 0x0, 0x10000);
+
+    return true;
+}
+
+static
+void virtio_iommu_notifier(Notifier *notifier, void *data)
+{
+    VirtMachineState *vms = container_of(notifier, VirtMachineState,
+                                         virtio_iommu_done);
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    Object *container;
+
+
+    if (vmc->no_iommu) {
+        return;
+    }
+
+    container = container_get(qdev_get_machine(), "/peripheral");
+    bind_virtio_iommu_device(container, vms);
+    container = container_get(qdev_get_machine(), "/peripheral-anon");
+    bind_virtio_iommu_device(container, vms);
+}
+
+static void create_virtio_iommu(VirtMachineState *vms, qemu_irq *pic)
+{
+    char *smmu;
+    const char compat[] = "virtio,mmio";
+    int irq =  vms->irqmap[VIRT_IOMMU];
+    hwaddr base = vms->memmap[VIRT_IOMMU].base;
+    hwaddr size = vms->memmap[VIRT_IOMMU].size;
+    VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+
+    if (vmc->no_iommu) {
+        return;
+    }
+
+    vms->iommu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
+    sysbus_create_simple("virtio-mmio", base, pic[irq]);
+
+    smmu = g_strdup_printf("/virtio_mmio@%" PRIx64, base);
+    qemu_fdt_add_subnode(vms->fdt, smmu);
+    qemu_fdt_setprop(vms->fdt, smmu, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vms->fdt, smmu, "reg", 2, base, 2, size);
+
+    qemu_fdt_setprop_cells(vms->fdt, smmu, "interrupts",
+            GIC_FDT_IRQ_TYPE_SPI, irq, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+
+    qemu_fdt_setprop(vms->fdt, smmu, "dma-coherent", NULL, 0);
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "#iommu-cells", 1);
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "phandle", vms->iommu_phandle);
+    g_free(smmu);
+
+    vms->virtio_iommu_done.notify = virtio_iommu_notifier;
+    qemu_add_machine_init_done_notifier(&vms->virtio_iommu_done);
+}
+
+static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = vms->memmap[VIRT_PCIE_MMIO].size;
@@ -1073,7 +1150,8 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
         }
     }
 
-    nodename = g_strdup_printf("/pcie@%" PRIx64, base);
+    vms->pcie_host_nodename = g_strdup_printf("/pcie@%" PRIx64, base);
+    nodename = vms->pcie_host_nodename;
     qemu_fdt_add_subnode(vms->fdt, nodename);
     qemu_fdt_setprop_string(vms->fdt, nodename,
                             "compatible", "pci-host-ecam-generic");
@@ -1112,7 +1190,6 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
 
-    g_free(nodename);
 }
 
 static void create_platform_bus(VirtMachineState *vms, qemu_irq *pic)
@@ -1457,16 +1534,16 @@ static void machvirt_init(MachineState *machine)
 
     create_rtc(vms, pic);
 
-    create_pcie(vms, pic);
-
-    create_gpio(vms, pic);
-
     /* Create mmio transports, so the user can create virtio backends
      * (which will be automatically plugged in to the transports). If
      * no backend is created the transport will just sit harmlessly idle.
      */
     create_virtio_devices(vms, pic);
 
+    create_pcie(vms, pic);
+
+    create_gpio(vms, pic);
+
     vms->fw_cfg = create_fw_cfg(vms, &address_space_memory);
     rom_set_fw(vms->fw_cfg);
 
@@ -1491,6 +1568,7 @@ static void machvirt_init(MachineState *machine)
      * Notifiers are executed in registration reverse order.
      */
     create_platform_bus(vms, pic);
+    create_virtio_iommu(vms, pic);
 }
 
 static bool virt_get_secure(Object *obj, Error **errp)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index ff27551..e00fa65 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -59,6 +59,7 @@ enum {
     VIRT_GIC_V2M,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
+    VIRT_IOMMU,
     VIRT_UART,
     VIRT_MMIO,
     VIRT_RTC,
@@ -91,6 +92,7 @@ typedef struct {
 typedef struct {
     MachineState parent;
     Notifier machine_done;
+    Notifier virtio_iommu_done;
     FWCfgState *fw_cfg;
     bool secure;
     bool highmem;
@@ -106,6 +108,8 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t iommu_phandle;
+    char *pcie_host_nodename;
     int psci_conduit;
 } VirtMachineState;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 13/16] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (11 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 14/16] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

This callback allows to inform the IOMMU memory region about
restrictions on the supported page sizes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1dcd312..0c91dca 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -204,6 +204,10 @@ typedef struct IOMMUMemoryRegionClass {
                                IOMMUAccessFlags flag);
     /* Returns minimum supported page size */
     uint64_t (*get_min_page_size)(IOMMUMemoryRegion *iommu);
+
+    /* Limits the supported page sizes to @pgsizes */
+    void (*set_page_size_mask)(IOMMUMemoryRegion *iommu, uint64_t pgsizes);
+
     /* Called when IOMMU Notifier flag changed */
     void (*notify_flag_changed)(IOMMUMemoryRegion *iommu,
                                 IOMMUNotifierFlag old_flags,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 14/16] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (12 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 13/16] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 15/16] virtio-iommu: Implement set_page_size_mask Eric Auger
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

We store the page_size_mask in the container and on
vfio_listener_region_add(), the information is conveyed
to the IOMMUMemoryRegion.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c              | 5 +++++
 include/hw/vfio/vfio-common.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7b2924c..468a259 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -480,6 +480,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     if (memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
         IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+        IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
 
         trace_vfio_listener_region_add_iommu(iova, end);
         /*
@@ -500,6 +501,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
                             IOMMU_NOTIFIER_ALL,
                             section->offset_within_region,
                             int128_get64(llend));
+        if (imrc->set_page_size_mask) {
+            imrc->set_page_size_mask(iommu_mr, container->page_size_mask);
+        }
         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
         memory_region_register_iommu_notifier(section->mr, &giommu->n);
@@ -1024,6 +1028,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
             /* Assume 4k IOVA page size */
             info.iova_pgsizes = 4096;
         }
+        container->page_size_mask = info.iova_pgsizes;
         vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
     } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU) ||
                ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f3a2ac9..221cc30 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -79,6 +79,7 @@ typedef struct VFIOContainer {
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     MemoryListener listener;
     MemoryListener prereg_listener;
+    uint64_t page_size_mask; /* page sizes supported for this container */
     unsigned iommu_type;
     int error;
     bool initialized;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 15/16] virtio-iommu: Implement set_page_size_mask
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (13 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 14/16] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 16/16] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

We implement the set_page_size_mask callback to allow the
virtio-iommu to be aware of any restrictions on the page size
mask due to an underlying HW IOMMU.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/virtio/trace-events   |  1 +
 hw/virtio/virtio-iommu.c | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 9ccfad1..2793604 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -48,3 +48,4 @@ virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc [0x%"PRI
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_fill_resv_property(uint32_t devid, uint8_t subtype, uint64_t addr, uint64_t size, uint32_t flags, size_t filled) "dev= %d, subtype=%d addr=0x%"PRIx64" size=0x%"PRIx64" flags=%d filled=0x%lx"
 virtio_iommu_fill_none_property(uint32_t devid) "devid=%d"
+virtio_iommu_set_page_size_mask(const char *iommu_mr, uint64_t mask) "mr=%s page_size_mask=0x%"PRIx64
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 281b0f8..1873b9a 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -714,6 +714,21 @@ unlock:
     return entry;
 }
 
+static void virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
+                                            uint64_t page_size_mask)
+{
+    IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
+    VirtIOIOMMU *s = sdev->viommu;
+
+    s->config.page_size_mask &= page_size_mask;
+    if (!s->config.page_size_mask) {
+        error_setg(&error_fatal,
+                   "No compatible page size between guest and host iommus");
+    }
+
+    trace_virtio_iommu_set_page_size_mask(mr->parent_obj.name, page_size_mask);
+}
+
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
@@ -881,6 +896,7 @@ static void virtio_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = virtio_iommu_translate;
+    imrc->set_page_size_mask = virtio_iommu_set_page_size_mask;
 }
 
 static const TypeInfo virtio_iommu_info = {
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Qemu-devel] [RFC v4 16/16] hw/vfio/common: Do not print error when viommu translates into an mmio region
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (14 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 15/16] virtio-iommu: Implement set_page_size_mask Eric Auger
@ 2017-09-19  7:46 ` Eric Auger
  2017-09-27 11:07 ` [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Tomasz Nowicki
  2017-10-11 14:56 ` Peter Maydell
  17 siblings, 0 replies; 32+ messages in thread
From: Eric Auger @ 2017-09-19  7:46 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, bharat.bhushan, peterx, linuc.decode

On ARM, the MSI doorbell is translated by the virtual IOMMU.
As such address_space_translate() returns the MSI controller
MMIO region and we get an "iommu map to non memory area"
message. Let's remove this latter.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/vfio/common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 468a259..fb6f397 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -326,8 +326,6 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
                                  iotlb->translated_addr,
                                  &xlat, &len, writable);
     if (!memory_region_is_ram(mr)) {
-        error_report("iommu map to non memory area %"HWADDR_PRIx"",
-                     xlat);
         return false;
     }
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate Eric Auger
@ 2017-09-22  6:52   ` Bharat Bhushan
  0 siblings, 0 replies; 32+ messages in thread
From: Bharat Bhushan @ 2017-09-22  6:52 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, peterx, linuc.decode



> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Tuesday, September 19, 2017 1:17 PM
> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
> peter.maydell@linaro.org; alex.williamson@redhat.com; mst@redhat.com;
> qemu-arm@nongnu.org; qemu-devel@nongnu.org; jean-
> philippe.brucker@arm.com
> Cc: will.deacon@arm.com; kevin.tian@intel.com; marc.zyngier@arm.com;
> christoffer.dall@linaro.org; drjones@redhat.com; wei@redhat.com;
> tn@semihalf.com; Bharat Bhushan <bharat.bhushan@nxp.com>;
> peterx@redhat.com; linuc.decode@gmail.com
> Subject: [RFC v4 09/16] virtio-iommu: Implement translate
> 
> This patch implements the translate callback
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/virtio/trace-events   |  1 +
>  hw/virtio/virtio-iommu.c | 39
> +++++++++++++++++++++++++++++++++++++--
>  2 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index
> da298c1..9010fbd 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -45,3 +45,4 @@ virtio_iommu_put_as(uint32_t asid) "Free asid=%d"
>  virtio_iommu_unmap_left_interval(uint64_t low, uint64_t high, uint64_t
> next_low, uint64_t next_high) "Unmap left [0x%"PRIx64",0x%"PRIx64"],
> new interval=[0x%"PRIx64",0x%"PRIx64"]"
>  virtio_iommu_unmap_right_interval(uint64_t low, uint64_t high, uint64_t
> next_low, uint64_t next_high) "Unmap right [0x%"PRIx64",0x%"PRIx64"],
> new interval=[0x%"PRIx64",0x%"PRIx64"]"
>  virtio_iommu_unmap_inc_interval(uint64_t low, uint64_t high) "Unmap inc
> [0x%"PRIx64",0x%"PRIx64"]"
> +virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr,
> uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index
> 6f1a7d1..db46a91 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -496,19 +496,54 @@ static IOMMUTLBEntry
> virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
>                                              IOMMUAccessFlags flag)  {
>      IOMMUDevice *sdev = container_of(mr, IOMMUDevice, iommu_mr);
> +    VirtIOIOMMU *s = sdev->viommu;
>      uint32_t sid;
> +    viommu_dev *dev;
> +    viommu_mapping *mapping;
> +    viommu_interval interval;
> +
> +    interval.low = addr;
> +    interval.high = addr + 1;
> 
>      IOMMUTLBEntry entry = {
>          .target_as = &address_space_memory,
>          .iova = addr,
>          .translated_addr = addr,
> -        .addr_mask = ~(hwaddr)0,
> -        .perm = IOMMU_NONE,
> +        .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
> +        .perm = flag,
>      };
> 
>      sid = virtio_iommu_get_sid(sdev);
> 
>      trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
> +    qemu_mutex_lock(&s->mutex);
> +
> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(sid));
> +    if (!dev) {
> +        /* device cannot be attached to another as */
> +        printf("%s sid=%d is not known!!\n", __func__, sid);
> +        goto unlock;
> +    }
> +
> +    mapping = g_tree_lookup(dev->as->mappings, (gpointer)(&interval));

Should check of !dev->as before accessing this.

Thanks
-Bharat 

> +    if (!mapping) {
> +        printf("%s no mapping for 0x%"PRIx64" for sid=%d\n", __func__,
> +               addr, sid);
> +        goto unlock;
> +    }
> +
> +    if (((flag & IOMMU_RO) && !(mapping->flags &
> VIRTIO_IOMMU_MAP_F_READ)) ||
> +        ((flag & IOMMU_WO) && !(mapping->flags &
> VIRTIO_IOMMU_MAP_F_WRITE))) {
> +        error_report("Permission error on 0x%"PRIx64"(%d): allowed=%d",
> +                     addr, flag, mapping->flags);
> +        entry.perm = IOMMU_NONE;
> +        goto unlock;
> +    }
> +    entry.translated_addr = addr - mapping->virt_addr + mapping-
> >phys_addr,
> +    trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
> +
> +unlock:
> +    qemu_mutex_unlock(&s->mutex);
>      return entry;
>  }
> 
> --
> 2.5.5

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices Eric Auger
@ 2017-09-22  7:29   ` Bharat Bhushan
  0 siblings, 0 replies; 32+ messages in thread
From: Bharat Bhushan @ 2017-09-22  7:29 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, tn, peterx, linuc.decode



> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Tuesday, September 19, 2017 1:17 PM
> To: eric.auger.pro@gmail.com; eric.auger@redhat.com;
> peter.maydell@linaro.org; alex.williamson@redhat.com; mst@redhat.com;
> qemu-arm@nongnu.org; qemu-devel@nongnu.org; jean-
> philippe.brucker@arm.com
> Cc: will.deacon@arm.com; kevin.tian@intel.com; marc.zyngier@arm.com;
> christoffer.dall@linaro.org; drjones@redhat.com; wei@redhat.com;
> tn@semihalf.com; Bharat Bhushan <bharat.bhushan@nxp.com>;
> peterx@redhat.com; linuc.decode@gmail.com
> Subject: [RFC v4 06/16] virtio-iommu: Register attached devices
> 
> This patch introduce address space and device internal
> datatypes. Both are stored in RB trees. The address space
> owns a list of devices attached to it.
> 
> It is assumed the devid corresponds to the PCI BDF.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v3 -> v4:
> - new separate patch
> ---
>  hw/virtio/trace-events   |   4 ++
>  hw/virtio/virtio-iommu.c | 120
> +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 124 insertions(+)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index bc65356..74b92d3 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -38,3 +38,7 @@ virtio_iommu_map(uint32_t as, uint64_t phys_addr,
> uint64_t virt_addr, uint64_t s
>  virtio_iommu_unmap(uint32_t as, uint64_t virt_addr, uint64_t size) "as= %d
> virt_addr=0x%"PRIx64" size=0x%"PRIx64
>  virtio_iommu_translate(const char *name, uint32_t rid, uint64_t iova, int
> flag) "mr=%s rid=%d addr=0x%"PRIx64" flag=%d"
>  virtio_iommu_init_iommu_mr(char *iommu_mr) "init %s"
> +virtio_iommu_get_dev(uint32_t devid) "Alloc devid=%d"
> +virtio_iommu_put_dev(uint32_t devid) "Free devid=%d"
> +virtio_iommu_get_as(uint32_t asid) "Alloc asid=%d"
> +virtio_iommu_put_as(uint32_t asid) "Free asid=%d"
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index f4cb76f..41a4bbc 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -32,15 +32,116 @@
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
>  #include "hw/virtio/virtio-iommu.h"
> +#include "hw/pci/pci_bus.h"
> +#include "hw/pci/pci.h"
> 
>  /* Max size */
>  #define VIOMMU_DEFAULT_QUEUE_SIZE 256
> 
> +typedef struct viommu_as {
> +    uint32_t id;
> +    GTree *mappings;
> +    QLIST_HEAD(, viommu_dev) device_list;
> +} viommu_as;
> +
> +typedef struct viommu_dev {
> +    uint32_t id;
> +    viommu_as *as;
> +    QLIST_ENTRY(viommu_dev) next;
> +    VirtIOIOMMU *viommu;
> +} viommu_dev;
> +
> +typedef struct viommu_interval {
> +    uint64_t low;
> +    uint64_t high;
> +} viommu_interval;
> +
>  static inline uint16_t virtio_iommu_get_sid(IOMMUDevice *dev)
>  {
>      return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
>  }
> 
> +static gint interval_cmp(gconstpointer a, gconstpointer b, gpointer
> user_data)
> +{
> +    viommu_interval *inta = (viommu_interval *)a;
> +    viommu_interval *intb = (viommu_interval *)b;
> +
> +    if (inta->high <= intb->low) {
> +        return -1;
> +    } else if (intb->high <= inta->low) {
> +        return 1;
> +    } else {
> +        return 0;
> +    }
> +}
> +
> +static void virtio_iommu_detach_dev_from_as(viommu_dev *dev)
> +{
> +    QLIST_REMOVE(dev, next);
> +    dev->as = NULL;
> +}
> +
> +static viommu_dev *virtio_iommu_get_dev(VirtIOIOMMU *s, uint32_t
> devid)
> +{
> +    viommu_dev *dev;
> +
> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
> +    if (dev) {
> +        return dev;
> +    }
> +    dev = g_malloc0(sizeof(*dev));
> +    dev->id = devid;
> +    dev->viommu = s;
> +    trace_virtio_iommu_get_dev(devid);
> +    g_tree_insert(s->devices, GUINT_TO_POINTER(devid), dev);
> +    return dev;
> +}
> +
> +static void virtio_iommu_put_dev(gpointer data)
> +{
> +    viommu_dev *dev = (viommu_dev *)data;
> +
> +    if (dev->as) {
> +        virtio_iommu_detach_dev_from_as(dev);
> +        g_tree_unref(dev->as->mappings);
> +    }
> +
> +    trace_virtio_iommu_put_dev(dev->id);
> +    g_free(dev);
> +}
> +
> +viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid);
> +viommu_as *virtio_iommu_get_as(VirtIOIOMMU *s, uint32_t asid)
> +{
> +    viommu_as *as;
> +
> +    as = g_tree_lookup(s->address_spaces, GUINT_TO_POINTER(asid));
> +    if (as) {
> +        return as;
> +    }
> +    as = g_malloc0(sizeof(*as));
> +    as->id = asid;

I see now you maintain device-list in address-space in your base patches, but I do not see where it is initialized?
Should not that be initialized here (vfio integration patches initialized here).

Thanks
-Bharat

> +    as->mappings = g_tree_new_full((GCompareDataFunc)interval_cmp,
> +                                   NULL, (GDestroyNotify)g_free,
> +                                   (GDestroyNotify)g_free);
> +    g_tree_insert(s->address_spaces, GUINT_TO_POINTER(asid), as);
> +    trace_virtio_iommu_get_as(asid);
> +    return as;
> +}
> +
> +static void virtio_iommu_put_as(gpointer data)
> +{
> +    viommu_as *as = (viommu_as *)data;
> +    viommu_dev *iter, *tmp;
> +
> +    QLIST_FOREACH_SAFE(iter, &as->device_list, next, tmp) {
> +        virtio_iommu_detach_dev_from_as(iter);
> +    }
> +    g_tree_destroy(as->mappings);
> +    trace_virtio_iommu_put_as(as->id);
> +    g_free(as);
> +}
> +
>  static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void
> *opaque,
>                                                int devfn)
>  {
> @@ -70,6 +171,8 @@ static AddressSpace
> *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>          sdev->bus = bus;
>          sdev->devfn = devfn;
> 
> +        virtio_iommu_get_dev(s, PCI_BUILD_BDF(pci_bus_num(bus), devfn));
> +
>          trace_virtio_iommu_init_iommu_mr(name);
> 
>          memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev-
> >iommu_mr),
> @@ -360,6 +463,12 @@ static inline guint as_uint64_hash(gconstpointer v)
>      return (guint)*(const uint64_t *)v;
>  }
> 
> +static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
> +{
> +    uint ua = GPOINTER_TO_UINT(a);
> +    uint ub = GPOINTER_TO_UINT(b);
> +    return (ua > ub) - (ua < ub);
> +}
> 
>  static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>  {
> @@ -375,17 +484,28 @@ static void
> virtio_iommu_device_realize(DeviceState *dev, Error **errp)
>      s->config.page_size_mask = TARGET_PAGE_MASK;
>      s->config.input_range.end = -1UL;
> 
> +    qemu_mutex_init(&s->mutex);
> +
>      memset(s->as_by_bus_num, 0, sizeof(s->as_by_bus_num));
>      s->as_by_busptr = g_hash_table_new_full(as_uint64_hash,
>                                              as_uint64_equal,
>                                              g_free, g_free);
> 
> +    s->address_spaces = g_tree_new_full((GCompareDataFunc)int_cmp,
> +                                         NULL, NULL, virtio_iommu_put_as);
> +    s->devices = g_tree_new_full((GCompareDataFunc)int_cmp,
> +                                 NULL, NULL, virtio_iommu_put_dev);
> +
>      virtio_iommu_init_as(s);
>  }
> 
>  static void virtio_iommu_device_unrealize(DeviceState *dev, Error **errp)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
> +
> +    g_tree_destroy(s->address_spaces);
> +    g_tree_destroy(s->devices);
> 
>      virtio_cleanup(vdev);
>  }
> --
> 2.5.5

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request Eric Auger
@ 2017-09-27 10:53   ` Tomasz Nowicki
  2017-09-27 11:00     ` Bharat Bhushan
  2017-09-27 15:40     ` Auger Eric
  0 siblings, 2 replies; 32+ messages in thread
From: Tomasz Nowicki @ 2017-09-27 10:53 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, bharat.bhushan, peterx, linuc.decode

Hi Eric,

On 19.09.2017 09:46, Eric Auger wrote:
> This patch implements the PROBE request. At the moment,
> no reserved regions are returned.
> 
> At the moment reserved regions are stored per device.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 

[...]

> +
> +static int virtio_iommu_fill_property(int devid, int type,
> +                                      viommu_property_buffer *bufstate)
> +{
> +    int ret = -ENOSPC;
> +
> +    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
> +        bufstate->error = true;
> +        goto out;
> +    }
> +
> +    switch (type) {
> +    case VIRTIO_IOMMU_PROBE_T_NONE:
> +        ret = virtio_iommu_fill_none_prop(bufstate);
> +        break;
> +    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
> +    {
> +        viommu_dev *dev = bufstate->dev;
> +
> +        g_tree_foreach(dev->reserved_regions,
> +                       virtio_iommu_fill_resv_mem_prop,
> +                       bufstate);
> +        if (!bufstate->error) {
> +            ret = 0;
> +        }
> +        break;
> +    }
> +    default:
> +        ret = -ENOENT;
> +        break;
> +    }
> +out:
> +    if (ret) {
> +        error_report("%s property of type=%d could not be filled (%d),"
> +                     " remaining size = 0x%lx",
> +                     __func__, type, ret, bufstate->filled);
> +    }
> +    return ret;
> +}
> +
> +static int virtio_iommu_probe(VirtIOIOMMU *s,
> +                              struct virtio_iommu_req_probe *req,
> +                              uint8_t *buf)
> +{
> +    uint32_t devid = le32_to_cpu(req->device);
> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> +    viommu_property_buffer bufstate;
> +    viommu_dev *dev;
> +    int ret;
> +
> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
> +    if (!dev) {
> +        return -EINVAL;
> +    }
> +
> +    bufstate.start = buf;
> +    bufstate.filled = 0;
> +    bufstate.dev = dev;

bufstate.error is not initialized which may cause false alarm in 
virtio_iommu_fill_property()

> +
> +    while ((type = ctz32(prop_types)) != 32) {
> +        ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate);
> +        if (ret) {
> +            break;
> +        }
> +        prop_types &= ~(1 << type);
> +    }
> +    virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE, &bufstate);
> +
> +    return VIRTIO_IOMMU_S_OK;
> +}
> +
>   #define get_payload_size(req) (\
>   sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
>   
> @@ -433,6 +567,24 @@ static int virtio_iommu_handle_unmap(VirtIOIOMMU *s,
>       return virtio_iommu_unmap(s, &req);
>   }

Thanks,
Tomasz

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
  2017-09-27 10:53   ` Tomasz Nowicki
@ 2017-09-27 11:00     ` Bharat Bhushan
  2017-09-27 15:44       ` Auger Eric
  2017-09-27 15:40     ` Auger Eric
  1 sibling, 1 reply; 32+ messages in thread
From: Bharat Bhushan @ 2017-09-27 11:00 UTC (permalink / raw)
  To: Tomasz Nowicki, Eric Auger, eric.auger.pro, peter.maydell,
	alex.williamson, mst, qemu-arm, qemu-devel,
	jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, peterx, linuc.decode



> -----Original Message-----
> From: Tomasz Nowicki [mailto:tnowicki@caviumnetworks.com]
> Sent: Wednesday, September 27, 2017 4:23 PM
> To: Eric Auger <eric.auger@redhat.com>; eric.auger.pro@gmail.com;
> peter.maydell@linaro.org; alex.williamson@redhat.com; mst@redhat.com;
> qemu-arm@nongnu.org; qemu-devel@nongnu.org; jean-
> philippe.brucker@arm.com
> Cc: will.deacon@arm.com; kevin.tian@intel.com; marc.zyngier@arm.com;
> christoffer.dall@linaro.org; drjones@redhat.com; wei@redhat.com; Bharat
> Bhushan <bharat.bhushan@nxp.com>; peterx@redhat.com;
> linuc.decode@gmail.com
> Subject: Re: [RFC v4 10/16] virtio-iommu: Implement probe request
> 
> Hi Eric,
> 
> On 19.09.2017 09:46, Eric Auger wrote:
> > This patch implements the PROBE request. At the moment, no reserved
> > regions are returned.
> >
> > At the moment reserved regions are stored per device.
> >
> > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >
> > ---
> >
> 
> [...]
> 
> > +
> > +static int virtio_iommu_fill_property(int devid, int type,
> > +                                      viommu_property_buffer
> > +*bufstate) {
> > +    int ret = -ENOSPC;
> > +
> > +    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
> > +        bufstate->error = true;
> > +        goto out;
> > +    }
> > +
> > +    switch (type) {
> > +    case VIRTIO_IOMMU_PROBE_T_NONE:
> > +        ret = virtio_iommu_fill_none_prop(bufstate);
> > +        break;
> > +    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
> > +    {
> > +        viommu_dev *dev = bufstate->dev;
> > +
> > +        g_tree_foreach(dev->reserved_regions,
> > +                       virtio_iommu_fill_resv_mem_prop,
> > +                       bufstate);
> > +        if (!bufstate->error) {
> > +            ret = 0;
> > +        }
> > +        break;
> > +    }
> > +    default:
> > +        ret = -ENOENT;
> > +        break;
> > +    }
> > +out:
> > +    if (ret) {
> > +        error_report("%s property of type=%d could not be filled (%d),"
> > +                     " remaining size = 0x%lx",
> > +                     __func__, type, ret, bufstate->filled);
> > +    }
> > +    return ret;
> > +}
> > +
> > +static int virtio_iommu_probe(VirtIOIOMMU *s,
> > +                              struct virtio_iommu_req_probe *req,
> > +                              uint8_t *buf) {
> > +    uint32_t devid = le32_to_cpu(req->device);
> > +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
> > +    viommu_property_buffer bufstate;
> > +    viommu_dev *dev;
> > +    int ret;
> > +
> > +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
> > +    if (!dev) {
> > +        return -EINVAL;
> > +    }
> > +
> > +    bufstate.start = buf;
> > +    bufstate.filled = 0;
> > +    bufstate.dev = dev;
> 
> bufstate.error is not initialized which may cause false alarm in
> virtio_iommu_fill_property()

I observed below prints 
	 "qemu-system-aarch64: virtio_iommu_fill_property property of type=2 could not be filled (-28), remaining size = 0x0 "

When I initialized the bufstate.error = 0,it goes.

Thanks
-Bharat

> 
> > +
> > +    while ((type = ctz32(prop_types)) != 32) {
> > +        ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate);
> > +        if (ret) {
> > +            break;
> > +        }
> > +        prop_types &= ~(1 << type);
> > +    }
> > +    virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE,
> > + &bufstate);
> > +
> > +    return VIRTIO_IOMMU_S_OK;
> > +}
> > +
> >   #define get_payload_size(req) (\
> >   sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
> >
> > @@ -433,6 +567,24 @@ static int
> virtio_iommu_handle_unmap(VirtIOIOMMU *s,
> >       return virtio_iommu_unmap(s, &req);
> >   }
> 
> Thanks,
> Tomasz

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (15 preceding siblings ...)
  2017-09-19  7:46 ` [Qemu-devel] [RFC v4 16/16] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
@ 2017-09-27 11:07 ` Tomasz Nowicki
  2017-09-27 15:38   ` Auger Eric
  2017-10-11 14:56 ` Peter Maydell
  17 siblings, 1 reply; 32+ messages in thread
From: Tomasz Nowicki @ 2017-09-27 11:07 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, peter.maydell, alex.williamson, mst,
	qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: will.deacon, kevin.tian, marc.zyngier, christoffer.dall, drjones,
	wei, bharat.bhushan, peterx, linuc.decode

Hi Eric,

Out of curiosity, I compared your SMMUv3 full emulation against 
virtio-iommu. For virtio-net device behind the virtio-iommu I get 50% 
performance of what I can get for SMMUv3 emulation.

Do you have similar observations? Since there is no need to emulate HW 
behaviour in QEMU I expected virtio-iommu to be faster. I will run some 
more benchmarks.

Thanks,
Tomasz

On 19.09.2017 09:46, Eric Auger wrote:
> This series implements the virtio-iommu device.
> 
> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
> - probe request support although no reserved region is returned at
>    the moment
> - unmap semantics less strict, as specified in v0.4
> - device registration, attach/detach revisited
> - split into smaller patches to ease review
> - propose a way to inform the IOMMU mr about the page_size_mask
>    of underlying HW IOMMU, if any
> - remove warning associated with the translation of the MSI doorbell
> 
> The device gets instantiated using the "-device virtio-iommu-device"
> option. It currently works with ARM virt machine only, as the machine
> must handle the dt binding between the virtio-mmio "iommu" node and
> the PCI host bridge node.
> 
> The associated VHOST/VFIO adaptation is available on the branch
> below but is not officially delivered as part of this series.
> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4
> 
> References:
> [1] [RFC] virtio-iommu version 0.4
>      git://linux-arm.org/virtio-iommu.git branch viommu/v0.4
> 
> Testing:
> - guest kernel with v0.4 virtio-iommu driver (4kB page)
> - tested with guest using virtio-pci-net and vhost-net
>    (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
> - tested with VFIO and guest assigned with 2 VFs
> - tested with DPDK on guest with 2 assigned VFs
> 
> Not tested:
> - hot-plug/hot-unplug of EP: not implemented
> 
> History:
> v2-> v4:
> - see above
> 
> v2 -> v3:
> - rebase on top of 2.10-rc0 and especially
>    [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
> - add mutex init
> - fix as->mappings deletion using g_tree_ref/unref
> - when a dev is attached whereas it is already attached to
>    another address space, first detach it
> - fix some error values
> - page_sizes = TARGET_PAGE_MASK;
> - I haven't changed the unmap() semantics yet, waiting for the
>    next virtio-iommu spec revision.
> 
> v1 -> v2:
> - fix redifinition of viommu_as typedef
> 
> Eric Auger (16):
>    update-linux-headers: import virtio_iommu.h
>    linux-headers: Update for virtio-iommu
>    virtio-iommu: add skeleton
>    virtio-iommu: Decode the command payload
>    virtio-iommu: Add the iommu regions
>    virtio-iommu: Register attached devices
>    virtio-iommu: Implement attach/detach command
>    virtio-iommu: Implement map/unmap
>    virtio-iommu: Implement translate
>    virtio-iommu: Implement probe request
>    hw/arm/virt: Add 2.11 machine type
>    hw/arm/virt: Add virtio-iommu to the virt board
>    memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
>    hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
>    virtio-iommu: Implement set_page_size_mask
>    hw/vfio/common: Do not print error when viommu translates into an mmio
>      region
> 
>   hw/arm/virt.c                                 | 115 +++-
>   hw/vfio/common.c                              |   7 +-
>   hw/virtio/Makefile.objs                       |   1 +
>   hw/virtio/trace-events                        |  24 +
>   hw/virtio/virtio-iommu.c                      | 923 ++++++++++++++++++++++++++
>   include/exec/memory.h                         |   4 +
>   include/hw/arm/virt.h                         |   5 +
>   include/hw/vfio/vfio-common.h                 |   1 +
>   include/hw/virtio/virtio-iommu.h              |  61 ++
>   include/standard-headers/linux/virtio_ids.h   |   1 +
>   include/standard-headers/linux/virtio_iommu.h | 177 +++++
>   linux-headers/linux/virtio_iommu.h            |   1 +
>   scripts/update-linux-headers.sh               |   3 +
>   13 files changed, 1312 insertions(+), 11 deletions(-)
>   create mode 100644 hw/virtio/virtio-iommu.c
>   create mode 100644 include/hw/virtio/virtio-iommu.h
>   create mode 100644 include/standard-headers/linux/virtio_iommu.h
>   create mode 100644 linux-headers/linux/virtio_iommu.h
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-09-27 11:07 ` [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Tomasz Nowicki
@ 2017-09-27 15:38   ` Auger Eric
  0 siblings, 0 replies; 32+ messages in thread
From: Auger Eric @ 2017-09-27 15:38 UTC (permalink / raw)
  To: Tomasz Nowicki, eric.auger.pro, peter.maydell, alex.williamson,
	mst, qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: wei, kevin.tian, marc.zyngier, will.deacon, drjones, peterx,
	linuc.decode, bharat.bhushan, christoffer.dall

Hi Tomasz

On 27/09/2017 13:07, Tomasz Nowicki wrote:
> Hi Eric,
> 
> Out of curiosity, I compared your SMMUv3 full emulation against
> virtio-iommu. For virtio-net device behind the virtio-iommu I get 50%
> performance of what I can get for SMMUv3 emulation.
> 
> Do you have similar observations? Since there is no need to emulate HW
> behaviour in QEMU I expected virtio-iommu to be faster. I will run some
> more benchmarks.

No I don't have formal comparative figures at the moment. Benchmarking
is also on the top of my todo list though.

One reason could be that virtio-iommu sends requests for each map
operations whereas this does not happen for vsmmuv3 (in non caching
mode). You could try to turn the vsmmuv3 caching-mode option and I guess
you should have more similar perf figures between virtio-iommu and
vsmuv3.

Thanks

Eric
> 
> Thanks,
> Tomasz
> 
> On 19.09.2017 09:46, Eric Auger wrote:
>> This series implements the virtio-iommu device.
>>
>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>> - probe request support although no reserved region is returned at
>>    the moment
>> - unmap semantics less strict, as specified in v0.4
>> - device registration, attach/detach revisited
>> - split into smaller patches to ease review
>> - propose a way to inform the IOMMU mr about the page_size_mask
>>    of underlying HW IOMMU, if any
>> - remove warning associated with the translation of the MSI doorbell
>>
>> The device gets instantiated using the "-device virtio-iommu-device"
>> option. It currently works with ARM virt machine only, as the machine
>> must handle the dt binding between the virtio-mmio "iommu" node and
>> the PCI host bridge node.
>>
>> The associated VHOST/VFIO adaptation is available on the branch
>> below but is not officially delivered as part of this series.
>>
>> Best Regards
>>
>> Eric
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/v2.10.0-virtio-iommu-v4
>>
>> References:
>> [1] [RFC] virtio-iommu version 0.4
>>      git://linux-arm.org/virtio-iommu.git branch viommu/v0.4
>>
>> Testing:
>> - guest kernel with v0.4 virtio-iommu driver (4kB page)
>> - tested with guest using virtio-pci-net and vhost-net
>>    (,vhost=on/off,iommu_platform,disable-modern=off,disable-legacy=on  )
>> - tested with VFIO and guest assigned with 2 VFs
>> - tested with DPDK on guest with 2 assigned VFs
>>
>> Not tested:
>> - hot-plug/hot-unplug of EP: not implemented
>>
>> History:
>> v2-> v4:
>> - see above
>>
>> v2 -> v3:
>> - rebase on top of 2.10-rc0 and especially
>>    [PATCH qemu v9 0/2] memory/iommu: QOM'fy IOMMU MemoryRegion
>> - add mutex init
>> - fix as->mappings deletion using g_tree_ref/unref
>> - when a dev is attached whereas it is already attached to
>>    another address space, first detach it
>> - fix some error values
>> - page_sizes = TARGET_PAGE_MASK;
>> - I haven't changed the unmap() semantics yet, waiting for the
>>    next virtio-iommu spec revision.
>>
>> v1 -> v2:
>> - fix redifinition of viommu_as typedef
>>
>> Eric Auger (16):
>>    update-linux-headers: import virtio_iommu.h
>>    linux-headers: Update for virtio-iommu
>>    virtio-iommu: add skeleton
>>    virtio-iommu: Decode the command payload
>>    virtio-iommu: Add the iommu regions
>>    virtio-iommu: Register attached devices
>>    virtio-iommu: Implement attach/detach command
>>    virtio-iommu: Implement map/unmap
>>    virtio-iommu: Implement translate
>>    virtio-iommu: Implement probe request
>>    hw/arm/virt: Add 2.11 machine type
>>    hw/arm/virt: Add virtio-iommu to the virt board
>>    memory.h: Add set_page_size_mask IOMMUMemoryRegion callback
>>    hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes
>>    virtio-iommu: Implement set_page_size_mask
>>    hw/vfio/common: Do not print error when viommu translates into an mmio
>>      region
>>
>>   hw/arm/virt.c                                 | 115 +++-
>>   hw/vfio/common.c                              |   7 +-
>>   hw/virtio/Makefile.objs                       |   1 +
>>   hw/virtio/trace-events                        |  24 +
>>   hw/virtio/virtio-iommu.c                      | 923
>> ++++++++++++++++++++++++++
>>   include/exec/memory.h                         |   4 +
>>   include/hw/arm/virt.h                         |   5 +
>>   include/hw/vfio/vfio-common.h                 |   1 +
>>   include/hw/virtio/virtio-iommu.h              |  61 ++
>>   include/standard-headers/linux/virtio_ids.h   |   1 +
>>   include/standard-headers/linux/virtio_iommu.h | 177 +++++
>>   linux-headers/linux/virtio_iommu.h            |   1 +
>>   scripts/update-linux-headers.sh               |   3 +
>>   13 files changed, 1312 insertions(+), 11 deletions(-)
>>   create mode 100644 hw/virtio/virtio-iommu.c
>>   create mode 100644 include/hw/virtio/virtio-iommu.h
>>   create mode 100644 include/standard-headers/linux/virtio_iommu.h
>>   create mode 100644 linux-headers/linux/virtio_iommu.h
>>
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
  2017-09-27 10:53   ` Tomasz Nowicki
  2017-09-27 11:00     ` Bharat Bhushan
@ 2017-09-27 15:40     ` Auger Eric
  1 sibling, 0 replies; 32+ messages in thread
From: Auger Eric @ 2017-09-27 15:40 UTC (permalink / raw)
  To: Tomasz Nowicki, eric.auger.pro, peter.maydell, alex.williamson,
	mst, qemu-arm, qemu-devel, jean-philippe.brucker
  Cc: wei, kevin.tian, marc.zyngier, will.deacon, drjones, peterx,
	linuc.decode, bharat.bhushan, christoffer.dall

Hi Tomasz,
On 27/09/2017 12:53, Tomasz Nowicki wrote:
> Hi Eric,
> 
> On 19.09.2017 09:46, Eric Auger wrote:
>> This patch implements the PROBE request. At the moment,
>> no reserved regions are returned.
>>
>> At the moment reserved regions are stored per device.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
> 
> [...]
> 
>> +
>> +static int virtio_iommu_fill_property(int devid, int type,
>> +                                      viommu_property_buffer *bufstate)
>> +{
>> +    int ret = -ENOSPC;
>> +
>> +    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
>> +        bufstate->error = true;
>> +        goto out;
>> +    }
>> +
>> +    switch (type) {
>> +    case VIRTIO_IOMMU_PROBE_T_NONE:
>> +        ret = virtio_iommu_fill_none_prop(bufstate);
>> +        break;
>> +    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
>> +    {
>> +        viommu_dev *dev = bufstate->dev;
>> +
>> +        g_tree_foreach(dev->reserved_regions,
>> +                       virtio_iommu_fill_resv_mem_prop,
>> +                       bufstate);
>> +        if (!bufstate->error) {
>> +            ret = 0;
>> +        }
>> +        break;
>> +    }
>> +    default:
>> +        ret = -ENOENT;
>> +        break;
>> +    }
>> +out:
>> +    if (ret) {
>> +        error_report("%s property of type=%d could not be filled (%d),"
>> +                     " remaining size = 0x%lx",
>> +                     __func__, type, ret, bufstate->filled);
>> +    }
>> +    return ret;
>> +}
>> +
>> +static int virtio_iommu_probe(VirtIOIOMMU *s,
>> +                              struct virtio_iommu_req_probe *req,
>> +                              uint8_t *buf)
>> +{
>> +    uint32_t devid = le32_to_cpu(req->device);
>> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
>> +    viommu_property_buffer bufstate;
>> +    viommu_dev *dev;
>> +    int ret;
>> +
>> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
>> +    if (!dev) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    bufstate.start = buf;
>> +    bufstate.filled = 0;
>> +    bufstate.dev = dev;
> 
> bufstate.error is not initialized which may cause false alarm in
> virtio_iommu_fill_property()
thanks for spotting that. I owe you several fixes in both vsmmuv3 and
virtio-iommu. Thank you for testing, again!

Best Regards

Eric
> 
>> +
>> +    while ((type = ctz32(prop_types)) != 32) {
>> +        ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate);
>> +        if (ret) {
>> +            break;
>> +        }
>> +        prop_types &= ~(1 << type);
>> +    }
>> +    virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE,
>> &bufstate);
>> +
>> +    return VIRTIO_IOMMU_S_OK;
>> +}
>> +
>>   #define get_payload_size(req) (\
>>   sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
>>   @@ -433,6 +567,24 @@ static int
>> virtio_iommu_handle_unmap(VirtIOIOMMU *s,
>>       return virtio_iommu_unmap(s, &req);
>>   }
> 
> Thanks,
> Tomasz
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request
  2017-09-27 11:00     ` Bharat Bhushan
@ 2017-09-27 15:44       ` Auger Eric
  0 siblings, 0 replies; 32+ messages in thread
From: Auger Eric @ 2017-09-27 15:44 UTC (permalink / raw)
  To: Bharat Bhushan, Tomasz Nowicki, eric.auger.pro, peter.maydell,
	alex.williamson, mst, qemu-arm, qemu-devel,
	jean-philippe.brucker
  Cc: wei, kevin.tian, marc.zyngier, will.deacon, drjones, peterx,
	linuc.decode, christoffer.dall

Hi Bharat,

On 27/09/2017 13:00, Bharat Bhushan wrote:
> 
> 
>> -----Original Message-----
>> From: Tomasz Nowicki [mailto:tnowicki@caviumnetworks.com]
>> Sent: Wednesday, September 27, 2017 4:23 PM
>> To: Eric Auger <eric.auger@redhat.com>; eric.auger.pro@gmail.com;
>> peter.maydell@linaro.org; alex.williamson@redhat.com; mst@redhat.com;
>> qemu-arm@nongnu.org; qemu-devel@nongnu.org; jean-
>> philippe.brucker@arm.com
>> Cc: will.deacon@arm.com; kevin.tian@intel.com; marc.zyngier@arm.com;
>> christoffer.dall@linaro.org; drjones@redhat.com; wei@redhat.com; Bharat
>> Bhushan <bharat.bhushan@nxp.com>; peterx@redhat.com;
>> linuc.decode@gmail.com
>> Subject: Re: [RFC v4 10/16] virtio-iommu: Implement probe request
>>
>> Hi Eric,
>>
>> On 19.09.2017 09:46, Eric Auger wrote:
>>> This patch implements the PROBE request. At the moment, no reserved
>>> regions are returned.
>>>
>>> At the moment reserved regions are stored per device.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>>
>>
>> [...]
>>
>>> +
>>> +static int virtio_iommu_fill_property(int devid, int type,
>>> +                                      viommu_property_buffer
>>> +*bufstate) {
>>> +    int ret = -ENOSPC;
>>> +
>>> +    if (bufstate->filled + 4 >= VIOMMU_PROBE_SIZE) {
>>> +        bufstate->error = true;
>>> +        goto out;
>>> +    }
>>> +
>>> +    switch (type) {
>>> +    case VIRTIO_IOMMU_PROBE_T_NONE:
>>> +        ret = virtio_iommu_fill_none_prop(bufstate);
>>> +        break;
>>> +    case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
>>> +    {
>>> +        viommu_dev *dev = bufstate->dev;
>>> +
>>> +        g_tree_foreach(dev->reserved_regions,
>>> +                       virtio_iommu_fill_resv_mem_prop,
>>> +                       bufstate);
>>> +        if (!bufstate->error) {
>>> +            ret = 0;
>>> +        }
>>> +        break;
>>> +    }
>>> +    default:
>>> +        ret = -ENOENT;
>>> +        break;
>>> +    }
>>> +out:
>>> +    if (ret) {
>>> +        error_report("%s property of type=%d could not be filled (%d),"
>>> +                     " remaining size = 0x%lx",
>>> +                     __func__, type, ret, bufstate->filled);
>>> +    }
>>> +    return ret;
>>> +}
>>> +
>>> +static int virtio_iommu_probe(VirtIOIOMMU *s,
>>> +                              struct virtio_iommu_req_probe *req,
>>> +                              uint8_t *buf) {
>>> +    uint32_t devid = le32_to_cpu(req->device);
>>> +    int16_t prop_types = SUPPORTED_PROBE_PROPERTIES, type;
>>> +    viommu_property_buffer bufstate;
>>> +    viommu_dev *dev;
>>> +    int ret;
>>> +
>>> +    dev = g_tree_lookup(s->devices, GUINT_TO_POINTER(devid));
>>> +    if (!dev) {
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    bufstate.start = buf;
>>> +    bufstate.filled = 0;
>>> +    bufstate.dev = dev;
>>
>> bufstate.error is not initialized which may cause false alarm in
>> virtio_iommu_fill_property()
> 
> I observed below prints 
> 	 "qemu-system-aarch64: virtio_iommu_fill_property property of type=2 could not be filled (-28), remaining size = 0x0 "
> 
> When I initialized the bufstate.error = 0,it goes.

Sure, I will fix that soon.

Best Regards

Eric
> 
> Thanks
> -Bharat
> 
>>
>>> +
>>> +    while ((type = ctz32(prop_types)) != 32) {
>>> +        ret = virtio_iommu_fill_property(devid, 1 << type, &bufstate);
>>> +        if (ret) {
>>> +            break;
>>> +        }
>>> +        prop_types &= ~(1 << type);
>>> +    }
>>> +    virtio_iommu_fill_property(devid, VIRTIO_IOMMU_PROBE_T_NONE,
>>> + &bufstate);
>>> +
>>> +    return VIRTIO_IOMMU_S_OK;
>>> +}
>>> +
>>>   #define get_payload_size(req) (\
>>>   sizeof((req)) - sizeof(struct virtio_iommu_req_tail))
>>>
>>> @@ -433,6 +567,24 @@ static int
>> virtio_iommu_handle_unmap(VirtIOIOMMU *s,
>>>       return virtio_iommu_unmap(s, &req);
>>>   }
>>
>> Thanks,
>> Tomasz

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
                   ` (16 preceding siblings ...)
  2017-09-27 11:07 ` [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Tomasz Nowicki
@ 2017-10-11 14:56 ` Peter Maydell
  2017-10-11 16:08   ` Auger Eric
  17 siblings, 1 reply; 32+ messages in thread
From: Peter Maydell @ 2017-10-11 14:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, Alex Williamson, Michael S. Tsirkin, qemu-arm,
	QEMU Developers, jean-philippe.brucker, Will Deacon, Tian, Kevin,
	Marc Zyngier, Christoffer Dall, Andrew Jones, Wei Huang,
	Tomasz Nowicki, Bharat Bhushan, Peter Xu, linuc.decode

On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
> This series implements the virtio-iommu device.
>
> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
> - probe request support although no reserved region is returned at
>   the moment
> - unmap semantics less strict, as specified in v0.4
> - device registration, attach/detach revisited
> - split into smaller patches to ease review
> - propose a way to inform the IOMMU mr about the page_size_mask
>   of underlying HW IOMMU, if any
> - remove warning associated with the translation of the MSI doorbell
>
> The device gets instantiated using the "-device virtio-iommu-device"
> option. It currently works with ARM virt machine only, as the machine
> must handle the dt binding between the virtio-mmio "iommu" node and
> the PCI host bridge node.

Could this work on x86, or is it inherently arm-only?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-11 14:56 ` Peter Maydell
@ 2017-10-11 16:08   ` Auger Eric
  2017-10-12  9:54     ` Peter Maydell
  0 siblings, 1 reply; 32+ messages in thread
From: Auger Eric @ 2017-10-11 16:08 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Wei Huang, Tian, Kevin, Andrew Jones, Michael S. Tsirkin,
	jean-philippe.brucker, Tomasz Nowicki, Will Deacon,
	QEMU Developers, Peter Xu, Marc Zyngier, Alex Williamson,
	qemu-arm, linuc.decode, Bharat Bhushan, Christoffer Dall,
	eric.auger.pro

Hi Peter,

On 11/10/2017 16:56, Peter Maydell wrote:
> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>> This series implements the virtio-iommu device.
>>
>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>> - probe request support although no reserved region is returned at
>>   the moment
>> - unmap semantics less strict, as specified in v0.4
>> - device registration, attach/detach revisited
>> - split into smaller patches to ease review
>> - propose a way to inform the IOMMU mr about the page_size_mask
>>   of underlying HW IOMMU, if any
>> - remove warning associated with the translation of the MSI doorbell
>>
>> The device gets instantiated using the "-device virtio-iommu-device"
>> option. It currently works with ARM virt machine only, as the machine
>> must handle the dt binding between the virtio-mmio "iommu" node and
>> the PCI host bridge node.
> 
> Could this work on x86, or is it inherently arm-only?

Yes this is the goal. At the moment the ACPI probing is not yet properly
specified but a Q35 prototype was developed in the Red Hat Virt team.
This will be presented at the KVM forum.

Thanks

Eric
> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-11 16:08   ` Auger Eric
@ 2017-10-12  9:54     ` Peter Maydell
  2017-10-12 10:09       ` Auger Eric
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Maydell @ 2017-10-12  9:54 UTC (permalink / raw)
  To: Auger Eric
  Cc: Wei Huang, Tian, Kevin, Andrew Jones, Michael S. Tsirkin,
	jean-philippe.brucker, Tomasz Nowicki, Will Deacon,
	QEMU Developers, Peter Xu, Marc Zyngier, Alex Williamson,
	qemu-arm, linuc.decode, Bharat Bhushan, Christoffer Dall,
	eric.auger.pro

On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
> Hi Peter,
>
> On 11/10/2017 16:56, Peter Maydell wrote:
>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>> This series implements the virtio-iommu device.
>>>
>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>> - probe request support although no reserved region is returned at
>>>   the moment
>>> - unmap semantics less strict, as specified in v0.4
>>> - device registration, attach/detach revisited
>>> - split into smaller patches to ease review
>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>   of underlying HW IOMMU, if any
>>> - remove warning associated with the translation of the MSI doorbell
>>>
>>> The device gets instantiated using the "-device virtio-iommu-device"
>>> option. It currently works with ARM virt machine only, as the machine
>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>> the PCI host bridge node.
>>
>> Could this work on x86, or is it inherently arm-only?
>
> Yes this is the goal. At the moment the ACPI probing is not yet properly
> specified but a Q35 prototype was developed in the Red Hat Virt team.
> This will be presented at the KVM forum.

Since I have very little familiarity with virtio or iommu code,
I'd be much happier if this was reviewed as a generic virtio-iommu
by the x86/virtio devs and then the arm specific parts done second...

I'm also not clear on what we're expecting the recommended or normal
way to do device passthrough is going to be -- this virtio-mmio,
or presenting the guest with an SMMUv3 interface? Do we really
need to implement both ?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-12  9:54     ` Peter Maydell
@ 2017-10-12 10:09       ` Auger Eric
  2017-10-12 10:46         ` Jean-Philippe Brucker
  2017-10-13  7:01         ` Tian, Kevin
  0 siblings, 2 replies; 32+ messages in thread
From: Auger Eric @ 2017-10-12 10:09 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Wei Huang, Tian, Kevin, Andrew Jones, Michael S. Tsirkin,
	jean-philippe.brucker, Tomasz Nowicki, Will Deacon,
	QEMU Developers, Peter Xu, Marc Zyngier, Alex Williamson,
	qemu-arm, linuc.decode, Bharat Bhushan, Christoffer Dall,
	eric.auger.pro

Hi Peter,

On 12/10/2017 11:54, Peter Maydell wrote:
> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>> Hi Peter,
>>
>> On 11/10/2017 16:56, Peter Maydell wrote:
>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>> This series implements the virtio-iommu device.
>>>>
>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>> - probe request support although no reserved region is returned at
>>>>   the moment
>>>> - unmap semantics less strict, as specified in v0.4
>>>> - device registration, attach/detach revisited
>>>> - split into smaller patches to ease review
>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>   of underlying HW IOMMU, if any
>>>> - remove warning associated with the translation of the MSI doorbell
>>>>
>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>> option. It currently works with ARM virt machine only, as the machine
>>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>>> the PCI host bridge node.
>>>
>>> Could this work on x86, or is it inherently arm-only?
>>
>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>> This will be presented at the KVM forum.
> 
> Since I have very little familiarity with virtio or iommu code,
> I'd be much happier if this was reviewed as a generic virtio-iommu
> by the x86/virtio devs and then the arm specific parts done second...

Understood. I was rather expecting you to review the smmuv3 emulation
code which you did, in a comprehensive manner ;-), and many thanks for that.

Note sure this is time yet to get this RFC reviewed as
- the v0.4 virtio-iommu driver it relies on was not officially submitted,
- the virtio-iommu specification review has not really been reviewed,
- the ACPI probing method has not been discussed yet.

Jean-Philippe, please correct me if I am wrong.

So to me, this is pure RFC at the moment.
> 
> I'm also not clear on what we're expecting the recommended or normal
> way to do device passthrough is going to be -- this virtio-mmio,
> or presenting the guest with an SMMUv3 interface? Do we really
> need to implement both ?

I think the KVM forum is the right place to sync as both approaches will
be presented and some pros/cons + performance figures will be given.

As we talk about choosing, there is one alternative that was suggested
on the ML by Alex & Michael but never really get considered yet and
maybe should be: using intel iommu emulation code for ARM. I aknowledge
this deserves a thorough impact study on kernel and FW side but I would
be happy to get your opinion about the QEMU side. Would you have a by-
principle rejection of this idea to instantiate such an Intel device in
mach virt or would it be something you would be ready to consider?

Thanks

Eric


> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-12 10:09       ` Auger Eric
@ 2017-10-12 10:46         ` Jean-Philippe Brucker
  2017-10-13  7:01         ` Tian, Kevin
  1 sibling, 0 replies; 32+ messages in thread
From: Jean-Philippe Brucker @ 2017-10-12 10:46 UTC (permalink / raw)
  To: Auger Eric, Peter Maydell
  Cc: Wei Huang, Tian, Kevin, Andrew Jones, Michael S. Tsirkin,
	Tomasz Nowicki, Will Deacon, QEMU Developers, Peter Xu,
	Marc Zyngier, Alex Williamson, qemu-arm, linuc.decode,
	Bharat Bhushan, Christoffer Dall, eric.auger.pro

On 12/10/17 11:09, Auger Eric wrote:
> Hi Peter,
> 
> On 12/10/2017 11:54, Peter Maydell wrote:
>> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>>> Hi Peter,
>>>
>>> On 11/10/2017 16:56, Peter Maydell wrote:
>>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com> wrote:
>>>>> This series implements the virtio-iommu device.
>>>>>
>>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>>> - probe request support although no reserved region is returned at
>>>>>   the moment
>>>>> - unmap semantics less strict, as specified in v0.4
>>>>> - device registration, attach/detach revisited
>>>>> - split into smaller patches to ease review
>>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>>   of underlying HW IOMMU, if any
>>>>> - remove warning associated with the translation of the MSI doorbell
>>>>>
>>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>>> option. It currently works with ARM virt machine only, as the machine
>>>>> must handle the dt binding between the virtio-mmio "iommu" node and
>>>>> the PCI host bridge node.
>>>>
>>>> Could this work on x86, or is it inherently arm-only?
>>>
>>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>>> This will be presented at the KVM forum.
>>
>> Since I have very little familiarity with virtio or iommu code,
>> I'd be much happier if this was reviewed as a generic virtio-iommu
>> by the x86/virtio devs and then the arm specific parts done second...
> 
> Understood. I was rather expecting you to review the smmuv3 emulation
> code which you did, in a comprehensive manner ;-), and many thanks for that.
> 
> Note sure this is time yet to get this RFC reviewed as
> - the v0.4 virtio-iommu driver it relies on was not officially submitted,
> - the virtio-iommu specification review has not really been reviewed,
> - the ACPI probing method has not been discussed yet.
> 
> Jean-Philippe, please correct me if I am wrong.
> 
> So to me, this is pure RFC at the moment.

Yes, having it as an RFC for the moment allowed to break compatibility
between versions of the specification. The downside is that it probably
doesn't get as many eyeballs as it would without an RFC tag (although I
did receive tonnes of helpful comments). Maybe v0.5, that should be
published shortly, can be a candidate for mainline.

Thanks,
Jean

>> I'm also not clear on what we're expecting the recommended or normal
>> way to do device passthrough is going to be -- this virtio-mmio,
>> or presenting the guest with an SMMUv3 interface? Do we really
>> need to implement both ?
> 
> I think the KVM forum is the right place to sync as both approaches will
> be presented and some pros/cons + performance figures will be given.
> 
> As we talk about choosing, there is one alternative that was suggested
> on the ML by Alex & Michael but never really get considered yet and
> maybe should be: using intel iommu emulation code for ARM. I aknowledge
> this deserves a thorough impact study on kernel and FW side but I would
> be happy to get your opinion about the QEMU side. Would you have a by-
> principle rejection of this idea to instantiate such an Intel device in
> mach virt or would it be something you would be ready to consider?
> 
> Thanks
> 
> Eric
> 
> 
>>
>> thanks
>> -- PMM
>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-12 10:09       ` Auger Eric
  2017-10-12 10:46         ` Jean-Philippe Brucker
@ 2017-10-13  7:01         ` Tian, Kevin
  2017-10-13  7:43           ` Auger Eric
  1 sibling, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2017-10-13  7:01 UTC (permalink / raw)
  To: Auger Eric, Peter Maydell
  Cc: Wei Huang, Andrew Jones, Michael S. Tsirkin,
	jean-philippe.brucker, Tomasz Nowicki, Will Deacon,
	QEMU Developers, Peter Xu, Marc Zyngier, Alex Williamson,
	qemu-arm, linuc.decode, Bharat Bhushan, Christoffer Dall,
	eric.auger.pro

> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: Thursday, October 12, 2017 6:10 PM
> 
> Hi Peter,
> 
> On 12/10/2017 11:54, Peter Maydell wrote:
> > On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
> >> Hi Peter,
> >>
> >> On 11/10/2017 16:56, Peter Maydell wrote:
> >>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com>
> wrote:
> >>>> This series implements the virtio-iommu device.
> >>>>
> >>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
> >>>> - probe request support although no reserved region is returned at
> >>>>   the moment
> >>>> - unmap semantics less strict, as specified in v0.4
> >>>> - device registration, attach/detach revisited
> >>>> - split into smaller patches to ease review
> >>>> - propose a way to inform the IOMMU mr about the page_size_mask
> >>>>   of underlying HW IOMMU, if any
> >>>> - remove warning associated with the translation of the MSI doorbell
> >>>>
> >>>> The device gets instantiated using the "-device virtio-iommu-device"
> >>>> option. It currently works with ARM virt machine only, as the machine
> >>>> must handle the dt binding between the virtio-mmio "iommu" node
> and
> >>>> the PCI host bridge node.
> >>>
> >>> Could this work on x86, or is it inherently arm-only?
> >>
> >> Yes this is the goal. At the moment the ACPI probing is not yet properly
> >> specified but a Q35 prototype was developed in the Red Hat Virt team.
> >> This will be presented at the KVM forum.
> >
> > Since I have very little familiarity with virtio or iommu code,
> > I'd be much happier if this was reviewed as a generic virtio-iommu
> > by the x86/virtio devs and then the arm specific parts done second...
> 
> Understood. I was rather expecting you to review the smmuv3 emulation
> code which you did, in a comprehensive manner ;-), and many thanks for
> that.
> 
> Note sure this is time yet to get this RFC reviewed as
> - the v0.4 virtio-iommu driver it relies on was not officially submitted,
> - the virtio-iommu specification review has not really been reviewed,
> - the ACPI probing method has not been discussed yet.
> 
> Jean-Philippe, please correct me if I am wrong.
> 
> So to me, this is pure RFC at the moment.
> >
> > I'm also not clear on what we're expecting the recommended or normal
> > way to do device passthrough is going to be -- this virtio-mmio,
> > or presenting the guest with an SMMUv3 interface? Do we really
> > need to implement both ?
> 
> I think the KVM forum is the right place to sync as both approaches will
> be presented and some pros/cons + performance figures will be given.
> 
> As we talk about choosing, there is one alternative that was suggested
> on the ML by Alex & Michael but never really get considered yet and
> maybe should be: using intel iommu emulation code for ARM. I
> aknowledge
> this deserves a thorough impact study on kernel and FW side but I would
> be happy to get your opinion about the QEMU side. Would you have a by-
> principle rejection of this idea to instantiate such an Intel device in
> mach virt or would it be something you would be ready to consider?
> 

bear posting a link to Alex/Michael's comment? interesting to know
the rationale...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device
  2017-10-13  7:01         ` Tian, Kevin
@ 2017-10-13  7:43           ` Auger Eric
  0 siblings, 0 replies; 32+ messages in thread
From: Auger Eric @ 2017-10-13  7:43 UTC (permalink / raw)
  To: Tian, Kevin, Peter Maydell
  Cc: Wei Huang, Andrew Jones, Michael S. Tsirkin,
	jean-philippe.brucker, Tomasz Nowicki, Will Deacon,
	QEMU Developers, Peter Xu, Marc Zyngier, Alex Williamson,
	qemu-arm, linuc.decode, Bharat Bhushan, Christoffer Dall,
	eric.auger.pro

Hi Kevin,

On 13/10/2017 09:01, Tian, Kevin wrote:
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: Thursday, October 12, 2017 6:10 PM
>>
>> Hi Peter,
>>
>> On 12/10/2017 11:54, Peter Maydell wrote:
>>> On 11 October 2017 at 17:08, Auger Eric <eric.auger@redhat.com> wrote:
>>>> Hi Peter,
>>>>
>>>> On 11/10/2017 16:56, Peter Maydell wrote:
>>>>> On 19 September 2017 at 08:46, Eric Auger <eric.auger@redhat.com>
>> wrote:
>>>>>> This series implements the virtio-iommu device.
>>>>>>
>>>>>> This v4 is an upgrade to v0.4 spec [1] and applies on QEMU v2.10.0.
>>>>>> - probe request support although no reserved region is returned at
>>>>>>   the moment
>>>>>> - unmap semantics less strict, as specified in v0.4
>>>>>> - device registration, attach/detach revisited
>>>>>> - split into smaller patches to ease review
>>>>>> - propose a way to inform the IOMMU mr about the page_size_mask
>>>>>>   of underlying HW IOMMU, if any
>>>>>> - remove warning associated with the translation of the MSI doorbell
>>>>>>
>>>>>> The device gets instantiated using the "-device virtio-iommu-device"
>>>>>> option. It currently works with ARM virt machine only, as the machine
>>>>>> must handle the dt binding between the virtio-mmio "iommu" node
>> and
>>>>>> the PCI host bridge node.
>>>>>
>>>>> Could this work on x86, or is it inherently arm-only?
>>>>
>>>> Yes this is the goal. At the moment the ACPI probing is not yet properly
>>>> specified but a Q35 prototype was developed in the Red Hat Virt team.
>>>> This will be presented at the KVM forum.
>>>
>>> Since I have very little familiarity with virtio or iommu code,
>>> I'd be much happier if this was reviewed as a generic virtio-iommu
>>> by the x86/virtio devs and then the arm specific parts done second...
>>
>> Understood. I was rather expecting you to review the smmuv3 emulation
>> code which you did, in a comprehensive manner ;-), and many thanks for
>> that.
>>
>> Note sure this is time yet to get this RFC reviewed as
>> - the v0.4 virtio-iommu driver it relies on was not officially submitted,
>> - the virtio-iommu specification review has not really been reviewed,
>> - the ACPI probing method has not been discussed yet.
>>
>> Jean-Philippe, please correct me if I am wrong.
>>
>> So to me, this is pure RFC at the moment.
>>>
>>> I'm also not clear on what we're expecting the recommended or normal
>>> way to do device passthrough is going to be -- this virtio-mmio,
>>> or presenting the guest with an SMMUv3 interface? Do we really
>>> need to implement both ?
>>
>> I think the KVM forum is the right place to sync as both approaches will
>> be presented and some pros/cons + performance figures will be given.
>>
>> As we talk about choosing, there is one alternative that was suggested
>> on the ML by Alex & Michael but never really get considered yet and
>> maybe should be: using intel iommu emulation code for ARM. I
>> aknowledge
>> this deserves a thorough impact study on kernel and FW side but I would
>> be happy to get your opinion about the QEMU side. Would you have a by-
>> principle rejection of this idea to instantiate such an Intel device in
>> mach virt or would it be something you would be ready to consider?
>>
> 
> bear posting a link to Alex/Michael's comment? interesting to know
> the rationale...

This was suggested here for instance:

https://lkml.org/lkml/2017/7/12/579

I think rationale was
- this is an emulated platform so there is more freedom
- vtd emulation code is rather stable
- it has pieces missing on smmu: cachine mode, IOTLB invalidation
command with addr_mask,
- intel iommu driver implements deferred IOTLB Invalidation which boosts
perf

But problems I foresee are:
- MSI handling. ARM MSI doorbells can be anywhere in the GPA address
space whereas Intel has MSIs within the APIC window: [FEE0_0000h –
FEF0_000h]. So I suspect the MSI handling may not work at kernel level.
- intel IOMMU input/output address ranges and page sizes may be
different from ARM ones.
- ARM uses ACPI IORT for binding RC <-> IOMMU <-> MSI controller whereas
Intel uses other tables
- Intel's dmar kernel code must be compiled/enabled on ARM

So personally I don't think this solution is viable but I prefer this
gets discussed on the ML.

Thanks

Eric
> 
> Thanks
> Kevin
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2017-10-13  7:44 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-19  7:46 [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 01/16] update-linux-headers: import virtio_iommu.h Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 02/16] linux-headers: Update for virtio-iommu Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 03/16] virtio-iommu: add skeleton Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 04/16] virtio-iommu: Decode the command payload Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 05/16] virtio-iommu: Add the iommu regions Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 06/16] virtio-iommu: Register attached devices Eric Auger
2017-09-22  7:29   ` Bharat Bhushan
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 07/16] virtio-iommu: Implement attach/detach command Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 08/16] virtio-iommu: Implement map/unmap Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 09/16] virtio-iommu: Implement translate Eric Auger
2017-09-22  6:52   ` Bharat Bhushan
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 10/16] virtio-iommu: Implement probe request Eric Auger
2017-09-27 10:53   ` Tomasz Nowicki
2017-09-27 11:00     ` Bharat Bhushan
2017-09-27 15:44       ` Auger Eric
2017-09-27 15:40     ` Auger Eric
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 11/16] hw/arm/virt: Add 2.11 machine type Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 12/16] hw/arm/virt: Add virtio-iommu to the virt board Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 13/16] memory.h: Add set_page_size_mask IOMMUMemoryRegion callback Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 14/16] hw/vfio/common: Set the IOMMUMemoryRegion supported page sizes Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 15/16] virtio-iommu: Implement set_page_size_mask Eric Auger
2017-09-19  7:46 ` [Qemu-devel] [RFC v4 16/16] hw/vfio/common: Do not print error when viommu translates into an mmio region Eric Auger
2017-09-27 11:07 ` [Qemu-devel] [RFC v4 00/16] VIRTIO-IOMMU device Tomasz Nowicki
2017-09-27 15:38   ` Auger Eric
2017-10-11 14:56 ` Peter Maydell
2017-10-11 16:08   ` Auger Eric
2017-10-12  9:54     ` Peter Maydell
2017-10-12 10:09       ` Auger Eric
2017-10-12 10:46         ` Jean-Philippe Brucker
2017-10-13  7:01         ` Tian, Kevin
2017-10-13  7:43           ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.