qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs
@ 2020-08-18 16:45 Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier Philippe Mathieu-Daudé
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Alex Williamson,
	Stefan Hajnoczi, Philippe Mathieu-Daudé

This series intends to setup the VFIO helper to allow
binding notifiers on different IRQs.

For the NVMe use case, we only care about MSIX interrupts.
To not disrupt other users, introduce the qemu_vfio_pci_init_msix_irqs
function to initialize multiple MSIX IRQs and attach eventfd to
them.

Since RFC v2:
- new patch to report vfio-helpers is not supported on AA64/POWER

(NVMe block driver series will follow).

Based-on: <20200812185014.18267-1-philmd@redhat.com>
"block/nvme: Various cleanups required to use multiple queues"
https://www.mail-archive.com/qemu-devel@nongnu.org/msg729395.html

Philippe Mathieu-Daudé (5):
  block/nvme: Use an array of EventNotifier
  util/vfio-helpers: Report error on unsupported host architectures
  util/vfio-helpers: Store eventfd using int32_t type
  util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs()
  block/nvme: Use qemu_vfio_pci_init_msix_irqs() to initialize our IRQ

 include/qemu/vfio-helpers.h |  2 +
 block/nvme.c                | 30 +++++++++-----
 util/vfio-helpers.c         | 83 +++++++++++++++++++++++++++++++++++--
 3 files changed, 101 insertions(+), 14 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier
  2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
@ 2020-08-18 16:45 ` Philippe Mathieu-Daudé
  2020-08-19  8:08   ` Stefan Hajnoczi
  2020-08-18 16:45 ` [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Alex Williamson,
	Stefan Hajnoczi, Philippe Mathieu-Daudé

In preparation of using multiple IRQ (thus multiple eventfds)
make BDRVNVMeState::irq_notifier an array (for now of a single
element, the admin queue notifier).

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index a61e86a83eb..cdd16d451e7 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -106,6 +106,9 @@ QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells) != 0x1000);
 #define INDEX_ADMIN     0
 #define INDEX_IO(n)     (1 + n)
 
+/* This driver shares a single MSIX IRQ for the admin and I/O queues */
+#define MSIX_IRQ_COUNT  1
+
 struct BDRVNVMeState {
     AioContext *aio_context;
     QEMUVFIOState *vfio;
@@ -120,7 +123,7 @@ struct BDRVNVMeState {
     /* How many uint32_t elements does each doorbell entry take. */
     size_t doorbell_scale;
     bool write_cache_supported;
-    EventNotifier irq_notifier;
+    EventNotifier irq_notifier[MSIX_IRQ_COUNT];
 
     uint64_t nsze; /* Namespace size reported by identify command */
     int nsid;      /* The namespace id to read/write data. */
@@ -631,7 +634,8 @@ static bool nvme_poll_queues(BDRVNVMeState *s)
 
 static void nvme_handle_event(EventNotifier *n)
 {
-    BDRVNVMeState *s = container_of(n, BDRVNVMeState, irq_notifier);
+    BDRVNVMeState *s = container_of(n, BDRVNVMeState,
+                                    irq_notifier[INDEX_ADMIN]);
 
     trace_nvme_handle_event(s);
     event_notifier_test_and_clear(n);
@@ -683,7 +687,8 @@ out_error:
 static bool nvme_poll_cb(void *opaque)
 {
     EventNotifier *e = opaque;
-    BDRVNVMeState *s = container_of(e, BDRVNVMeState, irq_notifier);
+    BDRVNVMeState *s = container_of(e, BDRVNVMeState,
+                                    irq_notifier[INDEX_ADMIN]);
 
     trace_nvme_poll_cb(s);
     return nvme_poll_queues(s);
@@ -705,7 +710,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
     s->device = g_strdup(device);
     s->nsid = namespace;
     s->aio_context = bdrv_get_aio_context(bs);
-    ret = event_notifier_init(&s->irq_notifier, 0);
+    ret = event_notifier_init(&s->irq_notifier[INDEX_ADMIN], 0);
     if (ret) {
         error_setg(errp, "Failed to init event notifier");
         return ret;
@@ -784,12 +789,13 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         }
     }
 
-    ret = qemu_vfio_pci_init_irq(s->vfio, &s->irq_notifier,
+    ret = qemu_vfio_pci_init_irq(s->vfio, s->irq_notifier,
                                  VFIO_PCI_MSIX_IRQ_INDEX, errp);
     if (ret) {
         goto out;
     }
-    aio_set_event_notifier(bdrv_get_aio_context(bs), &s->irq_notifier,
+    aio_set_event_notifier(bdrv_get_aio_context(bs),
+                           &s->irq_notifier[INDEX_ADMIN],
                            false, nvme_handle_event, nvme_poll_cb);
 
     nvme_identify(bs, namespace, &local_err);
@@ -872,9 +878,10 @@ static void nvme_close(BlockDriverState *bs)
         nvme_free_queue_pair(s->queues[i]);
     }
     g_free(s->queues);
-    aio_set_event_notifier(bdrv_get_aio_context(bs), &s->irq_notifier,
+    aio_set_event_notifier(bdrv_get_aio_context(bs),
+                           &s->irq_notifier[INDEX_ADMIN],
                            false, NULL, NULL);
-    event_notifier_cleanup(&s->irq_notifier);
+    event_notifier_cleanup(&s->irq_notifier[INDEX_ADMIN]);
     qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE);
     qemu_vfio_close(s->vfio);
 
@@ -1381,7 +1388,8 @@ static void nvme_detach_aio_context(BlockDriverState *bs)
         q->completion_bh = NULL;
     }
 
-    aio_set_event_notifier(bdrv_get_aio_context(bs), &s->irq_notifier,
+    aio_set_event_notifier(bdrv_get_aio_context(bs),
+                           &s->irq_notifier[INDEX_ADMIN],
                            false, NULL, NULL);
 }
 
@@ -1391,7 +1399,7 @@ static void nvme_attach_aio_context(BlockDriverState *bs,
     BDRVNVMeState *s = bs->opaque;
 
     s->aio_context = new_context;
-    aio_set_event_notifier(new_context, &s->irq_notifier,
+    aio_set_event_notifier(new_context, &s->irq_notifier[INDEX_ADMIN],
                            false, nvme_handle_event, nvme_poll_cb);
 
     for (int i = 0; i < s->nr_queues; i++) {
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures
  2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier Philippe Mathieu-Daudé
@ 2020-08-18 16:45 ` Philippe Mathieu-Daudé
  2020-08-18 17:12   ` Alex Williamson
  2020-08-18 16:45 ` [RFC PATCH v3 3/5] util/vfio-helpers: Store eventfd using int32_t type Philippe Mathieu-Daudé
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, Drew Jones, qemu-block, Laurent Vivier,
	Max Reitz, Eric Auger, Alex Williamson, Stefan Hajnoczi,
	Philippe Mathieu-Daudé,
	David Gibson

The vfio-helpers implementation expects a TYPEv1 IOMMU, see
qemu_vfio_init_pci:

  263     if (!ioctl(s->container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
  264         error_setg_errno(errp, errno, "VFIO IOMMU check failed");

Thus POWER SPAPR IOMMU is obviously not supported.

The implementation only cares about host page size alignment
(usually 4KB on X86), not the IOMMU one, which is be problematic
on Aarch64, when 64MB page size is used. So Aarch64 is not
supported neither.

Report an error when the host architecture is different than X86:

 $ qemu-system-aarch64 \
    -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 \
    -device virtio-blk-pci,drive=drive0
  qemu-system-aarch64: -drive file=nvme://0001:01:00.0/1,if=none,id=drive0: QEMU VFIO utility is not supported on this architecture

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Drew Jones <drjones@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
---
 util/vfio-helpers.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index e399e330e26..60017936e3e 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -420,14 +420,38 @@ static void qemu_vfio_open_common(QEMUVFIOState *s)
     qemu_ram_foreach_block(qemu_vfio_init_ramblock, s);
 }
 
+/**
+ * Return if the host architecture is supported.
+ *
+ * aarch64: IOMMU page alignment not respected
+ * ppc64:   SPAPR IOMMU window not configured
+ * x86-64:  Only architecture validated
+ * other:   Untested
+ */
+static bool qemu_vfio_arch_supported(void)
+{
+    bool supported = false;
+
+#if defined(HOST_X86_64)
+    supported = true;
+#endif
+
+    return supported;
+}
 /**
  * Open a PCI device, e.g. "0000:00:01.0".
  */
 QEMUVFIOState *qemu_vfio_open_pci(const char *device, Error **errp)
 {
     int r;
-    QEMUVFIOState *s = g_new0(QEMUVFIOState, 1);
+    QEMUVFIOState *s;
 
+    if (!qemu_vfio_arch_supported()) {
+        error_setg(errp,
+                   "QEMU VFIO utility is not supported on this architecture");
+        return NULL;
+    }
+    s = g_new0(QEMUVFIOState, 1);
     r = qemu_vfio_init_pci(s, device, errp);
     if (r) {
         g_free(s);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 3/5] util/vfio-helpers: Store eventfd using int32_t type
  2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures Philippe Mathieu-Daudé
@ 2020-08-18 16:45 ` Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs() Philippe Mathieu-Daudé
  2020-08-18 16:45 ` [RFC PATCH v3 5/5] block/nvme: Use qemu_vfio_pci_init_msix_irqs() to initialize our IRQ Philippe Mathieu-Daudé
  4 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Alex Williamson,
	Stefan Hajnoczi, Philippe Mathieu-Daudé

Per the documentation in linux-headers/linux/vfio.h:

 VFIO_DEVICE_SET_IRQS

 * DATA_EVENTFD binds the specified ACTION to the provided __s32 eventfd.

Replace the 'int' by an 'int32_t' to match the documentation.

Fixes: 418026ca43 ("util: Introduce vfio helpers")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 util/vfio-helpers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 60017936e3e..696f2d51712 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -193,7 +193,7 @@ int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
         return -EINVAL;
     }
 
-    irq_set_size = sizeof(*irq_set) + sizeof(int);
+    irq_set_size = sizeof(*irq_set) + sizeof(int32_t);
     irq_set = g_malloc0(irq_set_size);
 
     /* Get to a known IRQ state */
@@ -205,7 +205,7 @@ int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
         .count = 1,
     };
 
-    *(int *)&irq_set->data = event_notifier_get_fd(e);
+    *(int32_t *)&irq_set->data = event_notifier_get_fd(e);
     r = ioctl(s->device, VFIO_DEVICE_SET_IRQS, irq_set);
     g_free(irq_set);
     if (r) {
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs()
  2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
                   ` (2 preceding siblings ...)
  2020-08-18 16:45 ` [RFC PATCH v3 3/5] util/vfio-helpers: Store eventfd using int32_t type Philippe Mathieu-Daudé
@ 2020-08-18 16:45 ` Philippe Mathieu-Daudé
  2020-08-18 17:25   ` Alex Williamson
  2020-08-18 16:45 ` [RFC PATCH v3 5/5] block/nvme: Use qemu_vfio_pci_init_msix_irqs() to initialize our IRQ Philippe Mathieu-Daudé
  4 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Alex Williamson,
	Stefan Hajnoczi, Philippe Mathieu-Daudé

qemu_vfio_pci_init_irq() allows us to initialize any type of IRQ,
but only one. Introduce qemu_vfio_pci_init_msix_irqs() which is
specific to MSIX IRQ type, and allow us to use multiple IRQs
(thus passing multiple eventfd notifiers).

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 include/qemu/vfio-helpers.h |  2 ++
 util/vfio-helpers.c         | 53 +++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/include/qemu/vfio-helpers.h b/include/qemu/vfio-helpers.h
index 1f057c2b9e4..63108ebc8da 100644
--- a/include/qemu/vfio-helpers.h
+++ b/include/qemu/vfio-helpers.h
@@ -28,5 +28,7 @@ void qemu_vfio_pci_unmap_bar(QEMUVFIOState *s, int index, void *bar,
                              uint64_t offset, uint64_t size);
 int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
                            int irq_type, Error **errp);
+int qemu_vfio_pci_init_msix_irqs(QEMUVFIOState *s, EventNotifier *e,
+                                 unsigned irq_count, Error **errp);
 
 #endif
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 696f2d51712..fb3a79a5bcb 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -215,6 +215,59 @@ int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
     return 0;
 }
 
+/**
+ * Initialize device MSIX IRQs and register event notifiers.
+ * @irq_count: number of MSIX IRQs to initialize
+ * @e: Array of @irq_count notifiers (each corresponding to a MSIX IRQ)
+ */
+int qemu_vfio_pci_init_msix_irqs(QEMUVFIOState *s, EventNotifier *e,
+                                 unsigned irq_count, Error **errp)
+{
+    int r;
+    struct vfio_irq_set *irq_set;
+    size_t irq_set_size;
+    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
+
+    irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX;
+    if (ioctl(s->device, VFIO_DEVICE_GET_IRQ_INFO, &irq_info)) {
+        error_setg_errno(errp, errno, "Failed to get device interrupt info");
+        return -errno;
+    }
+    if (irq_info.count <= irq_count) {
+        error_setg(errp,
+                   "Not enough device interrupts available (only %" PRIu32 ")",
+                   irq_info.count);
+        return -EINVAL;
+    }
+    if (!(irq_info.flags & VFIO_IRQ_INFO_EVENTFD)) {
+        error_setg(errp, "Device interrupt doesn't support eventfd");
+        return -EINVAL;
+    }
+
+    irq_set_size = sizeof(*irq_set) + irq_count * sizeof(int32_t);
+    irq_set = g_malloc0(irq_set_size);
+
+    /* Get to a known IRQ state */
+    *irq_set = (struct vfio_irq_set) {
+        .argsz = irq_set_size,
+        .flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER,
+        .index = irq_info.index,
+        .start = 0,
+        .count = irq_count,
+    };
+
+    for (unsigned i = 0; i < irq_count; i++) {
+        ((int32_t *)&irq_set->data)[i] = event_notifier_get_fd(&e[i]);
+    }
+    r = ioctl(s->device, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (r) {
+        error_setg_errno(errp, errno, "Failed to setup device interrupts");
+        return -errno;
+    }
+    return 0;
+}
+
 static int qemu_vfio_pci_read_config(QEMUVFIOState *s, void *buf,
                                      int size, int ofs)
 {
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v3 5/5] block/nvme: Use qemu_vfio_pci_init_msix_irqs() to initialize our IRQ
  2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
                   ` (3 preceding siblings ...)
  2020-08-18 16:45 ` [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs() Philippe Mathieu-Daudé
@ 2020-08-18 16:45 ` Philippe Mathieu-Daudé
  4 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 16:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Alex Williamson,
	Stefan Hajnoczi, Philippe Mathieu-Daudé

Instead of initializing one MSIX IRQ with the generic
qemu_vfio_pci_init_irq() function, use the MSIX specific one which
will allow us to use multiple IRQs. For now we provide an array of
a single IRQ.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 block/nvme.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index cdd16d451e7..cb86ba2518d 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -789,8 +789,8 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
         }
     }
 
-    ret = qemu_vfio_pci_init_irq(s->vfio, s->irq_notifier,
-                                 VFIO_PCI_MSIX_IRQ_INDEX, errp);
+    ret = qemu_vfio_pci_init_msix_irqs(s->vfio, s->irq_notifier,
+                                       MSIX_IRQ_COUNT, errp);
     if (ret) {
         goto out;
     }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures
  2020-08-18 16:45 ` [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures Philippe Mathieu-Daudé
@ 2020-08-18 17:12   ` Alex Williamson
  2020-08-18 17:26     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2020-08-18 17:12 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, Drew Jones, qemu-block, Laurent Vivier,
	qemu-devel, Max Reitz, Eric Auger, Stefan Hajnoczi, David Gibson

On Tue, 18 Aug 2020 18:45:06 +0200
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> The vfio-helpers implementation expects a TYPEv1 IOMMU, see
> qemu_vfio_init_pci:
> 
>   263     if (!ioctl(s->container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
>   264         error_setg_errno(errp, errno, "VFIO IOMMU check failed");
> 
> Thus POWER SPAPR IOMMU is obviously not supported.
> 
> The implementation only cares about host page size alignment
> (usually 4KB on X86), not the IOMMU one, which is be problematic
> on Aarch64, when 64MB page size is used. So Aarch64 is not
> supported neither.
> 
> Report an error when the host architecture is different than X86:
> 
>  $ qemu-system-aarch64 \
>     -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 \
>     -device virtio-blk-pci,drive=drive0
>   qemu-system-aarch64: -drive file=nvme://0001:01:00.0/1,if=none,id=drive0: QEMU VFIO utility is not supported on this architecture
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Drew Jones <drjones@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> ---
>  util/vfio-helpers.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
> index e399e330e26..60017936e3e 100644
> --- a/util/vfio-helpers.c
> +++ b/util/vfio-helpers.c
> @@ -420,14 +420,38 @@ static void qemu_vfio_open_common(QEMUVFIOState *s)
>      qemu_ram_foreach_block(qemu_vfio_init_ramblock, s);
>  }
>  
> +/**
> + * Return if the host architecture is supported.
> + *
> + * aarch64: IOMMU page alignment not respected
> + * ppc64:   SPAPR IOMMU window not configured
> + * x86-64:  Only architecture validated
> + * other:   Untested
> + */
> +static bool qemu_vfio_arch_supported(void)
> +{
> +    bool supported = false;
> +
> +#if defined(HOST_X86_64)
> +    supported = true;
> +#endif
> +
> +    return supported;
> +}

Why does this need to be hard coded to specific architectures rather
than probing for type1 IOMMU support and looking at the iova_pgsizes
from VFIO_IOMMU_GET_INFO to see if there's a compatible size?  It
requires us to get a bit deeper into the device initialization, but we
should still be able to unwind out of the device realize.  Otherwise
we're throwing out aarch64 running of 4KB for no reason, right?  Thanks,

Alex


>  /**
>   * Open a PCI device, e.g. "0000:00:01.0".
>   */
>  QEMUVFIOState *qemu_vfio_open_pci(const char *device, Error **errp)
>  {
>      int r;
> -    QEMUVFIOState *s = g_new0(QEMUVFIOState, 1);
> +    QEMUVFIOState *s;
>  
> +    if (!qemu_vfio_arch_supported()) {
> +        error_setg(errp,
> +                   "QEMU VFIO utility is not supported on this architecture");
> +        return NULL;
> +    }
> +    s = g_new0(QEMUVFIOState, 1);
>      r = qemu_vfio_init_pci(s, device, errp);
>      if (r) {
>          g_free(s);



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs()
  2020-08-18 16:45 ` [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs() Philippe Mathieu-Daudé
@ 2020-08-18 17:25   ` Alex Williamson
  0 siblings, 0 replies; 11+ messages in thread
From: Alex Williamson @ 2020-08-18 17:25 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
	Stefan Hajnoczi

On Tue, 18 Aug 2020 18:45:08 +0200
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> qemu_vfio_pci_init_irq() allows us to initialize any type of IRQ,
> but only one. Introduce qemu_vfio_pci_init_msix_irqs() which is
> specific to MSIX IRQ type, and allow us to use multiple IRQs
> (thus passing multiple eventfd notifiers).
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  include/qemu/vfio-helpers.h |  2 ++
>  util/vfio-helpers.c         | 53 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 55 insertions(+)
> 
> diff --git a/include/qemu/vfio-helpers.h b/include/qemu/vfio-helpers.h
> index 1f057c2b9e4..63108ebc8da 100644
> --- a/include/qemu/vfio-helpers.h
> +++ b/include/qemu/vfio-helpers.h
> @@ -28,5 +28,7 @@ void qemu_vfio_pci_unmap_bar(QEMUVFIOState *s, int index, void *bar,
>                               uint64_t offset, uint64_t size);
>  int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
>                             int irq_type, Error **errp);
> +int qemu_vfio_pci_init_msix_irqs(QEMUVFIOState *s, EventNotifier *e,
> +                                 unsigned irq_count, Error **errp);
>  
>  #endif
> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
> index 696f2d51712..fb3a79a5bcb 100644
> --- a/util/vfio-helpers.c
> +++ b/util/vfio-helpers.c
> @@ -215,6 +215,59 @@ int qemu_vfio_pci_init_irq(QEMUVFIOState *s, EventNotifier *e,
>      return 0;
>  }
>  
> +/**
> + * Initialize device MSIX IRQs and register event notifiers.
> + * @irq_count: number of MSIX IRQs to initialize
> + * @e: Array of @irq_count notifiers (each corresponding to a MSIX IRQ)
> + */
> +int qemu_vfio_pci_init_msix_irqs(QEMUVFIOState *s, EventNotifier *e,
> +                                 unsigned irq_count, Error **errp)
> +{
> +    int r;
> +    struct vfio_irq_set *irq_set;
> +    size_t irq_set_size;
> +    struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
> +
> +    irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX;

Nit, this could be initialized in the declaration with argsz.

> +    if (ioctl(s->device, VFIO_DEVICE_GET_IRQ_INFO, &irq_info)) {
> +        error_setg_errno(errp, errno, "Failed to get device interrupt info");
> +        return -errno;
> +    }
> +    if (irq_info.count <= irq_count) {


Shouldn't this only test strictly less than?  The API seems to leave
the problem of determining how many vectors might be available as an
exercise for the caller.  Thanks,

Alex


> +        error_setg(errp,
> +                   "Not enough device interrupts available (only %" PRIu32 ")",
> +                   irq_info.count);
> +        return -EINVAL;
> +    }
> +    if (!(irq_info.flags & VFIO_IRQ_INFO_EVENTFD)) {
> +        error_setg(errp, "Device interrupt doesn't support eventfd");
> +        return -EINVAL;
> +    }
> +
> +    irq_set_size = sizeof(*irq_set) + irq_count * sizeof(int32_t);
> +    irq_set = g_malloc0(irq_set_size);
> +
> +    /* Get to a known IRQ state */
> +    *irq_set = (struct vfio_irq_set) {
> +        .argsz = irq_set_size,
> +        .flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER,
> +        .index = irq_info.index,
> +        .start = 0,
> +        .count = irq_count,
> +    };
> +
> +    for (unsigned i = 0; i < irq_count; i++) {
> +        ((int32_t *)&irq_set->data)[i] = event_notifier_get_fd(&e[i]);
> +    }
> +    r = ioctl(s->device, VFIO_DEVICE_SET_IRQS, irq_set);
> +    g_free(irq_set);
> +    if (r) {
> +        error_setg_errno(errp, errno, "Failed to setup device interrupts");
> +        return -errno;
> +    }
> +    return 0;
> +}
> +
>  static int qemu_vfio_pci_read_config(QEMUVFIOState *s, void *buf,
>                                       int size, int ofs)
>  {



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures
  2020-08-18 17:12   ` Alex Williamson
@ 2020-08-18 17:26     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-18 17:26 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Fam Zheng, Kevin Wolf, Drew Jones, qemu-block, Laurent Vivier,
	qemu-devel, Max Reitz, Eric Auger, Stefan Hajnoczi, David Gibson

On 8/18/20 7:12 PM, Alex Williamson wrote:
> On Tue, 18 Aug 2020 18:45:06 +0200
> Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> 
>> The vfio-helpers implementation expects a TYPEv1 IOMMU, see
>> qemu_vfio_init_pci:
>>
>>   263     if (!ioctl(s->container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
>>   264         error_setg_errno(errp, errno, "VFIO IOMMU check failed");
>>
>> Thus POWER SPAPR IOMMU is obviously not supported.
>>
>> The implementation only cares about host page size alignment
>> (usually 4KB on X86), not the IOMMU one, which is be problematic
>> on Aarch64, when 64MB page size is used. So Aarch64 is not
>> supported neither.
>>
>> Report an error when the host architecture is different than X86:
>>
>>  $ qemu-system-aarch64 \
>>     -drive file=nvme://0001:01:00.0/1,if=none,id=drive0 \
>>     -device virtio-blk-pci,drive=drive0
>>   qemu-system-aarch64: -drive file=nvme://0001:01:00.0/1,if=none,id=drive0: QEMU VFIO utility is not supported on this architecture
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>> Cc: Eric Auger <eric.auger@redhat.com>
>> Cc: Drew Jones <drjones@redhat.com>
>> Cc: Laurent Vivier <lvivier@redhat.com>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> ---
>>  util/vfio-helpers.c | 26 +++++++++++++++++++++++++-
>>  1 file changed, 25 insertions(+), 1 deletion(-)
>>
>> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
>> index e399e330e26..60017936e3e 100644
>> --- a/util/vfio-helpers.c
>> +++ b/util/vfio-helpers.c
>> @@ -420,14 +420,38 @@ static void qemu_vfio_open_common(QEMUVFIOState *s)
>>      qemu_ram_foreach_block(qemu_vfio_init_ramblock, s);
>>  }
>>  
>> +/**
>> + * Return if the host architecture is supported.
>> + *
>> + * aarch64: IOMMU page alignment not respected
>> + * ppc64:   SPAPR IOMMU window not configured
>> + * x86-64:  Only architecture validated
>> + * other:   Untested
>> + */
>> +static bool qemu_vfio_arch_supported(void)
>> +{
>> +    bool supported = false;
>> +
>> +#if defined(HOST_X86_64)
>> +    supported = true;
>> +#endif
>> +
>> +    return supported;
>> +}
> 
> Why does this need to be hard coded to specific architectures rather
> than probing for type1 IOMMU support and looking at the iova_pgsizes
> from VFIO_IOMMU_GET_INFO to see if there's a compatible size?  It
> requires us to get a bit deeper into the device initialization, but we
> should still be able to unwind out of the device realize.  Otherwise
> we're throwing out aarch64 running of 4KB for no reason, right?  Thanks,

Ah yes, much clever! Thanks Alex :)

> 
> Alex
> 
> 
>>  /**
>>   * Open a PCI device, e.g. "0000:00:01.0".
>>   */
>>  QEMUVFIOState *qemu_vfio_open_pci(const char *device, Error **errp)
>>  {
>>      int r;
>> -    QEMUVFIOState *s = g_new0(QEMUVFIOState, 1);
>> +    QEMUVFIOState *s;
>>  
>> +    if (!qemu_vfio_arch_supported()) {
>> +        error_setg(errp,
>> +                   "QEMU VFIO utility is not supported on this architecture");
>> +        return NULL;
>> +    }
>> +    s = g_new0(QEMUVFIOState, 1);
>>      r = qemu_vfio_init_pci(s, device, errp);
>>      if (r) {
>>          g_free(s);
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier
  2020-08-18 16:45 ` [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier Philippe Mathieu-Daudé
@ 2020-08-19  8:08   ` Stefan Hajnoczi
  2020-08-19 15:55     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2020-08-19  8:08 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
	Alex Williamson

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

On Tue, Aug 18, 2020 at 06:45:05PM +0200, Philippe Mathieu-Daudé wrote:
> In preparation of using multiple IRQ (thus multiple eventfds)
> make BDRVNVMeState::irq_notifier an array (for now of a single
> element, the admin queue notifier).
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  block/nvme.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)

This looks like an intermediate step before using multiple irqs. I think
it makes the code confusing because on one hand INDEX_ADMIN gives the
impression that INDEX_IO() should be used for io queues, while on the
other hand only a single EventNotifier is allocated and we actually
can't use INDEX_IO() yet.

If this intermediate patch is really necessary, please don't use
INDEX_ADMIN. Define a new constant instead:

  /* This driver shares a single MSIX IRQ for the admin and I/O queues */
  #define MSIX_SHARED_IRQ_IDX 0

In the future the array index can be changed to INDEX_ADMIN and
INDEX_IO(n) when there are multiple EventNotifiers.

I think that would make the code clearer.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier
  2020-08-19  8:08   ` Stefan Hajnoczi
@ 2020-08-19 15:55     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-08-19 15:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, Kevin Wolf, qemu-block, qemu-devel, Max Reitz,
	Alex Williamson

On 8/19/20 10:08 AM, Stefan Hajnoczi wrote:
> On Tue, Aug 18, 2020 at 06:45:05PM +0200, Philippe Mathieu-Daudé wrote:
>> In preparation of using multiple IRQ (thus multiple eventfds)
>> make BDRVNVMeState::irq_notifier an array (for now of a single
>> element, the admin queue notifier).
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>>  block/nvme.c | 28 ++++++++++++++++++----------
>>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> This looks like an intermediate step before using multiple irqs. I think
> it makes the code confusing because on one hand INDEX_ADMIN gives the
> impression that INDEX_IO() should be used for io queues, while on the
> other hand only a single EventNotifier is allocated and we actually
> can't use INDEX_IO() yet.
> 
> If this intermediate patch is really necessary, please don't use
> INDEX_ADMIN. Define a new constant instead:
> 
>   /* This driver shares a single MSIX IRQ for the admin and I/O queues */
>   #define MSIX_SHARED_IRQ_IDX 0
> 
> In the future the array index can be changed to INDEX_ADMIN and
> INDEX_IO(n) when there are multiple EventNotifiers.
> 
> I think that would make the code clearer.

Very good idea, thanks!



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-08-19 15:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 16:45 [RFC PATCH v3 0/5] util/vfio-helpers: Add support for multiple IRQs Philippe Mathieu-Daudé
2020-08-18 16:45 ` [RFC PATCH v3 1/5] block/nvme: Use an array of EventNotifier Philippe Mathieu-Daudé
2020-08-19  8:08   ` Stefan Hajnoczi
2020-08-19 15:55     ` Philippe Mathieu-Daudé
2020-08-18 16:45 ` [RFC PATCH v3 2/5] util/vfio-helpers: Report error on unsupported host architectures Philippe Mathieu-Daudé
2020-08-18 17:12   ` Alex Williamson
2020-08-18 17:26     ` Philippe Mathieu-Daudé
2020-08-18 16:45 ` [RFC PATCH v3 3/5] util/vfio-helpers: Store eventfd using int32_t type Philippe Mathieu-Daudé
2020-08-18 16:45 ` [RFC PATCH v3 4/5] util/vfio-helpers: Introduce qemu_vfio_pci_init_msix_irqs() Philippe Mathieu-Daudé
2020-08-18 17:25   ` Alex Williamson
2020-08-18 16:45 ` [RFC PATCH v3 5/5] block/nvme: Use qemu_vfio_pci_init_msix_irqs() to initialize our IRQ Philippe Mathieu-Daudé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).