All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 00/17] vfio-user server in QEMU
@ 2022-04-19 20:44 Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines Jagannathan Raman
                   ` (17 more replies)
  0 siblings, 18 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Hi,

This is v8 of the server side changes to enable vfio-user in QEMU.

Thank you very much for reviewing the last revision of this series!

We've made the following changes in this revision:

[PATCH v8 06/17] vfio-user: build library
  - updated libvfio-user to the latest

[PATCH v8 07/17] vfio-user: define vfio-user-server object
  - changed auto_shutdown to a per-instance property than
    a per-class property

[PATCH v8 12/17] vfio-user: IOMMU support for remote device
  - lock mutex while looking up hash table
  - removed global hash table - added RemoteIommu object to
    house this variable
  - added unplug handler to remove per-device IOMMU entry
    when a PCIDevice is unplugged

[PATCH v8 14/17] vfio-user: handle PCI BAR accesses
  - refactored vfu_object_bar_rw()
  - vfu_object_bar_rw() handles short sections returned by
    memory_region_find()

[PATCH v8 15/17] vfio-user: handle device interrupts
  - removed callbacks for msi_notify() and msix_notify()
  - added callbacks for msi_send_message() and
    msi(x)_get_message() operations

Thank you!

Jagannathan Raman (17):
  tests/avocado: Specify target VM argument to helper routines
  qdev: unplug blocker for devices
  remote/machine: add HotplugHandler for remote machine
  remote/machine: add vfio-user property
  configure: require cmake 3.19 or newer
  vfio-user: build library
  vfio-user: define vfio-user-server object
  vfio-user: instantiate vfio-user context
  vfio-user: find and init PCI device
  vfio-user: run vfio-user context
  vfio-user: handle PCI config space accesses
  vfio-user: IOMMU support for remote device
  vfio-user: handle DMA mappings
  vfio-user: handle PCI BAR accesses
  vfio-user: handle device interrupts
  vfio-user: handle reset of remote device
  vfio-user: avocado tests for vfio-user

 configure                                  |  36 +-
 meson.build                                |  44 +-
 qapi/misc.json                             |  23 +
 qapi/qom.json                              |  20 +-
 include/exec/memory.h                      |   3 +
 include/hw/pci/pci.h                       |  13 +
 include/hw/qdev-core.h                     |  29 +
 include/hw/remote/iommu.h                  |  40 +
 include/hw/remote/machine.h                |   4 +
 include/hw/remote/vfio-user-obj.h          |   6 +
 hw/core/qdev.c                             |  24 +
 hw/pci/msi.c                               |  16 +-
 hw/pci/msix.c                              |  10 +-
 hw/pci/pci.c                               |  13 +
 hw/remote/iommu.c                          | 114 +++
 hw/remote/machine.c                        |  88 +-
 hw/remote/vfio-user-obj.c                  | 891 +++++++++++++++++++++
 softmmu/physmem.c                          |   4 +-
 softmmu/qdev-monitor.c                     |   4 +
 stubs/vfio-user-obj.c                      |   6 +
 tests/qtest/fuzz/generic_fuzz.c            |   9 +-
 .gitlab-ci.d/buildtest.yml                 |   2 +
 .gitmodules                                |   3 +
 Kconfig.host                               |   4 +
 MAINTAINERS                                |   6 +
 hw/remote/Kconfig                          |   4 +
 hw/remote/meson.build                      |   4 +
 hw/remote/trace-events                     |  11 +
 meson_options.txt                          |   3 +
 stubs/meson.build                          |   1 +
 subprojects/libvfio-user                   |   1 +
 tests/avocado/avocado_qemu/__init__.py     |  14 +-
 tests/avocado/vfio-user.py                 | 164 ++++
 tests/docker/dockerfiles/centos8.docker    |   2 +
 tests/docker/dockerfiles/ubuntu2004.docker |   2 +
 35 files changed, 1594 insertions(+), 24 deletions(-)
 create mode 100644 include/hw/remote/iommu.h
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 hw/remote/iommu.c
 create mode 100644 hw/remote/vfio-user-obj.c
 create mode 100644 stubs/vfio-user-obj.c
 create mode 160000 subprojects/libvfio-user
 create mode 100644 tests/avocado/vfio-user.py

-- 
2.20.1



^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 02/17] qdev: unplug blocker for devices Jagannathan Raman
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Specify target VM for exec_command and
exec_command_and_wait_for_pattern routines

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Beraldo Leal <bleal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/avocado/avocado_qemu/__init__.py | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tests/avocado/avocado_qemu/__init__.py b/tests/avocado/avocado_qemu/__init__.py
index ac85e36a4d..18a34a798c 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -198,7 +198,7 @@ def wait_for_console_pattern(test, success_message, failure_message=None,
     """
     _console_interaction(test, success_message, failure_message, None, vm=vm)
 
-def exec_command(test, command):
+def exec_command(test, command, vm=None):
     """
     Send a command to a console (appending CRLF characters), while logging
     the content.
@@ -207,11 +207,14 @@ def exec_command(test, command):
     :type test: :class:`avocado_qemu.QemuSystemTest`
     :param command: the command to send
     :type command: str
+    :param vm: target vm
+    :type vm: :class:`qemu.machine.QEMUMachine`
     """
-    _console_interaction(test, None, None, command + '\r')
+    _console_interaction(test, None, None, command + '\r', vm=vm)
 
 def exec_command_and_wait_for_pattern(test, command,
-                                      success_message, failure_message=None):
+                                      success_message, failure_message=None,
+                                      vm=None):
     """
     Send a command to a console (appending CRLF characters), then wait
     for success_message to appear on the console, while logging the.
@@ -223,8 +226,11 @@ def exec_command_and_wait_for_pattern(test, command,
     :param command: the command to send
     :param success_message: if this message appears, test succeeds
     :param failure_message: if this message appears, test fails
+    :param vm: target vm
+    :type vm: :class:`qemu.machine.QEMUMachine`
     """
-    _console_interaction(test, success_message, failure_message, command + '\r')
+    _console_interaction(test, success_message, failure_message, command + '\r',
+                         vm=vm)
 
 class QemuBaseTest(avocado.Test):
     def _get_unique_tag_val(self, tag_name):
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 02/17] qdev: unplug blocker for devices
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-21 14:55   ` Markus Armbruster
  2022-04-19 20:44 ` [PATCH v8 03/17] remote/machine: add HotplugHandler for remote machine Jagannathan Raman
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Add blocker to prevent hot-unplug of devices

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
 hw/core/qdev.c         | 24 ++++++++++++++++++++++++
 softmmu/qdev-monitor.c |  4 ++++
 3 files changed, 57 insertions(+)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 92c3d65208..1b9fa25e5c 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -193,6 +193,7 @@ struct DeviceState {
     int instance_id_alias;
     int alias_required_for_version;
     ResettableState reset;
+    GSList *unplug_blockers;
 };
 
 struct DeviceListener {
@@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
 void qdev_machine_creation_done(void);
 bool qdev_machine_modified(void);
 
+/*
+ * qdev_add_unplug_blocker: Adds an unplug blocker to a device
+ *
+ * @dev: Device to be blocked from unplug
+ * @reason: Reason for blocking
+ */
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
+
+/*
+ * qdev_del_unplug_blocker: Removes an unplug blocker from a device
+ *
+ * @dev: Device to be unblocked
+ * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
+ *          Used as a handle to lookup the blocker for deletion.
+ */
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
+
+/*
+ * qdev_unplug_blocked: Confirms if a device is blocked from unplug
+ *
+ * @dev: Device to be tested
+ * @reason: Returns one of the reasons why the device is blocked,
+ *          if any
+ *
+ * Returns: true if device is blocked from unplug, false otherwise
+ */
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
+
 /**
  * GpioPolarity: Polarity of a GPIO line
  *
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 84f3019440..0806d8fcaa 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
     return NULL;
 }
 
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
+}
+
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
+}
+
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
+{
+    ERRP_GUARD();
+
+    if (dev->unplug_blockers) {
+        error_propagate(errp, error_copy(dev->unplug_blockers->data));
+        return true;
+    }
+
+    return false;
+}
+
 static bool device_get_realized(Object *obj, Error **errp)
 {
     DeviceState *dev = DEVICE(obj);
@@ -704,6 +726,8 @@ static void device_finalize(Object *obj)
 
     DeviceState *dev = DEVICE(obj);
 
+    g_assert(!dev->unplug_blockers);
+
     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
         QLIST_REMOVE(ngl, node);
         qemu_free_irqs(ngl->in, ngl->num_in);
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 12fe60c467..9cfd59d17c 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp)
     HotplugHandlerClass *hdc;
     Error *local_err = NULL;
 
+    if (qdev_unplug_blocked(dev, errp)) {
+        return;
+    }
+
     if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
         error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
         return;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 03/17] remote/machine: add HotplugHandler for remote machine
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 02/17] qdev: unplug blocker for devices Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 04/17] remote/machine: add vfio-user property Jagannathan Raman
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Allow hotplugging of PCI(e) devices to remote machine

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/remote/machine.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 952105eab5..0c5bd4f923 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -21,6 +21,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/qdev-core.h"
 
 static void remote_machine_init(MachineState *machine)
 {
@@ -54,14 +55,19 @@ static void remote_machine_init(MachineState *machine)
 
     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
                  &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+
+    qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
 
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
+
+    hc->unplug = qdev_simple_device_unplug_cb;
 }
 
 static const TypeInfo remote_machine = {
@@ -69,6 +75,10 @@ static const TypeInfo remote_machine = {
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
     .class_init = remote_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };
 
 static void remote_machine_register_types(void)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 04/17] remote/machine: add vfio-user property
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (2 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 03/17] remote/machine: add HotplugHandler for remote machine Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 05/17] configure: require cmake 3.19 or newer Jagannathan Raman
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Add vfio-user to x-remote machine. It is a boolean, which indicates if
the machine supports vfio-user protocol. The machine configures the bus
differently vfio-user and multiprocess protocols, so this property
informs it on how to configure the bus.

This property should be short lived. Once vfio-user fully replaces
multiprocess, this property could be removed.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/hw/remote/machine.h |  2 ++
 hw/remote/machine.c         | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
index 2a2a33c4b2..8d0fa98d33 100644
--- a/include/hw/remote/machine.h
+++ b/include/hw/remote/machine.h
@@ -22,6 +22,8 @@ struct RemoteMachineState {
 
     RemotePCIHost *host;
     RemoteIOHubState iohub;
+
+    bool vfio_user;
 };
 
 /* Used to pass to co-routine device and ioc. */
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 0c5bd4f923..a9a75e170f 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -59,6 +59,25 @@ static void remote_machine_init(MachineState *machine)
     qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
 
+static bool remote_machine_get_vfio_user(Object *obj, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    return s->vfio_user;
+}
+
+static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    if (phase_check(PHASE_MACHINE_CREATED)) {
+        error_setg(errp, "Error enabling vfio-user - machine already created");
+        return;
+    }
+
+    s->vfio_user = value;
+}
+
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -68,6 +87,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
     mc->desc = "Experimental remote machine";
 
     hc->unplug = qdev_simple_device_unplug_cb;
+
+    object_class_property_add_bool(oc, "vfio-user",
+                                   remote_machine_get_vfio_user,
+                                   remote_machine_set_vfio_user);
 }
 
 static const TypeInfo remote_machine = {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 05/17] configure: require cmake 3.19 or newer
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (3 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 04/17] remote/machine: add vfio-user property Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 06/17] vfio-user: build library Jagannathan Raman
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

cmake needs to accept the compiler flags specified with
CMAKE_<LANG>_COMPILER variable. It does so starting with
version 3.19

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 configure | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/configure b/configure
index 7c08c18358..7a1a98bddf 100755
--- a/configure
+++ b/configure
@@ -250,6 +250,7 @@ stack_protector=""
 safe_stack=""
 use_containers="yes"
 gdb_bin=$(command -v "gdb-multiarch" || command -v "gdb")
+cmake_required="no"
 
 if test -e "$source_path/.git"
 then
@@ -2777,6 +2778,21 @@ if !(GIT="$git" "$source_path/scripts/git-submodule.sh" "$git_submodules_action"
     exit 1
 fi
 
+# Per cmake spec, CMAKE_<LANG>_COMPILER variable may include "mandatory" compiler
+# flags. QEMU needs to specify these flags to correctly configure the build
+# environment. cmake 3.19 allows specifying these mandatory compiler flags,
+# and as such 3.19 or newer is required to build QEMU.
+if test "$cmake_required" = "yes" ; then
+    cmake_bin=$(command -v "cmake")
+    if [ -z "$cmake_bin" ]; then
+        error_exit "cmake not found"
+    fi
+    cmake_version=$($cmake_bin --version | head -n 1)
+    if ! version_ge ${cmake_version##* } 3.19; then
+        error_exit "QEMU needs cmake 3.19 or newer"
+    fi
+fi
+
 config_host_mak="config-host.mak"
 
 echo "# Automatically generated by configure - do not modify" > $config_host_mak
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 06/17] vfio-user: build library
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (4 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 05/17] configure: require cmake 3.19 or newer Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25  8:22   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 07/17] vfio-user: define vfio-user-server object Jagannathan Raman
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

add the libvfio-user library as a submodule. build it as a cmake
subproject.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 configure                                  | 20 +++++++++-
 meson.build                                | 44 +++++++++++++++++++++-
 .gitlab-ci.d/buildtest.yml                 |  2 +
 .gitmodules                                |  3 ++
 Kconfig.host                               |  4 ++
 MAINTAINERS                                |  1 +
 hw/remote/Kconfig                          |  4 ++
 hw/remote/meson.build                      |  2 +
 meson_options.txt                          |  3 ++
 subprojects/libvfio-user                   |  1 +
 tests/docker/dockerfiles/centos8.docker    |  2 +
 tests/docker/dockerfiles/ubuntu2004.docker |  2 +
 12 files changed, 86 insertions(+), 2 deletions(-)
 create mode 160000 subprojects/libvfio-user

diff --git a/configure b/configure
index 7a1a98bddf..c4fd7a42d4 100755
--- a/configure
+++ b/configure
@@ -333,6 +333,7 @@ meson_args=""
 ninja=""
 gio="$default_feature"
 skip_meson=no
+vfio_user_server="disabled"
 
 # The following Meson options are handled manually (still they
 # are included in the automatically generated help message)
@@ -1044,6 +1045,11 @@ for opt do
   ;;
   --disable-blobs) meson_option_parse --disable-install-blobs ""
   ;;
+  --enable-vfio-user-server) vfio_user_server="enabled"
+                             cmake_required="yes"
+  ;;
+  --disable-vfio-user-server) vfio_user_server="disabled"
+  ;;
   --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc
   ;;
   --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc
@@ -1267,6 +1273,7 @@ cat << EOF
   vhost-vdpa      vhost-vdpa kernel backend support
   opengl          opengl support
   gio             libgio support
+  vfio-user-server    vfio-user server support
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -2622,6 +2629,17 @@ but not implemented on your system"
     fi
 fi
 
+##########################################
+# check for vfio_user_server
+
+case "$vfio_user_server" in
+  auto | enabled )
+    if test "$git_submodules_action" != "ignore"; then
+      git_submodules="${git_submodules} subprojects/libvfio-user"
+    fi
+    ;;
+esac
+
 ##########################################
 # End of CC checks
 # After here, no more $cc or $ld runs
@@ -3185,7 +3203,7 @@ if test "$skip_meson" = no; then
         -Db_pie=$(if test "$pie" = yes; then echo true; else echo false; fi) \
         -Db_coverage=$(if test "$gcov" = yes; then echo true; else echo false; fi) \
         -Db_lto=$lto -Dcfi=$cfi -Dtcg=$tcg -Dxen=$xen \
-        -Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp \
+        -Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp -Dvfio_user_server=$vfio_user_server \
         $(test -n "${LIB_FUZZING_ENGINE+xxx}" && echo "-Dfuzzing_engine=$LIB_FUZZING_ENGINE") \
         $(if test "$default_feature" = no; then echo "-Dauto_features=disabled"; fi) \
         "$@" $cross_arg "$PWD" "$source_path"
diff --git a/meson.build b/meson.build
index 861de93c4f..84bc3a1c4f 100644
--- a/meson.build
+++ b/meson.build
@@ -298,6 +298,11 @@ have_tpm = get_option('tpm') \
   .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \
   .allowed()
 
+if targetos != 'linux' and get_option('vfio_user_server').enabled()
+  error('vfio-user server is supported only on Linux')
+endif
+vfio_user_server_allowed = targetos == 'linux' and not get_option('vfio_user_server').disabled()
+
 # Target-specific libraries and flags
 libm = cc.find_library('m', required: false)
 threads = dependency('threads')
@@ -2111,7 +2116,8 @@ host_kconfig = \
   (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \
   ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \
   ('CONFIG_PVRDMA' in config_host ? ['CONFIG_PVRDMA=y'] : []) + \
-  (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : [])
+  (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \
+  (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : [])
 
 ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ]
 
@@ -2500,6 +2506,41 @@ if get_option('cfi') and slirp_opt == 'system'
          + ' Please configure with --enable-slirp=git')
 endif
 
+vfiouser = not_found
+if have_system and vfio_user_server_allowed
+  have_internal = fs.exists(meson.current_source_dir() / 'subprojects/libvfio-user/Makefile')
+
+  if not have_internal
+    error('libvfio-user source not found - please pull git submodule')
+  endif
+
+  json_c = dependency('json-c', required: false)
+  if not json_c.found()
+    json_c = dependency('libjson-c', required: false)
+  endif
+  if not json_c.found()
+    json_c = dependency('libjson-c-dev', required: false)
+  endif
+
+  if not json_c.found()
+    error('Unable to find json-c package')
+  endif
+
+  cmake = import('cmake')
+
+  vfiouser_subproj = cmake.subproject('libvfio-user')
+
+  vfiouser_sl = vfiouser_subproj.dependency('vfio-user-static')
+
+  # Although cmake links the json-c library with vfio-user-static
+  # target, that info is not available to meson via cmake.subproject.
+  # As such, we have to separately declare the json-c dependency here.
+  # This appears to be a current limitation of using cmake inside meson.
+  # libvfio-user is planning a switch to meson in the future, which
+  # would address this item automatically.
+  vfiouser = declare_dependency(dependencies: [vfiouser_sl, json_c])
+endif
+
 fdt = not_found
 if have_system
   fdt_opt = get_option('fdt')
@@ -3612,6 +3653,7 @@ summary_info += {'target list':       ' '.join(target_dirs)}
 if have_system
   summary_info += {'default devices':   get_option('default_devices')}
   summary_info += {'out of process emulation': multiprocess_allowed}
+  summary_info += {'vfio-user server': vfio_user_server_allowed}
 endif
 summary(summary_info, bool_yn: true, section: 'Targets and accelerators')
 
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 0aea7ab84c..671a5d8fa4 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -42,6 +42,7 @@ build-system-ubuntu:
   variables:
     IMAGE: ubuntu2004
     CONFIGURE_ARGS: --enable-docs --enable-fdt=system --enable-slirp=system
+                    --enable-vfio-user-server
     TARGETS: aarch64-softmmu alpha-softmmu cris-softmmu hppa-softmmu
       microblazeel-softmmu mips64el-softmmu
     MAKE_CHECK_ARGS: check-build
@@ -165,6 +166,7 @@ build-system-centos:
     IMAGE: centos8
     CONFIGURE_ARGS: --disable-nettle --enable-gcrypt --enable-fdt=system
       --enable-modules --enable-trace-backends=dtrace --enable-docs
+      --enable-vfio-user-server
     TARGETS: ppc64-softmmu or1k-softmmu s390x-softmmu
       x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
     MAKE_CHECK_ARGS: check-build
diff --git a/.gitmodules b/.gitmodules
index f4b6a9b401..d66af96dc9 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -67,3 +67,6 @@
 [submodule "tests/lcitool/libvirt-ci"]
 	path = tests/lcitool/libvirt-ci
 	url = https://gitlab.com/libvirt/libvirt-ci.git
+[submodule "subprojects/libvfio-user"]
+	path = subprojects/libvfio-user
+	url = https://github.com/nutanix/libvfio-user.git
diff --git a/Kconfig.host b/Kconfig.host
index 60b9c07b5e..f2da8bcf8a 100644
--- a/Kconfig.host
+++ b/Kconfig.host
@@ -45,3 +45,7 @@ config MULTIPROCESS_ALLOWED
 config FUZZ
     bool
     select SPARSE_MEM
+
+config VFIO_USER_SERVER_ALLOWED
+    bool
+    imply VFIO_USER_SERVER
diff --git a/MAINTAINERS b/MAINTAINERS
index 4ad2451e03..dca81a2d22 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3597,6 +3597,7 @@ F: hw/remote/proxy-memory-listener.c
 F: include/hw/remote/proxy-memory-listener.h
 F: hw/remote/iohub.c
 F: include/hw/remote/iohub.h
+F: subprojects/libvfio-user
 
 EBPF:
 M: Jason Wang <jasowang@redhat.com>
diff --git a/hw/remote/Kconfig b/hw/remote/Kconfig
index 08c16e235f..2d6b4f4cf4 100644
--- a/hw/remote/Kconfig
+++ b/hw/remote/Kconfig
@@ -2,3 +2,7 @@ config MULTIPROCESS
     bool
     depends on PCI && PCI_EXPRESS && KVM
     select REMOTE_PCIHOST
+
+config VFIO_USER_SERVER
+    bool
+    depends on MULTIPROCESS
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
index e6a5574242..dfea6b533b 100644
--- a/hw/remote/meson.build
+++ b/hw/remote/meson.build
@@ -7,6 +7,8 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
 
+remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: vfiouser)
+
 specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('memory.c'))
 specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy-memory-listener.c'))
 
diff --git a/meson_options.txt b/meson_options.txt
index 52b11cead4..bd531fd545 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -91,6 +91,9 @@ option('avx2', type: 'feature', value: 'auto',
 option('avx512f', type: 'feature', value: 'disabled',
        description: 'AVX512F optimizations')
 
+option('vfio_user_server', type: 'feature', value: 'auto',
+       description: 'vfio-user server support')
+
 option('attr', type : 'feature', value : 'auto',
        description: 'attr/xattr support')
 option('auth_pam', type : 'feature', value : 'auto',
diff --git a/subprojects/libvfio-user b/subprojects/libvfio-user
new file mode 160000
index 0000000000..030d2f6e79
--- /dev/null
+++ b/subprojects/libvfio-user
@@ -0,0 +1 @@
+Subproject commit 030d2f6e7978b8ca7577b81d4f48e2771bcd8f47
diff --git a/tests/docker/dockerfiles/centos8.docker b/tests/docker/dockerfiles/centos8.docker
index 3ede55d09b..b6b4aa9626 100644
--- a/tests/docker/dockerfiles/centos8.docker
+++ b/tests/docker/dockerfiles/centos8.docker
@@ -23,6 +23,7 @@ RUN dnf update -y && \
         capstone-devel \
         ccache \
         clang \
+        cmake \
         ctags \
         cyrus-sasl-devel \
         daxctl-devel \
@@ -45,6 +46,7 @@ RUN dnf update -y && \
         gtk3-devel \
         hostname \
         jemalloc-devel \
+        json-c-devel \
         libaio-devel \
         libasan \
         libattr-devel \
diff --git a/tests/docker/dockerfiles/ubuntu2004.docker b/tests/docker/dockerfiles/ubuntu2004.docker
index b9d06cb040..2422cc5574 100644
--- a/tests/docker/dockerfiles/ubuntu2004.docker
+++ b/tests/docker/dockerfiles/ubuntu2004.docker
@@ -18,6 +18,7 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
             ca-certificates \
             ccache \
             clang \
+            cmake \
             dbus \
             debianutils \
             diffutils \
@@ -58,6 +59,7 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
             libiscsi-dev \
             libjemalloc-dev \
             libjpeg-turbo8-dev \
+            libjson-c-dev \
             liblttng-ust-dev \
             liblzo2-dev \
             libncursesw5-dev \
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 07/17] vfio-user: define vfio-user-server object
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (5 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 06/17] vfio-user: build library Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25  8:27   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 08/17] vfio-user: instantiate vfio-user context Jagannathan Raman
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Define vfio-user object which is remote process server for QEMU. Setup
object initialization functions and properties necessary to instantiate
the object

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 qapi/qom.json               |  20 +++-
 include/hw/remote/machine.h |   2 +
 hw/remote/machine.c         |  27 +++++
 hw/remote/vfio-user-obj.c   | 211 ++++++++++++++++++++++++++++++++++++
 MAINTAINERS                 |   1 +
 hw/remote/meson.build       |   1 +
 hw/remote/trace-events      |   3 +
 7 files changed, 263 insertions(+), 2 deletions(-)
 create mode 100644 hw/remote/vfio-user-obj.c

diff --git a/qapi/qom.json b/qapi/qom.json
index eeb5395ff3..e7b1758a11 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -703,6 +703,20 @@
 { 'struct': 'RemoteObjectProperties',
   'data': { 'fd': 'str', 'devid': 'str' } }
 
+##
+# @VfioUserServerProperties:
+#
+# Properties for x-vfio-user-server objects.
+#
+# @socket: socket to be used by the libvfiouser library
+#
+# @device: the id of the device to be emulated at the server
+#
+# Since: 7.1
+##
+{ 'struct': 'VfioUserServerProperties',
+  'data': { 'socket': 'SocketAddress', 'device': 'str' } }
+
 ##
 # @RngProperties:
 #
@@ -842,7 +856,8 @@
     'tls-creds-psk',
     'tls-creds-x509',
     'tls-cipher-suites',
-    { 'name': 'x-remote-object', 'features': [ 'unstable' ] }
+    { 'name': 'x-remote-object', 'features': [ 'unstable' ] },
+    { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }
   ] }
 
 ##
@@ -905,7 +920,8 @@
       'tls-creds-psk':              'TlsCredsPskProperties',
       'tls-creds-x509':             'TlsCredsX509Properties',
       'tls-cipher-suites':          'TlsCredsProperties',
-      'x-remote-object':            'RemoteObjectProperties'
+      'x-remote-object':            'RemoteObjectProperties',
+      'x-vfio-user-server':         'VfioUserServerProperties'
   } }
 
 ##
diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
index 8d0fa98d33..ac32fda387 100644
--- a/include/hw/remote/machine.h
+++ b/include/hw/remote/machine.h
@@ -24,6 +24,8 @@ struct RemoteMachineState {
     RemoteIOHubState iohub;
 
     bool vfio_user;
+
+    bool auto_shutdown;
 };
 
 /* Used to pass to co-routine device and ioc. */
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index a9a75e170f..ed91659794 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -78,6 +78,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
     s->vfio_user = value;
 }
 
+static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    return s->auto_shutdown;
+}
+
+static void remote_machine_set_auto_shutdown(Object *obj, bool value,
+                                             Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    s->auto_shutdown = value;
+}
+
+static void remote_machine_instance_init(Object *obj)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    s->auto_shutdown = true;
+}
+
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -91,12 +113,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
     object_class_property_add_bool(oc, "vfio-user",
                                    remote_machine_get_vfio_user,
                                    remote_machine_set_vfio_user);
+
+    object_class_property_add_bool(oc, "auto-shutdown",
+                                   remote_machine_get_auto_shutdown,
+                                   remote_machine_set_auto_shutdown);
 }
 
 static const TypeInfo remote_machine = {
     .name = TYPE_REMOTE_MACHINE,
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
+    .instance_init = remote_machine_instance_init,
     .class_init = remote_machine_class_init,
     .interfaces = (InterfaceInfo[]) {
         { TYPE_HOTPLUG_HANDLER },
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
new file mode 100644
index 0000000000..c4d59b4d9d
--- /dev/null
+++ b/hw/remote/vfio-user-obj.c
@@ -0,0 +1,211 @@
+/**
+ * QEMU vfio-user-server server object
+ *
+ * Copyright © 2022 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL-v2, version 2 or later.
+ *
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+/**
+ * Usage: add options:
+ *     -machine x-remote,vfio-user=on,auto-shutdown=on
+ *     -device <PCI-device>,id=<pci-dev-id>
+ *     -object x-vfio-user-server,id=<id>,type=unix,path=<socket-path>,
+ *             device=<pci-dev-id>
+ *
+ * Note that x-vfio-user-server object must be used with x-remote machine only.
+ * This server could only support PCI devices for now.
+ *
+ * type - SocketAddress type - presently "unix" alone is supported. Required
+ *        option
+ *
+ * path - named unix socket, it will be created by the server. It is
+ *        a required option
+ *
+ * device - id of a device on the server, a required option. PCI devices
+ *          alone are supported presently.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+#include "qemu/error-report.h"
+#include "trace.h"
+#include "sysemu/runstate.h"
+#include "hw/boards.h"
+#include "hw/remote/machine.h"
+#include "qapi/error.h"
+#include "qapi/qapi-visit-sockets.h"
+
+#define TYPE_VFU_OBJECT "x-vfio-user-server"
+OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
+
+/**
+ * VFU_OBJECT_ERROR - reports an error message. If auto_shutdown
+ * is set, it aborts the machine on error. Otherwise, it logs an
+ * error message without aborting.
+ */
+#define VFU_OBJECT_ERROR(o, fmt, ...)                                     \
+    {                                                                     \
+        if (vfu_object_auto_shutdown()) {                                 \
+            error_setg(&error_abort, (fmt), ## __VA_ARGS__);              \
+        } else {                                                          \
+            error_report((fmt), ## __VA_ARGS__);                          \
+        }                                                                 \
+    }                                                                     \
+
+struct VfuObjectClass {
+    ObjectClass parent_class;
+
+    unsigned int nr_devs;
+};
+
+struct VfuObject {
+    /* private */
+    Object parent;
+
+    SocketAddress *socket;
+
+    char *device;
+
+    Error *err;
+};
+
+static bool vfu_object_auto_shutdown(void)
+{
+    bool auto_shutdown = true;
+    Error *local_err = NULL;
+
+    if (!current_machine) {
+        return auto_shutdown;
+    }
+
+    auto_shutdown = object_property_get_bool(OBJECT(current_machine),
+                                             "auto-shutdown",
+                                             &local_err);
+
+    /*
+     * local_err would be set if no such property exists - safe to ignore.
+     * Unlikely scenario as auto-shutdown is always defined for
+     * TYPE_REMOTE_MACHINE, and  TYPE_VFU_OBJECT only works with
+     * TYPE_REMOTE_MACHINE
+     */
+    if (local_err) {
+        auto_shutdown = true;
+        error_free(local_err);
+    }
+
+    return auto_shutdown;
+}
+
+static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
+                                  void *opaque, Error **errp)
+{
+    VfuObject *o = VFU_OBJECT(obj);
+
+    qapi_free_SocketAddress(o->socket);
+
+    o->socket = NULL;
+
+    visit_type_SocketAddress(v, name, &o->socket, errp);
+
+    if (o->socket->type != SOCKET_ADDRESS_TYPE_UNIX) {
+        error_setg(errp, "vfu: Unsupported socket type - %s",
+                   SocketAddressType_str(o->socket->type));
+        qapi_free_SocketAddress(o->socket);
+        o->socket = NULL;
+        return;
+    }
+
+    trace_vfu_prop("socket", o->socket->u.q_unix.path);
+}
+
+static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
+{
+    VfuObject *o = VFU_OBJECT(obj);
+
+    g_free(o->device);
+
+    o->device = g_strdup(str);
+
+    trace_vfu_prop("device", str);
+}
+
+static void vfu_object_init(Object *obj)
+{
+    VfuObjectClass *k = VFU_OBJECT_GET_CLASS(obj);
+    VfuObject *o = VFU_OBJECT(obj);
+
+    k->nr_devs++;
+
+    if (!object_dynamic_cast(OBJECT(current_machine), TYPE_REMOTE_MACHINE)) {
+        error_setg(&o->err, "vfu: %s only compatible with %s machine",
+                   TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
+        return;
+    }
+}
+
+static void vfu_object_finalize(Object *obj)
+{
+    VfuObjectClass *k = VFU_OBJECT_GET_CLASS(obj);
+    VfuObject *o = VFU_OBJECT(obj);
+
+    k->nr_devs--;
+
+    qapi_free_SocketAddress(o->socket);
+
+    o->socket = NULL;
+
+    g_free(o->device);
+
+    o->device = NULL;
+
+    if (!k->nr_devs && vfu_object_auto_shutdown()) {
+        qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
+    }
+}
+
+static void vfu_object_class_init(ObjectClass *klass, void *data)
+{
+    VfuObjectClass *k = VFU_OBJECT_CLASS(klass);
+
+    k->nr_devs = 0;
+
+    object_class_property_add(klass, "socket", "SocketAddress", NULL,
+                              vfu_object_set_socket, NULL, NULL);
+    object_class_property_set_description(klass, "socket",
+                                          "SocketAddress "
+                                          "(ex: type=unix,path=/tmp/sock). "
+                                          "Only UNIX is presently supported");
+    object_class_property_add_str(klass, "device", NULL,
+                                  vfu_object_set_device);
+    object_class_property_set_description(klass, "device",
+                                          "device ID - only PCI devices "
+                                          "are presently supported");
+}
+
+static const TypeInfo vfu_object_info = {
+    .name = TYPE_VFU_OBJECT,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(VfuObject),
+    .instance_init = vfu_object_init,
+    .instance_finalize = vfu_object_finalize,
+    .class_size = sizeof(VfuObjectClass),
+    .class_init = vfu_object_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_USER_CREATABLE },
+        { }
+    }
+};
+
+static void vfu_register_types(void)
+{
+    type_register_static(&vfu_object_info);
+}
+
+type_init(vfu_register_types);
diff --git a/MAINTAINERS b/MAINTAINERS
index dca81a2d22..37afdecc61 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3598,6 +3598,7 @@ F: include/hw/remote/proxy-memory-listener.h
 F: hw/remote/iohub.c
 F: include/hw/remote/iohub.h
 F: subprojects/libvfio-user
+F: hw/remote/vfio-user-obj.c
 
 EBPF:
 M: Jason Wang <jasowang@redhat.com>
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
index dfea6b533b..534ac5df79 100644
--- a/hw/remote/meson.build
+++ b/hw/remote/meson.build
@@ -6,6 +6,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
+remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c'))
 
 remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: vfiouser)
 
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 0b23974f90..7da12f0d96 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -2,3 +2,6 @@
 
 mpqemu_send_io_error(int cmd, int size, int nfds) "send command %d size %d, %d file descriptors to remote process"
 mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, %d file descriptors to remote process"
+
+# vfio-user-obj.c
+vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 08/17] vfio-user: instantiate vfio-user context
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (6 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 07/17] vfio-user: define vfio-user-server object Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 09/17] vfio-user: find and init PCI device Jagannathan Raman
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

create a context with the vfio-user library to run a PCI device

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 hw/remote/vfio-user-obj.c | 82 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index c4d59b4d9d..d46acd5b63 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -41,6 +41,9 @@
 #include "hw/remote/machine.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/notify.h"
+#include "sysemu/sysemu.h"
+#include "libvfio-user.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -74,8 +77,14 @@ struct VfuObject {
     char *device;
 
     Error *err;
+
+    Notifier machine_done;
+
+    vfu_ctx_t *vfu_ctx;
 };
 
+static void vfu_object_init_ctx(VfuObject *o, Error **errp);
+
 static bool vfu_object_auto_shutdown(void)
 {
     bool auto_shutdown = true;
@@ -108,6 +117,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
 {
     VfuObject *o = VFU_OBJECT(obj);
 
+    if (o->vfu_ctx) {
+        error_setg(errp, "vfu: Unable to set socket property - server busy");
+        return;
+    }
+
     qapi_free_SocketAddress(o->socket);
 
     o->socket = NULL;
@@ -123,17 +137,69 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
     }
 
     trace_vfu_prop("socket", o->socket->u.q_unix.path);
+
+    vfu_object_init_ctx(o, errp);
 }
 
 static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
 {
     VfuObject *o = VFU_OBJECT(obj);
 
+    if (o->vfu_ctx) {
+        error_setg(errp, "vfu: Unable to set device property - server busy");
+        return;
+    }
+
     g_free(o->device);
 
     o->device = g_strdup(str);
 
     trace_vfu_prop("device", str);
+
+    vfu_object_init_ctx(o, errp);
+}
+
+/*
+ * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
+ * properties. It also depends on devices instantiated in QEMU. These
+ * dependencies are not available during the instance_init phase of this
+ * object's life-cycle. As such, the server is initialized after the
+ * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT
+ * when the machine is setup, and the dependencies are available.
+ */
+static void vfu_object_machine_done(Notifier *notifier, void *data)
+{
+    VfuObject *o = container_of(notifier, VfuObject, machine_done);
+    Error *err = NULL;
+
+    vfu_object_init_ctx(o, &err);
+
+    if (err) {
+        error_propagate(&error_abort, err);
+    }
+}
+
+static void vfu_object_init_ctx(VfuObject *o, Error **errp)
+{
+    ERRP_GUARD();
+
+    if (o->vfu_ctx || !o->socket || !o->device ||
+            !phase_check(PHASE_MACHINE_READY)) {
+        return;
+    }
+
+    if (o->err) {
+        error_propagate(errp, o->err);
+        o->err = NULL;
+        return;
+    }
+
+    o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0,
+                                o, VFU_DEV_TYPE_PCI);
+    if (o->vfu_ctx == NULL) {
+        error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
+        return;
+    }
 }
 
 static void vfu_object_init(Object *obj)
@@ -148,6 +214,12 @@ static void vfu_object_init(Object *obj)
                    TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
         return;
     }
+
+    if (!phase_check(PHASE_MACHINE_READY)) {
+        o->machine_done.notify = vfu_object_machine_done;
+        qemu_add_machine_init_done_notifier(&o->machine_done);
+    }
+
 }
 
 static void vfu_object_finalize(Object *obj)
@@ -161,6 +233,11 @@ static void vfu_object_finalize(Object *obj)
 
     o->socket = NULL;
 
+    if (o->vfu_ctx) {
+        vfu_destroy_ctx(o->vfu_ctx);
+        o->vfu_ctx = NULL;
+    }
+
     g_free(o->device);
 
     o->device = NULL;
@@ -168,6 +245,11 @@ static void vfu_object_finalize(Object *obj)
     if (!k->nr_devs && vfu_object_auto_shutdown()) {
         qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
     }
+
+    if (o->machine_done.notify) {
+        qemu_remove_machine_init_done_notifier(&o->machine_done);
+        o->machine_done.notify = NULL;
+    }
 }
 
 static void vfu_object_class_init(ObjectClass *klass, void *data)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 09/17] vfio-user: find and init PCI device
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (7 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 08/17] vfio-user: instantiate vfio-user context Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 10/17] vfio-user: run vfio-user context Jagannathan Raman
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Find the PCI device with specified id. Initialize the device context
with the QEMU PCI device

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/remote/vfio-user-obj.c | 67 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index d46acd5b63..15f6fe3a1a 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -44,6 +44,8 @@
 #include "qemu/notify.h"
 #include "sysemu/sysemu.h"
 #include "libvfio-user.h"
+#include "hw/qdev-core.h"
+#include "hw/pci/pci.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -81,6 +83,10 @@ struct VfuObject {
     Notifier machine_done;
 
     vfu_ctx_t *vfu_ctx;
+
+    PCIDevice *pci_dev;
+
+    Error *unplug_blocker;
 };
 
 static void vfu_object_init_ctx(VfuObject *o, Error **errp);
@@ -182,6 +188,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data)
 static void vfu_object_init_ctx(VfuObject *o, Error **errp)
 {
     ERRP_GUARD();
+    DeviceState *dev = NULL;
+    vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL;
+    int ret;
 
     if (o->vfu_ctx || !o->socket || !o->device ||
             !phase_check(PHASE_MACHINE_READY)) {
@@ -200,6 +209,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
         return;
     }
+
+    dev = qdev_find_recursive(sysbus_get_default(), o->device);
+    if (dev == NULL) {
+        error_setg(errp, "vfu: Device %s not found", o->device);
+        goto fail;
+    }
+
+    if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
+        error_setg(errp, "vfu: %s not a PCI device", o->device);
+        goto fail;
+    }
+
+    o->pci_dev = PCI_DEVICE(dev);
+
+    object_ref(OBJECT(o->pci_dev));
+
+    if (pci_is_express(o->pci_dev)) {
+        pci_type = VFU_PCI_TYPE_EXPRESS;
+    }
+
+    ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0);
+    if (ret < 0) {
+        error_setg(errp,
+                   "vfu: Failed to attach PCI device %s to context - %s",
+                   o->device, strerror(errno));
+        goto fail;
+    }
+
+    error_setg(&o->unplug_blocker,
+               "vfu: %s for %s must be deleted before unplugging",
+               TYPE_VFU_OBJECT, o->device);
+    qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+
+    return;
+
+fail:
+    vfu_destroy_ctx(o->vfu_ctx);
+    if (o->unplug_blocker && o->pci_dev) {
+        qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+        error_free(o->unplug_blocker);
+        o->unplug_blocker = NULL;
+    }
+    if (o->pci_dev) {
+        object_unref(OBJECT(o->pci_dev));
+        o->pci_dev = NULL;
+    }
+    o->vfu_ctx = NULL;
 }
 
 static void vfu_object_init(Object *obj)
@@ -242,6 +298,17 @@ static void vfu_object_finalize(Object *obj)
 
     o->device = NULL;
 
+    if (o->unplug_blocker && o->pci_dev) {
+        qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+        error_free(o->unplug_blocker);
+        o->unplug_blocker = NULL;
+    }
+
+    if (o->pci_dev) {
+        object_unref(OBJECT(o->pci_dev));
+        o->pci_dev = NULL;
+    }
+
     if (!k->nr_devs && vfu_object_auto_shutdown()) {
         qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
     }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 10/17] vfio-user: run vfio-user context
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (8 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 09/17] vfio-user: find and init PCI device Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-21 14:59   ` Markus Armbruster
  2022-04-19 20:44 ` [PATCH v8 11/17] vfio-user: handle PCI config space accesses Jagannathan Raman
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Setup a handler to run vfio-user context. The context is driven by
messages to the file descriptor associated with it - get the fd for
the context and hook up the handler with it

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 qapi/misc.json            | 23 ++++++++++
 hw/remote/vfio-user-obj.c | 95 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index b83cc39029..f3cc4a4854 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -553,3 +553,26 @@
 ##
 { 'event': 'RTC_CHANGE',
   'data': { 'offset': 'int', 'qom-path': 'str' } }
+
+##
+# @VFU_CLIENT_HANGUP:
+#
+# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
+# communication channel
+#
+# @id: ID of the TYPE_VFIO_USER_SERVER object
+#
+# @device: ID of attached PCI device
+#
+# Since: 7.1
+#
+# Example:
+#
+# <- { "event": "VFU_CLIENT_HANGUP",
+#      "data": { "id": "vfu1",
+#                "device": "lsi1" },
+#      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+#
+##
+{ 'event': 'VFU_CLIENT_HANGUP',
+  'data': { 'id': 'str', 'device': 'str' } }
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 15f6fe3a1a..06d99a8698 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -27,6 +27,9 @@
  *
  * device - id of a device on the server, a required option. PCI devices
  *          alone are supported presently.
+ *
+ * notes - x-vfio-user-server could block IO and monitor during the
+ *         initialization phase.
  */
 
 #include "qemu/osdep.h"
@@ -41,11 +44,14 @@
 #include "hw/remote/machine.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qapi/qapi-events-misc.h"
 #include "qemu/notify.h"
+#include "qemu/thread.h"
 #include "sysemu/sysemu.h"
 #include "libvfio-user.h"
 #include "hw/qdev-core.h"
 #include "hw/pci/pci.h"
+#include "qemu/timer.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -87,6 +93,8 @@ struct VfuObject {
     PCIDevice *pci_dev;
 
     Error *unplug_blocker;
+
+    int vfu_poll_fd;
 };
 
 static void vfu_object_init_ctx(VfuObject *o, Error **errp);
@@ -165,6 +173,69 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
     vfu_object_init_ctx(o, errp);
 }
 
+static void vfu_object_ctx_run(void *opaque)
+{
+    VfuObject *o = opaque;
+    const char *id = NULL;
+    int ret = -1;
+
+    while (ret != 0) {
+        ret = vfu_run_ctx(o->vfu_ctx);
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            } else if (errno == ENOTCONN) {
+                id = object_get_canonical_path_component(OBJECT(o));
+                qapi_event_send_vfu_client_hangup(id, o->device);
+                qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
+                o->vfu_poll_fd = -1;
+                object_unparent(OBJECT(o));
+                break;
+            } else {
+                VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s",
+                                 o->device, strerror(errno));
+                break;
+            }
+        }
+    }
+}
+
+static void vfu_object_attach_ctx(void *opaque)
+{
+    VfuObject *o = opaque;
+    GPollFD pfds[1];
+    int ret;
+
+    qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
+
+    pfds[0].fd = o->vfu_poll_fd;
+    pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+
+retry_attach:
+    ret = vfu_attach_ctx(o->vfu_ctx);
+    if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
+        /**
+         * vfu_object_attach_ctx can block QEMU's main loop
+         * during attach - the monitor and other IO
+         * could be unresponsive during this time.
+         */
+        (void)qemu_poll_ns(pfds, 1, 500 * (int64_t)SCALE_MS);
+        goto retry_attach;
+    } else if (ret < 0) {
+        VFU_OBJECT_ERROR(o, "vfu: Failed to attach device %s to context - %s",
+                         o->device, strerror(errno));
+        return;
+    }
+
+    o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx);
+    if (o->vfu_poll_fd < 0) {
+        VFU_OBJECT_ERROR(o, "vfu: Failed to get poll fd %s", o->device);
+        return;
+    }
+
+    qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o);
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -203,7 +274,8 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         return;
     }
 
-    o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0,
+    o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path,
+                                LIBVFIO_USER_FLAG_ATTACH_NB,
                                 o, VFU_DEV_TYPE_PCI);
     if (o->vfu_ctx == NULL) {
         error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
@@ -242,6 +314,21 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
                TYPE_VFU_OBJECT, o->device);
     qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
 
+    ret = vfu_realize_ctx(o->vfu_ctx);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to realize device %s- %s",
+                   o->device, strerror(errno));
+        goto fail;
+    }
+
+    o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx);
+    if (o->vfu_poll_fd < 0) {
+        error_setg(errp, "vfu: Failed to get poll fd %s", o->device);
+        goto fail;
+    }
+
+    qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_attach_ctx, NULL, o);
+
     return;
 
 fail:
@@ -276,6 +363,7 @@ static void vfu_object_init(Object *obj)
         qemu_add_machine_init_done_notifier(&o->machine_done);
     }
 
+    o->vfu_poll_fd = -1;
 }
 
 static void vfu_object_finalize(Object *obj)
@@ -289,6 +377,11 @@ static void vfu_object_finalize(Object *obj)
 
     o->socket = NULL;
 
+    if (o->vfu_poll_fd != -1) {
+        qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
+        o->vfu_poll_fd = -1;
+    }
+
     if (o->vfu_ctx) {
         vfu_destroy_ctx(o->vfu_ctx);
         o->vfu_ctx = NULL;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 11/17] vfio-user: handle PCI config space accesses
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (9 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 10/17] vfio-user: run vfio-user context Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-19 20:44 ` [PATCH v8 12/17] vfio-user: IOMMU support for remote device Jagannathan Raman
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Define and register handlers for PCI config space accesses

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/remote/vfio-user-obj.c | 51 +++++++++++++++++++++++++++++++++++++++
 hw/remote/trace-events    |  2 ++
 2 files changed, 53 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 06d99a8698..7b863dec4f 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -47,6 +47,7 @@
 #include "qapi/qapi-events-misc.h"
 #include "qemu/notify.h"
 #include "qemu/thread.h"
+#include "qemu/main-loop.h"
 #include "sysemu/sysemu.h"
 #include "libvfio-user.h"
 #include "hw/qdev-core.h"
@@ -236,6 +237,45 @@ retry_attach:
     qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o);
 }
 
+static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
+                                     size_t count, loff_t offset,
+                                     const bool is_write)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    uint32_t pci_access_width = sizeof(uint32_t);
+    size_t bytes = count;
+    uint32_t val = 0;
+    char *ptr = buf;
+    int len;
+
+    /*
+     * Writes to the BAR registers would trigger an update to the
+     * global Memory and IO AddressSpaces. But the remote device
+     * never uses the global AddressSpaces, therefore overlapping
+     * memory regions are not a problem
+     */
+    while (bytes > 0) {
+        len = (bytes > pci_access_width) ? pci_access_width : bytes;
+        if (is_write) {
+            memcpy(&val, ptr, len);
+            pci_host_config_write_common(o->pci_dev, offset,
+                                         pci_config_size(o->pci_dev),
+                                         val, len);
+            trace_vfu_cfg_write(offset, val);
+        } else {
+            val = pci_host_config_read_common(o->pci_dev, offset,
+                                              pci_config_size(o->pci_dev), len);
+            memcpy(ptr, &val, len);
+            trace_vfu_cfg_read(offset, val);
+        }
+        offset += len;
+        ptr += len;
+        bytes -= len;
+    }
+
+    return count;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -314,6 +354,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
                TYPE_VFU_OBJECT, o->device);
     qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
 
+    ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX,
+                           pci_config_size(o->pci_dev), &vfu_object_cfg_access,
+                           VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB,
+                           NULL, 0, -1, 0);
+    if (ret < 0) {
+        error_setg(errp,
+                   "vfu: Failed to setup config space handlers for %s- %s",
+                   o->device, strerror(errno));
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 7da12f0d96..2ef7884346 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,
 
 # vfio-user-obj.c
 vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
+vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
+vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (10 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 11/17] vfio-user: handle PCI config space accesses Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-20 11:15   ` Jag Raman
  2022-04-25  9:31   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 13/17] vfio-user: handle DMA mappings Jagannathan Raman
                   ` (5 subsequent siblings)
  17 siblings, 2 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Assign separate address space for each device in the remote processes.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/hw/remote/iommu.h |  40 +++++++++++++
 hw/remote/iommu.c         | 114 ++++++++++++++++++++++++++++++++++++++
 hw/remote/machine.c       |  13 ++++-
 MAINTAINERS               |   2 +
 hw/remote/meson.build     |   1 +
 5 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/remote/iommu.h
 create mode 100644 hw/remote/iommu.c

diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
new file mode 100644
index 0000000000..33b68a8f4b
--- /dev/null
+++ b/include/hw/remote/iommu.h
@@ -0,0 +1,40 @@
+/**
+ * Copyright © 2022 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef REMOTE_IOMMU_H
+#define REMOTE_IOMMU_H
+
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci.h"
+
+#ifndef INT2VOIDP
+#define INT2VOIDP(i) (void *)(uintptr_t)(i)
+#endif
+
+typedef struct RemoteIommuElem {
+    MemoryRegion *mr;
+
+    AddressSpace as;
+} RemoteIommuElem;
+
+#define TYPE_REMOTE_IOMMU "x-remote-iommu"
+OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
+
+struct RemoteIommu {
+    Object parent;
+
+    GHashTable *elem_by_devfn;
+
+    QemuMutex lock;
+};
+
+void remote_iommu_setup(PCIBus *pci_bus);
+
+void remote_iommu_unplug_dev(PCIDevice *pci_dev);
+
+#endif
diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
new file mode 100644
index 0000000000..16c6b0834e
--- /dev/null
+++ b/hw/remote/iommu.c
@@ -0,0 +1,114 @@
+/**
+ * IOMMU for remote device
+ *
+ * Copyright © 2022 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "hw/remote/iommu.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+
+static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
+                                              void *opaque, int devfn)
+{
+    RemoteIommu *iommu = opaque;
+    RemoteIommuElem *elem = NULL;
+
+    qemu_mutex_lock(&iommu->lock);
+
+    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
+
+    if (!elem) {
+        elem = g_malloc0(sizeof(RemoteIommuElem));
+        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
+    }
+
+    if (!elem->mr) {
+        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
+        memory_region_set_size(elem->mr, UINT64_MAX);
+        address_space_init(&elem->as, elem->mr, NULL);
+    }
+
+    qemu_mutex_unlock(&iommu->lock);
+
+    return &elem->as;
+}
+
+void remote_iommu_unplug_dev(PCIDevice *pci_dev)
+{
+    AddressSpace *as = pci_device_iommu_address_space(pci_dev);
+    RemoteIommuElem *elem = NULL;
+
+    if (as == &address_space_memory) {
+        return;
+    }
+
+    elem = container_of(as, RemoteIommuElem, as);
+
+    address_space_destroy(&elem->as);
+
+    object_unref(elem->mr);
+
+    elem->mr = NULL;
+}
+
+static void remote_iommu_init(Object *obj)
+{
+    RemoteIommu *iommu = REMOTE_IOMMU(obj);
+
+    iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free);
+
+    qemu_mutex_init(&iommu->lock);
+}
+
+static void remote_iommu_finalize(Object *obj)
+{
+    RemoteIommu *iommu = REMOTE_IOMMU(obj);
+
+    qemu_mutex_destroy(&iommu->lock);
+
+    if (iommu->elem_by_devfn) {
+        g_hash_table_destroy(iommu->elem_by_devfn);
+        iommu->elem_by_devfn = NULL;
+    }
+}
+
+void remote_iommu_setup(PCIBus *pci_bus)
+{
+    RemoteIommu *iommu = NULL;
+
+    g_assert(pci_bus);
+
+    iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU));
+
+    pci_setup_iommu(pci_bus, remote_iommu_find_add_as, iommu);
+
+    object_property_add_child(OBJECT(pci_bus), "remote-iommu", OBJECT(iommu));
+
+    object_unref(OBJECT(iommu));
+}
+
+static const TypeInfo remote_iommu_info = {
+    .name = TYPE_REMOTE_IOMMU,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(RemoteIommu),
+    .instance_init = remote_iommu_init,
+    .instance_finalize = remote_iommu_finalize,
+};
+
+static void remote_iommu_register_types(void)
+{
+    type_register_static(&remote_iommu_info);
+}
+
+type_init(remote_iommu_register_types)
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index ed91659794..cca5d25f50 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -21,6 +21,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/remote/iommu.h"
 #include "hw/qdev-core.h"
 
 static void remote_machine_init(MachineState *machine)
@@ -100,6 +101,16 @@ static void remote_machine_instance_init(Object *obj)
     s->auto_shutdown = true;
 }
 
+static void remote_machine_dev_unplug_cb(HotplugHandler *hotplug_dev,
+                                         DeviceState *dev, Error **errp)
+{
+    qdev_unrealize(dev);
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
+        remote_iommu_unplug_dev(PCI_DEVICE(dev));
+    }
+}
+
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -108,7 +119,7 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
 
-    hc->unplug = qdev_simple_device_unplug_cb;
+    hc->unplug = remote_machine_dev_unplug_cb;
 
     object_class_property_add_bool(oc, "vfio-user",
                                    remote_machine_get_vfio_user,
diff --git a/MAINTAINERS b/MAINTAINERS
index 37afdecc61..3e62ccab0a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3599,6 +3599,8 @@ F: hw/remote/iohub.c
 F: include/hw/remote/iohub.h
 F: subprojects/libvfio-user
 F: hw/remote/vfio-user-obj.c
+F: hw/remote/iommu.c
+F: include/hw/remote/iommu.h
 
 EBPF:
 M: Jason Wang <jasowang@redhat.com>
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
index 534ac5df79..bcef83c8cc 100644
--- a/hw/remote/meson.build
+++ b/hw/remote/meson.build
@@ -6,6 +6,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
+remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iommu.c'))
 remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c'))
 
 remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: vfiouser)
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 13/17] vfio-user: handle DMA mappings
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (11 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 12/17] vfio-user: IOMMU support for remote device Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25  9:56   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 14/17] vfio-user: handle PCI BAR accesses Jagannathan Raman
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Define and register callbacks to manage the RAM regions used for
device DMA

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/remote/machine.c       |  5 ++++
 hw/remote/vfio-user-obj.c | 55 +++++++++++++++++++++++++++++++++++++++
 hw/remote/trace-events    |  2 ++
 3 files changed, 62 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index cca5d25f50..7002d46980 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -23,6 +23,7 @@
 #include "hw/remote/iohub.h"
 #include "hw/remote/iommu.h"
 #include "hw/qdev-core.h"
+#include "hw/remote/iommu.h"
 
 static void remote_machine_init(MachineState *machine)
 {
@@ -52,6 +53,10 @@ static void remote_machine_init(MachineState *machine)
 
     pci_host = PCI_HOST_BRIDGE(rem_host);
 
+    if (s->vfio_user) {
+        remote_iommu_setup(pci_host->bus);
+    }
+
     remote_iohub_init(&s->iohub);
 
     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 7b863dec4f..425e45e8b2 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -276,6 +276,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     return count;
 }
 
+static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    AddressSpace *dma_as = NULL;
+    MemoryRegion *subregion = NULL;
+    g_autofree char *name = NULL;
+    struct iovec *iov = &info->iova;
+
+    if (!info->vaddr) {
+        return;
+    }
+
+    name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
+                           (uint64_t)info->vaddr);
+
+    subregion = g_new0(MemoryRegion, 1);
+
+    memory_region_init_ram_ptr(subregion, NULL, name,
+                               iov->iov_len, info->vaddr);
+
+    dma_as = pci_device_iommu_address_space(o->pci_dev);
+
+    memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion);
+
+    trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len);
+}
+
+static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    AddressSpace *dma_as = NULL;
+    MemoryRegion *mr = NULL;
+    ram_addr_t offset;
+
+    mr = memory_region_from_host(info->vaddr, &offset);
+    if (!mr) {
+        return;
+    }
+
+    dma_as = pci_device_iommu_address_space(o->pci_dev);
+
+    memory_region_del_subregion(dma_as->root, mr);
+
+    object_unparent((OBJECT(mr)));
+
+    trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -365,6 +413,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }
 
+    ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup DMA handlers for %s",
+                   o->device);
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 2ef7884346..f945c7e33b 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,
 vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
 vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
+vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
+vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 14/17] vfio-user: handle PCI BAR accesses
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (12 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 13/17] vfio-user: handle DMA mappings Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25 10:05   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 15/17] vfio-user: handle device interrupts Jagannathan Raman
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Determine the BARs used by the PCI device and register handlers to
manage the access to the same.

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/exec/memory.h           |   3 +
 hw/remote/vfio-user-obj.c       | 189 ++++++++++++++++++++++++++++++++
 softmmu/physmem.c               |   4 +-
 tests/qtest/fuzz/generic_fuzz.c |   9 +-
 hw/remote/trace-events          |   3 +
 5 files changed, 202 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 4d5997e6bb..4b061e62d5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache,
                                             hwaddr addr, const void *buf,
                                             hwaddr len);
 
+int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr);
+bool prepare_mmio_access(MemoryRegion *mr);
+
 static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
 {
     if (is_write) {
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 425e45e8b2..f75197cbe3 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -53,6 +53,7 @@
 #include "hw/qdev-core.h"
 #include "hw/pci/pci.h"
 #include "qemu/timer.h"
+#include "exec/memory.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -324,6 +325,192 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
     trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
 }
 
+static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset,
+                            hwaddr size, const bool is_write)
+{
+    uint8_t *ptr = buf;
+    bool release_lock = false;
+    uint8_t *ram_ptr = NULL;
+    MemTxResult result;
+    int access_size;
+    uint64_t val;
+
+    if (memory_access_is_direct(mr, is_write)) {
+        /**
+         * Some devices expose a PCI expansion ROM, which could be buffer
+         * based as compared to other regions which are primarily based on
+         * MemoryRegionOps. memory_region_find() would already check
+         * for buffer overflow, we don't need to repeat it here.
+         */
+        ram_ptr = memory_region_get_ram_ptr(mr);
+
+        if (is_write) {
+            memcpy((ram_ptr + offset), buf, size);
+        } else {
+            memcpy(buf, (ram_ptr + offset), size);
+        }
+
+        return 0;
+    }
+
+    while (size) {
+        /**
+         * The read/write logic used below is similar to the ones in
+         * flatview_read/write_continue()
+         */
+        release_lock = prepare_mmio_access(mr);
+
+        access_size = memory_access_size(mr, size, offset);
+
+        if (is_write) {
+            val = ldn_he_p(ptr, access_size);
+
+            result = memory_region_dispatch_write(mr, offset, val,
+                                                  size_memop(access_size),
+                                                  MEMTXATTRS_UNSPECIFIED);
+        } else {
+            result = memory_region_dispatch_read(mr, offset, &val,
+                                                 size_memop(access_size),
+                                                 MEMTXATTRS_UNSPECIFIED);
+
+            stn_he_p(ptr, access_size, val);
+        }
+
+        if (release_lock) {
+            qemu_mutex_unlock_iothread();
+            release_lock = false;
+        }
+
+        if (result != MEMTX_OK) {
+            return -1;
+        }
+
+        size -= access_size;
+        ptr += access_size;
+        offset += access_size;
+    }
+
+    return 0;
+}
+
+static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar,
+                                hwaddr bar_offset, char * const buf,
+                                hwaddr len, const bool is_write)
+{
+    MemoryRegionSection section = { 0 };
+    uint8_t *ptr = (uint8_t *)buf;
+    MemoryRegion *section_mr = NULL;
+    uint64_t section_size;
+    hwaddr section_offset;
+    hwaddr size = 0;
+
+    while (len) {
+        section = memory_region_find(pci_dev->io_regions[pci_bar].memory,
+                                     bar_offset, len);
+
+        if (!section.mr) {
+            warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset);
+            return size;
+        }
+
+        section_mr = section.mr;
+        section_offset = section.offset_within_region;
+        section_size = int128_get64(section.size);
+
+        if (is_write && section_mr->readonly) {
+            warn_report("vfu: attempting to write to readonly region in "
+                        "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]",
+                        pci_bar, bar_offset,
+                        (bar_offset + section_size));
+            memory_region_unref(section_mr);
+            return size;
+        }
+
+        if (vfu_object_mr_rw(section_mr, ptr, section_offset,
+                             section_size, is_write)) {
+            warn_report("vfu: failed to %s "
+                        "[0x%"PRIx64" - 0x%"PRIx64"] in bar %d",
+                        is_write ? "write to" : "read from", bar_offset,
+                        (bar_offset + section_size), pci_bar);
+            return size;
+        }
+
+        size += section_size;
+        bar_offset += section_size;
+        ptr += section_size;
+        len -= section_size;
+
+        memory_region_unref(section_mr);
+    }
+
+    return size;
+}
+
+/**
+ * VFU_OBJECT_BAR_HANDLER - macro for defining handlers for PCI BARs.
+ *
+ * To create handler for BAR number 2, VFU_OBJECT_BAR_HANDLER(2) would
+ * define vfu_object_bar2_handler
+ */
+#define VFU_OBJECT_BAR_HANDLER(BAR_NO)                                         \
+    static ssize_t vfu_object_bar##BAR_NO##_handler(vfu_ctx_t *vfu_ctx,        \
+                                        char * const buf, size_t count,        \
+                                        loff_t offset, const bool is_write)    \
+    {                                                                          \
+        VfuObject *o = vfu_get_private(vfu_ctx);                               \
+        PCIDevice *pci_dev = o->pci_dev;                                       \
+                                                                               \
+        return vfu_object_bar_rw(pci_dev, BAR_NO, offset,                      \
+                                 buf, count, is_write);                        \
+    }                                                                          \
+
+VFU_OBJECT_BAR_HANDLER(0)
+VFU_OBJECT_BAR_HANDLER(1)
+VFU_OBJECT_BAR_HANDLER(2)
+VFU_OBJECT_BAR_HANDLER(3)
+VFU_OBJECT_BAR_HANDLER(4)
+VFU_OBJECT_BAR_HANDLER(5)
+VFU_OBJECT_BAR_HANDLER(6)
+
+static vfu_region_access_cb_t *vfu_object_bar_handlers[PCI_NUM_REGIONS] = {
+    &vfu_object_bar0_handler,
+    &vfu_object_bar1_handler,
+    &vfu_object_bar2_handler,
+    &vfu_object_bar3_handler,
+    &vfu_object_bar4_handler,
+    &vfu_object_bar5_handler,
+    &vfu_object_bar6_handler,
+};
+
+/**
+ * vfu_object_register_bars - Identify active BAR regions of pdev and setup
+ *                            callbacks to handle read/write accesses
+ */
+static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
+{
+    int flags = VFU_REGION_FLAG_RW;
+    int i;
+
+    for (i = 0; i < PCI_NUM_REGIONS; i++) {
+        if (!pdev->io_regions[i].size) {
+            continue;
+        }
+
+        if ((i == VFU_PCI_DEV_ROM_REGION_IDX) ||
+            pdev->io_regions[i].memory->readonly) {
+            flags &= ~VFU_REGION_FLAG_WRITE;
+        }
+
+        vfu_setup_region(vfu_ctx, VFU_PCI_DEV_BAR0_REGION_IDX + i,
+                         (size_t)pdev->io_regions[i].size,
+                         vfu_object_bar_handlers[i],
+                         flags, NULL, 0, -1, 0);
+
+        trace_vfu_bar_register(i, pdev->io_regions[i].addr,
+                               pdev->io_regions[i].size);
+    }
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -420,6 +607,8 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }
 
+    vfu_object_register_bars(o->vfu_ctx, o->pci_dev);
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 4e1b27a20e..f9a68d1fe5 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2719,7 +2719,7 @@ void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size)
     invalidate_and_set_dirty(mr, addr, size);
 }
 
-static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
+int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
 {
     unsigned access_size_max = mr->ops->valid.max_access_size;
 
@@ -2746,7 +2746,7 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
     return l;
 }
 
-static bool prepare_mmio_access(MemoryRegion *mr)
+bool prepare_mmio_access(MemoryRegion *mr)
 {
     bool release_lock = false;
 
diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c
index dd7e25851c..77547fc1d8 100644
--- a/tests/qtest/fuzz/generic_fuzz.c
+++ b/tests/qtest/fuzz/generic_fuzz.c
@@ -144,7 +144,7 @@ static void *pattern_alloc(pattern p, size_t len)
     return buf;
 }
 
-static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
+static int fuzz_memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
 {
     unsigned access_size_max = mr->ops->valid.max_access_size;
 
@@ -242,11 +242,12 @@ void fuzz_dma_read_cb(size_t addr, size_t len, MemoryRegion *mr)
 
         /*
          *  If mr1 isn't RAM, address_space_translate doesn't update l. Use
-         *  memory_access_size to identify the number of bytes that it is safe
-         *  to write without accidentally writing to another MemoryRegion.
+         *  fuzz_memory_access_size to identify the number of bytes that it
+         *  is safe to write without accidentally writing to another
+         *  MemoryRegion.
          */
         if (!memory_region_is_ram(mr1)) {
-            l = memory_access_size(mr1, l, addr1);
+            l = fuzz_memory_access_size(mr1, l, addr1);
         }
         if (memory_region_is_ram(mr1) ||
             memory_region_is_romd(mr1) ||
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index f945c7e33b..847d50d88f 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -9,3 +9,6 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
 vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
 vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
+vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
+vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
+vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 15/17] vfio-user: handle device interrupts
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (13 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 14/17] vfio-user: handle PCI BAR accesses Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25 10:27   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 16/17] vfio-user: handle reset of remote device Jagannathan Raman
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Forward remote device's interrupts to the guest

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 include/hw/pci/pci.h              |  13 ++++
 include/hw/remote/vfio-user-obj.h |   6 ++
 hw/pci/msi.c                      |  16 ++--
 hw/pci/msix.c                     |  10 ++-
 hw/pci/pci.c                      |  13 ++++
 hw/remote/machine.c               |  14 +++-
 hw/remote/vfio-user-obj.c         | 123 ++++++++++++++++++++++++++++++
 stubs/vfio-user-obj.c             |   6 ++
 MAINTAINERS                       |   1 +
 hw/remote/trace-events            |   1 +
 stubs/meson.build                 |   1 +
 11 files changed, 193 insertions(+), 11 deletions(-)
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 stubs/vfio-user-obj.c

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 3a32b8dd40..7595c05c98 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -16,6 +16,7 @@ extern bool pci_available;
 #define PCI_SLOT(devfn)         (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn)         ((devfn) & 0x07)
 #define PCI_BUILD_BDF(bus, devfn)     ((bus << 8) | (devfn))
+#define PCI_BDF_TO_DEVFN(x)     ((x) & 0xff)
 #define PCI_BUS_MAX             256
 #define PCI_DEVFN_MAX           256
 #define PCI_SLOT_MAX            32
@@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num,
                                 pcibus_t addr, pcibus_t size, int type);
 typedef void PCIUnregisterFunc(PCIDevice *pci_dev);
 
+typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg);
+typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector);
+typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector);
+
 typedef struct PCIIORegion {
     pcibus_t addr; /* current PCI mapping address. -1 means not mapped */
 #define PCI_BAR_UNMAPPED (~(pcibus_t)0)
@@ -321,6 +326,14 @@ struct PCIDevice {
     /* Space to store MSIX table & pending bit array */
     uint8_t *msix_table;
     uint8_t *msix_pba;
+
+    /* May be used by INTx or MSI during interrupt notification */
+    void *irq_opaque;
+
+    MSITriggerFunc *msi_trigger;
+    MSIPrepareMessageFunc *msi_prepare_message;
+    MSIxPrepareMessageFunc *msix_prepare_message;
+
     /* MemoryRegion container for msix exclusive BAR setup */
     MemoryRegion msix_exclusive_bar;
     /* Memory Regions for MSIX table and pending bit entries. */
diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h
new file mode 100644
index 0000000000..87ab78b875
--- /dev/null
+++ b/include/hw/remote/vfio-user-obj.h
@@ -0,0 +1,6 @@
+#ifndef VFIO_USER_OBJ_H
+#define VFIO_USER_OBJ_H
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus);
+
+#endif
diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index 47d2b0f33c..d556e17a09 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg)
     pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
 }
 
-MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
+static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector)
 {
     uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
     bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
@@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
     return msg;
 }
 
+MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector)
+{
+    return dev->msi_prepare_message(dev, vector);
+}
+
 bool msi_enabled(const PCIDevice *dev)
 {
     return msi_present(dev) &&
@@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
                      0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors));
     }
 
+    dev->msi_prepare_message = msi_prepare_message;
+
     return 0;
 }
 
@@ -256,6 +263,7 @@ void msi_uninit(struct PCIDevice *dev)
     cap_size = msi_cap_sizeof(flags);
     pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size);
     dev->cap_present &= ~QEMU_PCI_CAP_MSI;
+    dev->msi_prepare_message = NULL;
 
     MSI_DEV_PRINTF(dev, "uninit\n");
 }
@@ -334,11 +342,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector)
 
 void msi_send_message(PCIDevice *dev, MSIMessage msg)
 {
-    MemTxAttrs attrs = {};
-
-    attrs.requester_id = pci_requester_id(dev);
-    address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
-                         attrs, NULL);
+    dev->msi_trigger(dev, msg);
 }
 
 /* Normally called by pci_default_write_config(). */
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index ae9331cd0b..6f85192d6f 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -31,7 +31,7 @@
 #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8)
 #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8)
 
-MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
+static MSIMessage msix_prepare_message(PCIDevice *dev, unsigned vector)
 {
     uint8_t *table_entry = dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE;
     MSIMessage msg;
@@ -41,6 +41,11 @@ MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
     return msg;
 }
 
+MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
+{
+    return dev->msix_prepare_message(dev, vector);
+}
+
 /*
  * Special API for POWER to configure the vectors through
  * a side channel. Should never be used by devices.
@@ -344,6 +349,8 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
                           "msix-pba", pba_size);
     memory_region_add_subregion(pba_bar, pba_offset, &dev->msix_pba_mmio);
 
+    dev->msix_prepare_message = msix_prepare_message;
+
     return 0;
 }
 
@@ -429,6 +436,7 @@ void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
     g_free(dev->msix_entry_used);
     dev->msix_entry_used = NULL;
     dev->cap_present &= ~QEMU_PCI_CAP_MSIX;
+    dev->msix_prepare_message = NULL;
 }
 
 void msix_uninit_exclusive_bar(PCIDevice *dev)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index dae9119bfe..48d4bc5f69 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -307,6 +307,15 @@ void pci_device_deassert_intx(PCIDevice *dev)
     }
 }
 
+static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
+{
+    MemTxAttrs attrs = {};
+
+    attrs.requester_id = pci_requester_id(dev);
+    address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
+                         attrs, NULL);
+}
+
 static void pci_reset_regions(PCIDevice *dev)
 {
     int r;
@@ -1202,6 +1211,8 @@ static void pci_qdev_unrealize(DeviceState *dev)
 
     pci_device_deassert_intx(pci_dev);
     do_pci_unregister_device(pci_dev);
+
+    pci_dev->msi_trigger = NULL;
 }
 
 void pci_register_bar(PCIDevice *pci_dev, int region_num,
@@ -2236,6 +2247,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
     }
 
     pci_set_power(pci_dev, true);
+
+    pci_dev->msi_trigger = pci_msi_trigger;
 }
 
 PCIDevice *pci_new_multifunction(int devfn, bool multifunction,
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 7002d46980..37538edb2f 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -24,6 +24,8 @@
 #include "hw/remote/iommu.h"
 #include "hw/qdev-core.h"
 #include "hw/remote/iommu.h"
+#include "hw/remote/vfio-user-obj.h"
+#include "hw/pci/msi.h"
 
 static void remote_machine_init(MachineState *machine)
 {
@@ -55,12 +57,16 @@ static void remote_machine_init(MachineState *machine)
 
     if (s->vfio_user) {
         remote_iommu_setup(pci_host->bus);
-    }
 
-    remote_iohub_init(&s->iohub);
+        msi_nonbroken = true;
+
+        vfu_object_set_bus_irq(pci_host->bus);
+    } else {
+        remote_iohub_init(&s->iohub);
 
-    pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
-                 &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+        pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
+                     &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+    }
 
     qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index f75197cbe3..70b4d8b9ce 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -54,6 +54,9 @@
 #include "hw/pci/pci.h"
 #include "qemu/timer.h"
 #include "exec/memory.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/remote/vfio-user-obj.h"
 
 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -97,6 +100,10 @@ struct VfuObject {
     Error *unplug_blocker;
 
     int vfu_poll_fd;
+
+    MSITriggerFunc *default_msi_trigger;
+    MSIPrepareMessageFunc *default_msi_prepare_message;
+    MSIxPrepareMessageFunc *default_msix_prepare_message;
 };
 
 static void vfu_object_init_ctx(VfuObject *o, Error **errp);
@@ -511,6 +518,111 @@ static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev)
     }
 }
 
+static int vfu_object_map_irq(PCIDevice *pci_dev, int intx)
+{
+    int pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
+                                pci_dev->devfn);
+
+    return pci_bdf;
+}
+
+static void vfu_object_set_irq(void *opaque, int pirq, int level)
+{
+    PCIBus *pci_bus = opaque;
+    PCIDevice *pci_dev = NULL;
+    vfu_ctx_t *vfu_ctx = NULL;
+    int pci_bus_num, devfn;
+
+    if (level) {
+        pci_bus_num = PCI_BUS_NUM(pirq);
+        devfn = PCI_BDF_TO_DEVFN(pirq);
+
+        /*
+         * pci_find_device() performs at O(1) if the device is attached
+         * to the root PCI bus. Whereas, if the device is attached to a
+         * secondary PCI bus (such as when a root port is involved),
+         * finding the parent PCI bus could take O(n)
+         */
+        pci_dev = pci_find_device(pci_bus, pci_bus_num, devfn);
+
+        vfu_ctx = pci_dev->irq_opaque;
+
+        g_assert(vfu_ctx);
+
+        vfu_irq_trigger(vfu_ctx, 0);
+    }
+}
+
+static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
+                                             unsigned int vector)
+{
+    MSIMessage msg;
+
+    msg.address = 0;
+    msg.data = vector;
+
+    return msg;
+}
+
+static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
+{
+    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
+
+    vfu_irq_trigger(vfu_ctx, msg.data);
+}
+
+static void vfu_object_setup_msi_cbs(VfuObject *o)
+{
+    o->default_msi_trigger = o->pci_dev->msi_trigger;
+    o->default_msi_prepare_message = o->pci_dev->msi_prepare_message;
+    o->default_msix_prepare_message = o->pci_dev->msix_prepare_message;
+
+    o->pci_dev->msi_trigger = vfu_object_msi_trigger;
+    o->pci_dev->msi_prepare_message = vfu_object_msi_prepare_msg;
+    o->pci_dev->msix_prepare_message = vfu_object_msi_prepare_msg;
+}
+
+static void vfu_object_restore_msi_cbs(VfuObject *o)
+{
+    o->pci_dev->msi_trigger = o->default_msi_trigger;
+    o->pci_dev->msi_prepare_message = o->default_msi_prepare_message;
+    o->pci_dev->msix_prepare_message = o->default_msix_prepare_message;
+}
+
+static int vfu_object_setup_irqs(VfuObject *o, PCIDevice *pci_dev)
+{
+    vfu_ctx_t *vfu_ctx = o->vfu_ctx;
+    int ret;
+
+    ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_INTX_IRQ, 1);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (msix_nr_vectors_allocated(pci_dev)) {
+        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSIX_IRQ,
+                                       msix_nr_vectors_allocated(pci_dev));
+    } else if (msi_nr_vectors_allocated(pci_dev)) {
+        ret = vfu_setup_device_nr_irqs(vfu_ctx, VFU_DEV_MSI_IRQ,
+                                       msi_nr_vectors_allocated(pci_dev));
+    }
+
+    if (ret < 0) {
+        return ret;
+    }
+
+    vfu_object_setup_msi_cbs(o);
+
+    pci_dev->irq_opaque = vfu_ctx;
+
+    return 0;
+}
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
+{
+    pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -609,6 +721,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
 
     vfu_object_register_bars(o->vfu_ctx, o->pci_dev);
 
+    ret = vfu_object_setup_irqs(o, o->pci_dev);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup interrupts for %s",
+                   o->device);
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
@@ -634,6 +753,8 @@ fail:
         o->unplug_blocker = NULL;
     }
     if (o->pci_dev) {
+        vfu_object_restore_msi_cbs(o);
+        o->pci_dev->irq_opaque = NULL;
         object_unref(OBJECT(o->pci_dev));
         o->pci_dev = NULL;
     }
@@ -693,6 +814,8 @@ static void vfu_object_finalize(Object *obj)
     }
 
     if (o->pci_dev) {
+        vfu_object_restore_msi_cbs(o);
+        o->pci_dev->irq_opaque = NULL;
         object_unref(OBJECT(o->pci_dev));
         o->pci_dev = NULL;
     }
diff --git a/stubs/vfio-user-obj.c b/stubs/vfio-user-obj.c
new file mode 100644
index 0000000000..79100d768e
--- /dev/null
+++ b/stubs/vfio-user-obj.c
@@ -0,0 +1,6 @@
+#include "qemu/osdep.h"
+#include "hw/remote/vfio-user-obj.h"
+
+void vfu_object_set_bus_irq(PCIBus *pci_bus)
+{
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index 3e62ccab0a..ad51ec0dc8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3599,6 +3599,7 @@ F: hw/remote/iohub.c
 F: include/hw/remote/iohub.h
 F: subprojects/libvfio-user
 F: hw/remote/vfio-user-obj.c
+F: include/hw/remote/vfio-user-obj.h
 F: hw/remote/iommu.c
 F: include/hw/remote/iommu.h
 
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 847d50d88f..c167b3c7a5 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -12,3 +12,4 @@ vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
 vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
 vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
 vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
+vfu_interrupt(int pirq) "vfu: sending interrupt to device - PIRQ %d"
diff --git a/stubs/meson.build b/stubs/meson.build
index 6f80fec761..d8f3fd5c44 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -60,3 +60,4 @@ if have_system
 else
   stub_ss.add(files('qdev.c'))
 endif
+stub_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_false: files('vfio-user-obj.c'))
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 16/17] vfio-user: handle reset of remote device
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (14 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 15/17] vfio-user: handle device interrupts Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25 10:27   ` Stefan Hajnoczi
  2022-04-19 20:44 ` [PATCH v8 17/17] vfio-user: avocado tests for vfio-user Jagannathan Raman
  2022-04-25 10:32 ` [PATCH v8 00/17] vfio-user server in QEMU Stefan Hajnoczi
  17 siblings, 1 reply; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Adds handler to reset a remote device

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 hw/remote/vfio-user-obj.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 70b4d8b9ce..8ca823aa01 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -623,6 +623,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus)
     pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
 }
 
+static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+
+    /* vfu_object_ctx_run() handles lost connection */
+    if (type == VFU_RESET_LOST_CONN) {
+        return 0;
+    }
+
+    qdev_reset_all(DEVICE(o->pci_dev));
+
+    return 0;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -728,6 +742,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }
 
+    ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup reset callback");
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 17/17] vfio-user: avocado tests for vfio-user
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (15 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 16/17] vfio-user: handle reset of remote device Jagannathan Raman
@ 2022-04-19 20:44 ` Jagannathan Raman
  2022-04-25 10:32 ` [PATCH v8 00/17] vfio-user server in QEMU Stefan Hajnoczi
  17 siblings, 0 replies; 46+ messages in thread
From: Jagannathan Raman @ 2022-04-19 20:44 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, thanos.makatos,
	kanth.ghatraju, stefanha, marcandre.lureau, pbonzini, jag.raman,
	eblake, dgilbert

Avocado tests for libvfio-user in QEMU - tests startup,
hotplug and migration of the server object

Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
---
 MAINTAINERS                |   1 +
 tests/avocado/vfio-user.py | 164 +++++++++++++++++++++++++++++++++++++
 2 files changed, 165 insertions(+)
 create mode 100644 tests/avocado/vfio-user.py

diff --git a/MAINTAINERS b/MAINTAINERS
index ad51ec0dc8..8676f546e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3602,6 +3602,7 @@ F: hw/remote/vfio-user-obj.c
 F: include/hw/remote/vfio-user-obj.h
 F: hw/remote/iommu.c
 F: include/hw/remote/iommu.h
+F: tests/avocado/vfio-user.py
 
 EBPF:
 M: Jason Wang <jasowang@redhat.com>
diff --git a/tests/avocado/vfio-user.py b/tests/avocado/vfio-user.py
new file mode 100644
index 0000000000..ced304d770
--- /dev/null
+++ b/tests/avocado/vfio-user.py
@@ -0,0 +1,164 @@
+# vfio-user protocol sanity test
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+
+import os
+import socket
+import uuid
+
+from avocado_qemu import QemuSystemTest
+from avocado_qemu import wait_for_console_pattern
+from avocado_qemu import exec_command
+from avocado_qemu import exec_command_and_wait_for_pattern
+
+from avocado.utils import network
+from avocado.utils import wait
+
+class VfioUser(QemuSystemTest):
+    """
+    :avocado: tags=vfiouser
+    """
+    KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
+    timeout = 20
+
+    def _get_free_port(self):
+        port = network.find_free_port()
+        if port is None:
+            self.cancel('Failed to find a free port')
+        return port
+
+    def validate_vm_launch(self, vm):
+        wait_for_console_pattern(self, 'as init process',
+                                 'Kernel panic - not syncing', vm=vm)
+        exec_command(self, 'mount -t sysfs sysfs /sys', vm=vm)
+        exec_command_and_wait_for_pattern(self,
+                                          'cat /sys/bus/pci/devices/*/uevent',
+                                          'PCI_ID=1000:0060', vm=vm)
+
+    def launch_server_startup(self, socket, *opts):
+        server_vm = self.get_vm()
+        server_vm.add_args('-machine', 'x-remote,vfio-user=on')
+        server_vm.add_args('-nodefaults')
+        server_vm.add_args('-device', 'megasas,id=sas1')
+        server_vm.add_args('-object', 'x-vfio-user-server,id=vfioobj1,'
+                           'type=unix,path='+socket+',device=sas1')
+        for opt in opts:
+            server_vm.add_args(opt)
+        server_vm.launch()
+        return server_vm
+
+    def launch_server_hotplug(self, socket):
+        server_vm = self.get_vm()
+        server_vm.add_args('-machine', 'x-remote,vfio-user=on')
+        server_vm.add_args('-nodefaults')
+        server_vm.launch()
+        server_vm.qmp('device_add', args_dict=None, conv_keys=None,
+                      driver='megasas', id='sas1')
+        obj_add_opts = {'qom-type': 'x-vfio-user-server',
+                        'id': 'vfioobj', 'device': 'sas1',
+                        'socket': {'type': 'unix', 'path': socket}}
+        server_vm.qmp('object-add', args_dict=obj_add_opts)
+        return server_vm
+
+    def launch_client(self, kernel_path, initrd_path, kernel_command_line,
+                      machine_type, socket, *opts):
+        client_vm = self.get_vm()
+        client_vm.set_console()
+        client_vm.add_args('-machine', machine_type)
+        client_vm.add_args('-accel', 'kvm')
+        client_vm.add_args('-cpu', 'host')
+        client_vm.add_args('-object',
+                           'memory-backend-memfd,id=sysmem-file,size=2G')
+        client_vm.add_args('--numa', 'node,memdev=sysmem-file')
+        client_vm.add_args('-m', '2048')
+        client_vm.add_args('-kernel', kernel_path,
+                           '-initrd', initrd_path,
+                           '-append', kernel_command_line)
+        client_vm.add_args('-device',
+                           'vfio-user-pci,socket='+socket)
+        for opt in opts:
+            client_vm.add_args(opt)
+        client_vm.launch()
+        return client_vm
+
+    def do_test_startup(self, kernel_url, initrd_url, kernel_command_line,
+                machine_type):
+        self.require_accelerator('kvm')
+
+        kernel_path = self.fetch_asset(kernel_url)
+        initrd_path = self.fetch_asset(initrd_url)
+        socket = os.path.join('/tmp', str(uuid.uuid4()))
+        if os.path.exists(socket):
+            os.remove(socket)
+        self.launch_server_startup(socket)
+        client = self.launch_client(kernel_path, initrd_path,
+                                    kernel_command_line, machine_type, socket)
+        self.validate_vm_launch(client)
+
+    def do_test_hotplug(self, kernel_url, initrd_url, kernel_command_line,
+                machine_type):
+        self.require_accelerator('kvm')
+
+        kernel_path = self.fetch_asset(kernel_url)
+        initrd_path = self.fetch_asset(initrd_url)
+        socket = os.path.join('/tmp', str(uuid.uuid4()))
+        if os.path.exists(socket):
+            os.remove(socket)
+        self.launch_server_hotplug(socket)
+        client = self.launch_client(kernel_path, initrd_path,
+                                    kernel_command_line, machine_type, socket)
+        self.validate_vm_launch(client)
+
+    def test_vfio_user_x86_64(self):
+        """
+        :avocado: tags=arch:x86_64
+        :avocado: tags=distro:centos
+        """
+        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/x86_64/os/images'
+                      '/pxeboot/vmlinuz')
+        initrd_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/x86_64/os/images'
+                      '/pxeboot/initrd.img')
+        kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
+                               'console=ttyS0 rdinit=/bin/bash')
+        machine_type = 'pc'
+        self.do_test_startup(kernel_url, initrd_url, kernel_command_line,
+                             machine_type)
+
+    def test_vfio_user_aarch64(self):
+        """
+        :avocado: tags=arch:aarch64
+        :avocado: tags=distro:ubuntu
+        """
+        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/aarch64/os/images'
+                      '/pxeboot/vmlinuz')
+        initrd_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/aarch64/os/images'
+                      '/pxeboot/initrd.img')
+        kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
+                               'rdinit=/bin/bash console=ttyAMA0')
+        machine_type = 'virt,gic-version=3'
+        self.do_test_startup(kernel_url, initrd_url, kernel_command_line,
+                             machine_type)
+
+    def test_vfio_user_hotplug_x86_64(self):
+        """
+        :avocado: tags=arch:x86_64
+        :avocado: tags=distro:centos
+        """
+        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/x86_64/os/images'
+                      '/pxeboot/vmlinuz')
+        initrd_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/31/Everything/x86_64/os/images'
+                      '/pxeboot/initrd.img')
+        kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
+                               'console=ttyS0 rdinit=/bin/bash')
+        machine_type = 'pc'
+        self.do_test_hotplug(kernel_url, initrd_url, kernel_command_line,
+                             machine_type)
+
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-19 20:44 ` [PATCH v8 12/17] vfio-user: IOMMU support for remote device Jagannathan Raman
@ 2022-04-20 11:15   ` Jag Raman
  2022-04-25  9:38     ` Stefan Hajnoczi
  2022-04-25  9:31   ` Stefan Hajnoczi
  1 sibling, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-20 11:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: eduardo, Elena Ufimtseva, Thomas Huth, John Johnson,
	Daniel P. Berrangé,
	Beraldo Leal, John Levon, Michael S. Tsirkin, Markus Armbruster,
	Juan Quintela, Philippe Mathieu-Daudé,
	Igor Mammedov, Thanos Makatos, Kanth Ghatraju, Stefan Hajnoczi,
	Marc-André Lureau, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert



> On Apr 19, 2022, at 4:45 PM, Jag Raman <jag.raman@oracle.com> wrote:
> 
> Assign separate address space for each device in the remote processes.
> 
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> ---
> include/hw/remote/iommu.h |  40 +++++++++++++
> hw/remote/iommu.c         | 114 ++++++++++++++++++++++++++++++++++++++
> hw/remote/machine.c       |  13 ++++-
> MAINTAINERS               |   2 +
> hw/remote/meson.build     |   1 +
> 5 files changed, 169 insertions(+), 1 deletion(-)
> create mode 100644 include/hw/remote/iommu.h
> create mode 100644 hw/remote/iommu.c
> 
> diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
> new file mode 100644
> index 0000000000..33b68a8f4b
> --- /dev/null
> +++ b/include/hw/remote/iommu.h
> @@ -0,0 +1,40 @@
> +/**
> + * Copyright © 2022 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef REMOTE_IOMMU_H
> +#define REMOTE_IOMMU_H
> +
> +#include "hw/pci/pci_bus.h"
> +#include "hw/pci/pci.h"
> +
> +#ifndef INT2VOIDP
> +#define INT2VOIDP(i) (void *)(uintptr_t)(i)
> +#endif
> +
> +typedef struct RemoteIommuElem {
> +    MemoryRegion *mr;
> +
> +    AddressSpace as;
> +} RemoteIommuElem;
> +
> +#define TYPE_REMOTE_IOMMU "x-remote-iommu"
> +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
> +
> +struct RemoteIommu {
> +    Object parent;
> +
> +    GHashTable *elem_by_devfn;
> +
> +    QemuMutex lock;
> +};
> +
> +void remote_iommu_setup(PCIBus *pci_bus);
> +
> +void remote_iommu_unplug_dev(PCIDevice *pci_dev);
> +
> +#endif
> diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
> new file mode 100644
> index 0000000000..16c6b0834e
> --- /dev/null
> +++ b/hw/remote/iommu.c
> @@ -0,0 +1,114 @@
> +/**
> + * IOMMU for remote device
> + *
> + * Copyright © 2022 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "hw/remote/iommu.h"
> +#include "hw/pci/pci_bus.h"
> +#include "hw/pci/pci.h"
> +#include "exec/memory.h"
> +#include "exec/address-spaces.h"
> +#include "trace.h"
> +
> +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
> +                                              void *opaque, int devfn)
> +{
> +    RemoteIommu *iommu = opaque;
> +    RemoteIommuElem *elem = NULL;
> +
> +    qemu_mutex_lock(&iommu->lock);
> +
> +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
> +
> +    if (!elem) {
> +        elem = g_malloc0(sizeof(RemoteIommuElem));
> +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
> +    }
> +
> +    if (!elem->mr) {
> +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
> +        memory_region_set_size(elem->mr, UINT64_MAX);
> +        address_space_init(&elem->as, elem->mr, NULL);

Hi,

I’d like to add a note here.

We tried to add "elem->mr” as a child of PCIDevice. That way, when PCIDevice is
unplugged, the child is also finalized.

However, there was some issue with hotplug. During the hotplug, there’s a window
during initialization where we couldn’t lookup the PCIDevice by “devfn”.

do_pci_register_device() -> pci_init_bus_master() -> pci_device_iommu_address_space()
happens before do_pci_register_device() -> “bus->devices[devfn] = pci_dev”. As such,
pci_find_device() doesn’t work at this time.

Thank you!
--
Jag

> +    }
> +
> +    qemu_mutex_unlock(&iommu->lock);
> +
> +    return &elem->as;
> +}
> +
> +void remote_iommu_unplug_dev(PCIDevice *pci_dev)
> +{
> +    AddressSpace *as = pci_device_iommu_address_space(pci_dev);
> +    RemoteIommuElem *elem = NULL;
> +
> +    if (as == &address_space_memory) {
> +        return;
> +    }
> +
> +    elem = container_of(as, RemoteIommuElem, as);
> +
> +    address_space_destroy(&elem->as);
> +
> +    object_unref(elem->mr);
> +
> +    elem->mr = NULL;
> +}
> +
> +static void remote_iommu_init(Object *obj)
> +{
> +    RemoteIommu *iommu = REMOTE_IOMMU(obj);
> +
> +    iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free);
> +
> +    qemu_mutex_init(&iommu->lock);
> +}
> +
> +static void remote_iommu_finalize(Object *obj)
> +{
> +    RemoteIommu *iommu = REMOTE_IOMMU(obj);
> +
> +    qemu_mutex_destroy(&iommu->lock);
> +
> +    if (iommu->elem_by_devfn) {
> +        g_hash_table_destroy(iommu->elem_by_devfn);
> +        iommu->elem_by_devfn = NULL;
> +    }
> +}
> +
> +void remote_iommu_setup(PCIBus *pci_bus)
> +{
> +    RemoteIommu *iommu = NULL;
> +
> +    g_assert(pci_bus);
> +
> +    iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU));
> +
> +    pci_setup_iommu(pci_bus, remote_iommu_find_add_as, iommu);
> +
> +    object_property_add_child(OBJECT(pci_bus), "remote-iommu", OBJECT(iommu));
> +
> +    object_unref(OBJECT(iommu));
> +}
> +
> +static const TypeInfo remote_iommu_info = {
> +    .name = TYPE_REMOTE_IOMMU,
> +    .parent = TYPE_OBJECT,
> +    .instance_size = sizeof(RemoteIommu),
> +    .instance_init = remote_iommu_init,
> +    .instance_finalize = remote_iommu_finalize,
> +};
> +
> +static void remote_iommu_register_types(void)
> +{
> +    type_register_static(&remote_iommu_info);
> +}
> +
> +type_init(remote_iommu_register_types)
> diff --git a/hw/remote/machine.c b/hw/remote/machine.c
> index ed91659794..cca5d25f50 100644
> --- a/hw/remote/machine.c
> +++ b/hw/remote/machine.c
> @@ -21,6 +21,7 @@
> #include "qapi/error.h"
> #include "hw/pci/pci_host.h"
> #include "hw/remote/iohub.h"
> +#include "hw/remote/iommu.h"
> #include "hw/qdev-core.h"
> 
> static void remote_machine_init(MachineState *machine)
> @@ -100,6 +101,16 @@ static void remote_machine_instance_init(Object *obj)
>     s->auto_shutdown = true;
> }
> 
> +static void remote_machine_dev_unplug_cb(HotplugHandler *hotplug_dev,
> +                                         DeviceState *dev, Error **errp)
> +{
> +    qdev_unrealize(dev);
> +
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
> +        remote_iommu_unplug_dev(PCI_DEVICE(dev));
> +    }
> +}
> +
> static void remote_machine_class_init(ObjectClass *oc, void *data)
> {
>     MachineClass *mc = MACHINE_CLASS(oc);
> @@ -108,7 +119,7 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
>     mc->init = remote_machine_init;
>     mc->desc = "Experimental remote machine";
> 
> -    hc->unplug = qdev_simple_device_unplug_cb;
> +    hc->unplug = remote_machine_dev_unplug_cb;
> 
>     object_class_property_add_bool(oc, "vfio-user",
>                                    remote_machine_get_vfio_user,
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 37afdecc61..3e62ccab0a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3599,6 +3599,8 @@ F: hw/remote/iohub.c
> F: include/hw/remote/iohub.h
> F: subprojects/libvfio-user
> F: hw/remote/vfio-user-obj.c
> +F: hw/remote/iommu.c
> +F: include/hw/remote/iommu.h
> 
> EBPF:
> M: Jason Wang <jasowang@redhat.com>
> diff --git a/hw/remote/meson.build b/hw/remote/meson.build
> index 534ac5df79..bcef83c8cc 100644
> --- a/hw/remote/meson.build
> +++ b/hw/remote/meson.build
> @@ -6,6 +6,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
> remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
> remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
> remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
> +remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iommu.c'))
> remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c'))
> 
> remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: vfiouser)
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 02/17] qdev: unplug blocker for devices
  2022-04-19 20:44 ` [PATCH v8 02/17] qdev: unplug blocker for devices Jagannathan Raman
@ 2022-04-21 14:55   ` Markus Armbruster
  2022-04-21 17:49     ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2022-04-21 14:55 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, dgilbert, stefanha,
	marcandre.lureau, pbonzini, eblake

Jagannathan Raman <jag.raman@oracle.com> writes:

> Add blocker to prevent hot-unplug of devices

Why do you need this?  I'm not doubting you do, I just want to read your
reasons here :)

>
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> ---
>  include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
>  hw/core/qdev.c         | 24 ++++++++++++++++++++++++
>  softmmu/qdev-monitor.c |  4 ++++
>  3 files changed, 57 insertions(+)
>
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 92c3d65208..1b9fa25e5c 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -193,6 +193,7 @@ struct DeviceState {
>      int instance_id_alias;
>      int alias_required_for_version;
>      ResettableState reset;
> +    GSList *unplug_blockers;
>  };
>  
>  struct DeviceListener {
> @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
>  void qdev_machine_creation_done(void);
>  bool qdev_machine_modified(void);
>  
> +/*
> + * qdev_add_unplug_blocker: Adds an unplug blocker to a device
> + *
> + * @dev: Device to be blocked from unplug
> + * @reason: Reason for blocking
> + */
> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
> +
> +/*
> + * qdev_del_unplug_blocker: Removes an unplug blocker from a device
> + *
> + * @dev: Device to be unblocked
> + * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
> + *          Used as a handle to lookup the blocker for deletion.
> + */
> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
> +
> +/*
> + * qdev_unplug_blocked: Confirms if a device is blocked from unplug
> + *
> + * @dev: Device to be tested
> + * @reason: Returns one of the reasons why the device is blocked,
> + *          if any
> + *
> + * Returns: true if device is blocked from unplug, false otherwise
> + */
> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
> +
>  /**
>   * GpioPolarity: Polarity of a GPIO line
>   *
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 84f3019440..0806d8fcaa 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
>      return NULL;
>  }
>  
> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
> +{
> +    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
> +}
> +
> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
> +{
> +    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
> +}
> +
> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
> +{
> +    ERRP_GUARD();
> +
> +    if (dev->unplug_blockers) {
> +        error_propagate(errp, error_copy(dev->unplug_blockers->data));
> +        return true;
> +    }
> +
> +    return false;
> +}

This cites the most recently added blocker as reason.  Your function
comment covers it: "Returns one of the reasons".  Okay.

> +
>  static bool device_get_realized(Object *obj, Error **errp)
>  {
>      DeviceState *dev = DEVICE(obj);
> @@ -704,6 +726,8 @@ static void device_finalize(Object *obj)
>  
>      DeviceState *dev = DEVICE(obj);
>  
> +    g_assert(!dev->unplug_blockers);
> +
>      QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>          QLIST_REMOVE(ngl, node);
>          qemu_free_irqs(ngl->in, ngl->num_in);
> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> index 12fe60c467..9cfd59d17c 100644
> --- a/softmmu/qdev-monitor.c
> +++ b/softmmu/qdev-monitor.c
> @@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp)
>      HotplugHandlerClass *hdc;
>      Error *local_err = NULL;
>  
> +    if (qdev_unplug_blocked(dev, errp)) {
> +        return;
> +    }
> +
>      if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
>          error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
>          return;



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 10/17] vfio-user: run vfio-user context
  2022-04-19 20:44 ` [PATCH v8 10/17] vfio-user: run vfio-user context Jagannathan Raman
@ 2022-04-21 14:59   ` Markus Armbruster
  2022-04-21 17:52     ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2022-04-21 14:59 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, dgilbert, stefanha,
	marcandre.lureau, pbonzini, eblake

Jagannathan Raman <jag.raman@oracle.com> writes:

> Setup a handler to run vfio-user context. The context is driven by
> messages to the file descriptor associated with it - get the fd for
> the context and hook up the handler with it
>
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  qapi/misc.json            | 23 ++++++++++
>  hw/remote/vfio-user-obj.c | 95 ++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 117 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/misc.json b/qapi/misc.json
> index b83cc39029..f3cc4a4854 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -553,3 +553,26 @@
>  ##
>  { 'event': 'RTC_CHANGE',
>    'data': { 'offset': 'int', 'qom-path': 'str' } }
> +
> +##
> +# @VFU_CLIENT_HANGUP:
> +#
> +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
> +# communication channel
> +#
> +# @id: ID of the TYPE_VFIO_USER_SERVER object
> +#
> +# @device: ID of attached PCI device

Is this the ID set with -device id=... and such?

> +#
> +# Since: 7.1
> +#
> +# Example:
> +#
> +# <- { "event": "VFU_CLIENT_HANGUP",
> +#      "data": { "id": "vfu1",
> +#                "device": "lsi1" },
> +#      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> +#
> +##
> +{ 'event': 'VFU_CLIENT_HANGUP',
> +  'data': { 'id': 'str', 'device': 'str' } }



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 02/17] qdev: unplug blocker for devices
  2022-04-21 14:55   ` Markus Armbruster
@ 2022-04-21 17:49     ` Jag Raman
  2022-04-22  5:18       ` Markus Armbruster
  0 siblings, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-21 17:49 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, mst, quintela, f4bug, qemu-devel, thanos.makatos,
	Kanth Ghatraju, dgilbert, Stefan Hajnoczi, marcandre.lureau,
	pbonzini, eblake



> On Apr 21, 2022, at 10:55 AM, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Jagannathan Raman <jag.raman@oracle.com> writes:
> 
>> Add blocker to prevent hot-unplug of devices
> 
> Why do you need this?  I'm not doubting you do, I just want to read your
> reasons here :)

Hi Markus, :)

The x-vfio-user-server depends on an attached PCIDevice. As long as x-vfio-user-server
is used, we don’t want the PCIDevice to be unplugged. This blocker prevents an user
from removing PCIDevice while the vfio-user server is in use.

> 
>> 
>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>

I recall receiving a “Reviewed-by” from Stefan previously.

I’m very sorry I didn’t add that here. I’ll go over all the patches once again to confirm that
the “Reviewed-by” status reflects accurately.

>> ---
>> include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
>> hw/core/qdev.c         | 24 ++++++++++++++++++++++++
>> softmmu/qdev-monitor.c |  4 ++++
>> 3 files changed, 57 insertions(+)
>> 
>> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>> index 92c3d65208..1b9fa25e5c 100644
>> --- a/include/hw/qdev-core.h
>> +++ b/include/hw/qdev-core.h
>> @@ -193,6 +193,7 @@ struct DeviceState {
>>     int instance_id_alias;
>>     int alias_required_for_version;
>>     ResettableState reset;
>> +    GSList *unplug_blockers;
>> };
>> 
>> struct DeviceListener {
>> @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
>> void qdev_machine_creation_done(void);
>> bool qdev_machine_modified(void);
>> 
>> +/*
>> + * qdev_add_unplug_blocker: Adds an unplug blocker to a device
>> + *
>> + * @dev: Device to be blocked from unplug
>> + * @reason: Reason for blocking
>> + */
>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
>> +
>> +/*
>> + * qdev_del_unplug_blocker: Removes an unplug blocker from a device
>> + *
>> + * @dev: Device to be unblocked
>> + * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
>> + *          Used as a handle to lookup the blocker for deletion.
>> + */
>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
>> +
>> +/*
>> + * qdev_unplug_blocked: Confirms if a device is blocked from unplug
>> + *
>> + * @dev: Device to be tested
>> + * @reason: Returns one of the reasons why the device is blocked,
>> + *          if any
>> + *
>> + * Returns: true if device is blocked from unplug, false otherwise
>> + */
>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
>> +
>> /**
>>  * GpioPolarity: Polarity of a GPIO line
>>  *
>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> index 84f3019440..0806d8fcaa 100644
>> --- a/hw/core/qdev.c
>> +++ b/hw/core/qdev.c
>> @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
>>     return NULL;
>> }
>> 
>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
>> +{
>> +    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
>> +}
>> +
>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
>> +{
>> +    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
>> +}
>> +
>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
>> +{
>> +    ERRP_GUARD();
>> +
>> +    if (dev->unplug_blockers) {
>> +        error_propagate(errp, error_copy(dev->unplug_blockers->data));
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}
> 
> This cites the most recently added blocker as reason.  Your function
> comment covers it: "Returns one of the reasons".  Okay.

I could change the comment to say that it returns the recently added reason.

Thank you!
--
Jag

> 
>> +
>> static bool device_get_realized(Object *obj, Error **errp)
>> {
>>     DeviceState *dev = DEVICE(obj);
>> @@ -704,6 +726,8 @@ static void device_finalize(Object *obj)
>> 
>>     DeviceState *dev = DEVICE(obj);
>> 
>> +    g_assert(!dev->unplug_blockers);
>> +
>>     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
>>         QLIST_REMOVE(ngl, node);
>>         qemu_free_irqs(ngl->in, ngl->num_in);
>> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
>> index 12fe60c467..9cfd59d17c 100644
>> --- a/softmmu/qdev-monitor.c
>> +++ b/softmmu/qdev-monitor.c
>> @@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp)
>>     HotplugHandlerClass *hdc;
>>     Error *local_err = NULL;
>> 
>> +    if (qdev_unplug_blocked(dev, errp)) {
>> +        return;
>> +    }
>> +
>>     if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
>>         error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
>>         return;
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 10/17] vfio-user: run vfio-user context
  2022-04-21 14:59   ` Markus Armbruster
@ 2022-04-21 17:52     ` Jag Raman
  2022-04-22  5:14       ` Markus Armbruster
  0 siblings, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-21 17:52 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, dgilbert,
	Stefan Hajnoczi, marcandre.lureau, Paolo Bonzini, eblake



> On Apr 21, 2022, at 10:59 AM, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Jagannathan Raman <jag.raman@oracle.com> writes:
> 
>> Setup a handler to run vfio-user context. The context is driven by
>> messages to the file descriptor associated with it - get the fd for
>> the context and hook up the handler with it
>> 
>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>> qapi/misc.json            | 23 ++++++++++
>> hw/remote/vfio-user-obj.c | 95 ++++++++++++++++++++++++++++++++++++++-
>> 2 files changed, 117 insertions(+), 1 deletion(-)
>> 
>> diff --git a/qapi/misc.json b/qapi/misc.json
>> index b83cc39029..f3cc4a4854 100644
>> --- a/qapi/misc.json
>> +++ b/qapi/misc.json
>> @@ -553,3 +553,26 @@
>> ##
>> { 'event': 'RTC_CHANGE',
>>   'data': { 'offset': 'int', 'qom-path': 'str' } }
>> +
>> +##
>> +# @VFU_CLIENT_HANGUP:
>> +#
>> +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
>> +# communication channel
>> +#
>> +# @id: ID of the TYPE_VFIO_USER_SERVER object
>> +#
>> +# @device: ID of attached PCI device
> 
> Is this the ID set with -device id=... and such?

Yes, that is correct. It’s the ID set with the “-device id=…” option/

Thank you!
--
Jag

> 
>> +#
>> +# Since: 7.1
>> +#
>> +# Example:
>> +#
>> +# <- { "event": "VFU_CLIENT_HANGUP",
>> +#      "data": { "id": "vfu1",
>> +#                "device": "lsi1" },
>> +#      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>> +#
>> +##
>> +{ 'event': 'VFU_CLIENT_HANGUP',
>> +  'data': { 'id': 'str', 'device': 'str' } }
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 10/17] vfio-user: run vfio-user context
  2022-04-21 17:52     ` Jag Raman
@ 2022-04-22  5:14       ` Markus Armbruster
  2022-04-22 14:18         ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2022-04-22  5:14 UTC (permalink / raw)
  To: Jag Raman
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, dgilbert,
	Stefan Hajnoczi, marcandre.lureau, Paolo Bonzini, eblake

Jag Raman <jag.raman@oracle.com> writes:

>> On Apr 21, 2022, at 10:59 AM, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Jagannathan Raman <jag.raman@oracle.com> writes:
>> 
>>> Setup a handler to run vfio-user context. The context is driven by
>>> messages to the file descriptor associated with it - get the fd for
>>> the context and hook up the handler with it
>>> 
>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>> qapi/misc.json            | 23 ++++++++++
>>> hw/remote/vfio-user-obj.c | 95 ++++++++++++++++++++++++++++++++++++++-
>>> 2 files changed, 117 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/qapi/misc.json b/qapi/misc.json
>>> index b83cc39029..f3cc4a4854 100644
>>> --- a/qapi/misc.json
>>> +++ b/qapi/misc.json
>>> @@ -553,3 +553,26 @@
>>> ##
>>> { 'event': 'RTC_CHANGE',
>>>   'data': { 'offset': 'int', 'qom-path': 'str' } }
>>> +
>>> +##
>>> +# @VFU_CLIENT_HANGUP:
>>> +#
>>> +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
>>> +# communication channel
>>> +#
>>> +# @id: ID of the TYPE_VFIO_USER_SERVER object
>>> +#
>>> +# @device: ID of attached PCI device
>> 
>> Is this the ID set with -device id=... and such?
>
> Yes, that is correct. It’s the ID set with the “-device id=…” option/

What happens when the device was added *without* id=...?  DeviceState
member @id is null then.

I figure we need to make @device optional here, present if the device
has an ID.  I recommend to also add a member @qom-path, like we did for
MEMORY_DEVICE_SIZE_CHANGE in commit d89dd28f0e2.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 02/17] qdev: unplug blocker for devices
  2022-04-21 17:49     ` Jag Raman
@ 2022-04-22  5:18       ` Markus Armbruster
  2022-04-22 14:18         ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2022-04-22  5:18 UTC (permalink / raw)
  To: Jag Raman
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, mst, quintela, f4bug, qemu-devel, thanos.makatos,
	Kanth Ghatraju, dgilbert, Stefan Hajnoczi, marcandre.lureau,
	pbonzini, eblake

Jag Raman <jag.raman@oracle.com> writes:

>> On Apr 21, 2022, at 10:55 AM, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Jagannathan Raman <jag.raman@oracle.com> writes:
>> 
>>> Add blocker to prevent hot-unplug of devices
>> 
>> Why do you need this?  I'm not doubting you do, I just want to read your
>> reasons here :)
>
> Hi Markus, :)
>
> The x-vfio-user-server depends on an attached PCIDevice. As long as x-vfio-user-server
> is used, we don’t want the PCIDevice to be unplugged. This blocker prevents an user
> from removing PCIDevice while the vfio-user server is in use.

Please work that into your commit message.  Perhaps along the lines of

    One of the next commits will do <stuff>.  <badness> will happen when
    the PCI device is unplugged.  Create the means to prevent that.

>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>
> I recall receiving a “Reviewed-by” from Stefan previously.
>
> I’m very sorry I didn’t add that here. I’ll go over all the patches once again to confirm that
> the “Reviewed-by” status reflects accurately.
>
>>> ---
>>> include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
>>> hw/core/qdev.c         | 24 ++++++++++++++++++++++++
>>> softmmu/qdev-monitor.c |  4 ++++
>>> 3 files changed, 57 insertions(+)
>>> 
>>> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>>> index 92c3d65208..1b9fa25e5c 100644
>>> --- a/include/hw/qdev-core.h
>>> +++ b/include/hw/qdev-core.h
>>> @@ -193,6 +193,7 @@ struct DeviceState {
>>>     int instance_id_alias;
>>>     int alias_required_for_version;
>>>     ResettableState reset;
>>> +    GSList *unplug_blockers;
>>> };
>>> 
>>> struct DeviceListener {
>>> @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
>>> void qdev_machine_creation_done(void);
>>> bool qdev_machine_modified(void);
>>> 
>>> +/*
>>> + * qdev_add_unplug_blocker: Adds an unplug blocker to a device
>>> + *
>>> + * @dev: Device to be blocked from unplug
>>> + * @reason: Reason for blocking
>>> + */
>>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
>>> +
>>> +/*
>>> + * qdev_del_unplug_blocker: Removes an unplug blocker from a device
>>> + *
>>> + * @dev: Device to be unblocked
>>> + * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
>>> + *          Used as a handle to lookup the blocker for deletion.
>>> + */
>>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
>>> +
>>> +/*
>>> + * qdev_unplug_blocked: Confirms if a device is blocked from unplug
>>> + *
>>> + * @dev: Device to be tested
>>> + * @reason: Returns one of the reasons why the device is blocked,
>>> + *          if any
>>> + *
>>> + * Returns: true if device is blocked from unplug, false otherwise
>>> + */
>>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
>>> +
>>> /**
>>>  * GpioPolarity: Polarity of a GPIO line
>>>  *
>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>> index 84f3019440..0806d8fcaa 100644
>>> --- a/hw/core/qdev.c
>>> +++ b/hw/core/qdev.c
>>> @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
>>>     return NULL;
>>> }
>>> 
>>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
>>> +{
>>> +    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
>>> +}
>>> +
>>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
>>> +{
>>> +    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
>>> +}
>>> +
>>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
>>> +{
>>> +    ERRP_GUARD();
>>> +
>>> +    if (dev->unplug_blockers) {
>>> +        error_propagate(errp, error_copy(dev->unplug_blockers->data));
>>> +        return true;
>>> +    }
>>> +
>>> +    return false;
>>> +}
>> 
>> This cites the most recently added blocker as reason.  Your function
>> comment covers it: "Returns one of the reasons".  Okay.
>
> I could change the comment to say that it returns the recently added reason.

Up to you.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 10/17] vfio-user: run vfio-user context
  2022-04-22  5:14       ` Markus Armbruster
@ 2022-04-22 14:18         ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-22 14:18 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, dgilbert,
	Stefan Hajnoczi, marcandre.lureau, Paolo Bonzini, eblake



> On Apr 22, 2022, at 1:14 AM, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Jag Raman <jag.raman@oracle.com> writes:
> 
>>> On Apr 21, 2022, at 10:59 AM, Markus Armbruster <armbru@redhat.com> wrote:
>>> 
>>> Jagannathan Raman <jag.raman@oracle.com> writes:
>>> 
>>>> Setup a handler to run vfio-user context. The context is driven by
>>>> messages to the file descriptor associated with it - get the fd for
>>>> the context and hook up the handler with it
>>>> 
>>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>>>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>>>> ---
>>>> qapi/misc.json            | 23 ++++++++++
>>>> hw/remote/vfio-user-obj.c | 95 ++++++++++++++++++++++++++++++++++++++-
>>>> 2 files changed, 117 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/qapi/misc.json b/qapi/misc.json
>>>> index b83cc39029..f3cc4a4854 100644
>>>> --- a/qapi/misc.json
>>>> +++ b/qapi/misc.json
>>>> @@ -553,3 +553,26 @@
>>>> ##
>>>> { 'event': 'RTC_CHANGE',
>>>>  'data': { 'offset': 'int', 'qom-path': 'str' } }
>>>> +
>>>> +##
>>>> +# @VFU_CLIENT_HANGUP:
>>>> +#
>>>> +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the
>>>> +# communication channel
>>>> +#
>>>> +# @id: ID of the TYPE_VFIO_USER_SERVER object
>>>> +#
>>>> +# @device: ID of attached PCI device
>>> 
>>> Is this the ID set with -device id=... and such?
>> 
>> Yes, that is correct. It’s the ID set with the “-device id=…” option/
> 
> What happens when the device was added *without* id=...?  DeviceState
> member @id is null then.

I’m sorry, the @id member in the event is the same as DeviceState->id, but
it’s not directly derived from it. It derived from  VfuObject->device, which is
a property of TYPE_VFU_OBJECT that the user sets.

This will not be NULL, as  TYPE_VFU_OBJECT will not initialize without it.

> 
> I figure we need to make @device optional here, present if the device
> has an ID.  I recommend to also add a member @qom-path, like we did for
> MEMORY_DEVICE_SIZE_CHANGE in commit d89dd28f0e2.

OK, will add it.

Thank you!
--
Jag

> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 02/17] qdev: unplug blocker for devices
  2022-04-22  5:18       ` Markus Armbruster
@ 2022-04-22 14:18         ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-22 14:18 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, mst, quintela, f4bug, qemu-devel, thanos.makatos,
	Kanth Ghatraju, dgilbert, Stefan Hajnoczi, marcandre.lureau,
	pbonzini, eblake



> On Apr 22, 2022, at 1:18 AM, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Jag Raman <jag.raman@oracle.com> writes:
> 
>>> On Apr 21, 2022, at 10:55 AM, Markus Armbruster <armbru@redhat.com> wrote:
>>> 
>>> Jagannathan Raman <jag.raman@oracle.com> writes:
>>> 
>>>> Add blocker to prevent hot-unplug of devices
>>> 
>>> Why do you need this?  I'm not doubting you do, I just want to read your
>>> reasons here :)
>> 
>> Hi Markus, :)
>> 
>> The x-vfio-user-server depends on an attached PCIDevice. As long as x-vfio-user-server
>> is used, we don’t want the PCIDevice to be unplugged. This blocker prevents an user
>> from removing PCIDevice while the vfio-user server is in use.
> 
> Please work that into your commit message.  Perhaps along the lines of
> 
>    One of the next commits will do <stuff>.  <badness> will happen when
>    the PCI device is unplugged.  Create the means to prevent that.

OK, will do.

Thank you!
--
Jag

> 
>>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>> 
>> I recall receiving a “Reviewed-by” from Stefan previously.
>> 
>> I’m very sorry I didn’t add that here. I’ll go over all the patches once again to confirm that
>> the “Reviewed-by” status reflects accurately.
>> 
>>>> ---
>>>> include/hw/qdev-core.h | 29 +++++++++++++++++++++++++++++
>>>> hw/core/qdev.c         | 24 ++++++++++++++++++++++++
>>>> softmmu/qdev-monitor.c |  4 ++++
>>>> 3 files changed, 57 insertions(+)
>>>> 
>>>> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>>>> index 92c3d65208..1b9fa25e5c 100644
>>>> --- a/include/hw/qdev-core.h
>>>> +++ b/include/hw/qdev-core.h
>>>> @@ -193,6 +193,7 @@ struct DeviceState {
>>>>    int instance_id_alias;
>>>>    int alias_required_for_version;
>>>>    ResettableState reset;
>>>> +    GSList *unplug_blockers;
>>>> };
>>>> 
>>>> struct DeviceListener {
>>>> @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
>>>> void qdev_machine_creation_done(void);
>>>> bool qdev_machine_modified(void);
>>>> 
>>>> +/*
>>>> + * qdev_add_unplug_blocker: Adds an unplug blocker to a device
>>>> + *
>>>> + * @dev: Device to be blocked from unplug
>>>> + * @reason: Reason for blocking
>>>> + */
>>>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
>>>> +
>>>> +/*
>>>> + * qdev_del_unplug_blocker: Removes an unplug blocker from a device
>>>> + *
>>>> + * @dev: Device to be unblocked
>>>> + * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
>>>> + *          Used as a handle to lookup the blocker for deletion.
>>>> + */
>>>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
>>>> +
>>>> +/*
>>>> + * qdev_unplug_blocked: Confirms if a device is blocked from unplug
>>>> + *
>>>> + * @dev: Device to be tested
>>>> + * @reason: Returns one of the reasons why the device is blocked,
>>>> + *          if any
>>>> + *
>>>> + * Returns: true if device is blocked from unplug, false otherwise
>>>> + */
>>>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
>>>> +
>>>> /**
>>>> * GpioPolarity: Polarity of a GPIO line
>>>> *
>>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>>> index 84f3019440..0806d8fcaa 100644
>>>> --- a/hw/core/qdev.c
>>>> +++ b/hw/core/qdev.c
>>>> @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
>>>>    return NULL;
>>>> }
>>>> 
>>>> +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
>>>> +{
>>>> +    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
>>>> +}
>>>> +
>>>> +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
>>>> +{
>>>> +    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
>>>> +}
>>>> +
>>>> +bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
>>>> +{
>>>> +    ERRP_GUARD();
>>>> +
>>>> +    if (dev->unplug_blockers) {
>>>> +        error_propagate(errp, error_copy(dev->unplug_blockers->data));
>>>> +        return true;
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>> 
>>> This cites the most recently added blocker as reason.  Your function
>>> comment covers it: "Returns one of the reasons".  Okay.
>> 
>> I could change the comment to say that it returns the recently added reason.
> 
> Up to you.
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 06/17] vfio-user: build library
  2022-04-19 20:44 ` [PATCH v8 06/17] vfio-user: build library Jagannathan Raman
@ 2022-04-25  8:22   ` Stefan Hajnoczi
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25  8:22 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]

On Tue, Apr 19, 2022 at 04:44:11PM -0400, Jagannathan Raman wrote:
> add the libvfio-user library as a submodule. build it as a cmake
> subproject.
> 
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> ---
>  configure                                  | 20 +++++++++-
>  meson.build                                | 44 +++++++++++++++++++++-
>  .gitlab-ci.d/buildtest.yml                 |  2 +
>  .gitmodules                                |  3 ++
>  Kconfig.host                               |  4 ++
>  MAINTAINERS                                |  1 +
>  hw/remote/Kconfig                          |  4 ++
>  hw/remote/meson.build                      |  2 +
>  meson_options.txt                          |  3 ++
>  subprojects/libvfio-user                   |  1 +
>  tests/docker/dockerfiles/centos8.docker    |  2 +
>  tests/docker/dockerfiles/ubuntu2004.docker |  2 +
>  12 files changed, 86 insertions(+), 2 deletions(-)
>  create mode 160000 subprojects/libvfio-user

libvfio-user is BSD-3-Clause and libjson-c is MIT (Expat).

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 07/17] vfio-user: define vfio-user-server object
  2022-04-19 20:44 ` [PATCH v8 07/17] vfio-user: define vfio-user-server object Jagannathan Raman
@ 2022-04-25  8:27   ` Stefan Hajnoczi
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25  8:27 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]

On Tue, Apr 19, 2022 at 04:44:12PM -0400, Jagannathan Raman wrote:
> diff --git a/qapi/qom.json b/qapi/qom.json
> index eeb5395ff3..e7b1758a11 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -703,6 +703,20 @@
>  { 'struct': 'RemoteObjectProperties',
>    'data': { 'fd': 'str', 'devid': 'str' } }
>  
> +##
> +# @VfioUserServerProperties:
> +#
> +# Properties for x-vfio-user-server objects.
> +#
> +# @socket: socket to be used by the libvfiouser library

For consistency: libvfio-user library

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-19 20:44 ` [PATCH v8 12/17] vfio-user: IOMMU support for remote device Jagannathan Raman
  2022-04-20 11:15   ` Jag Raman
@ 2022-04-25  9:31   ` Stefan Hajnoczi
  2022-04-25 17:26     ` Jag Raman
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25  9:31 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 2136 bytes --]

On Tue, Apr 19, 2022 at 04:44:17PM -0400, Jagannathan Raman wrote:
> +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
> +                                              void *opaque, int devfn)
> +{
> +    RemoteIommu *iommu = opaque;
> +    RemoteIommuElem *elem = NULL;
> +
> +    qemu_mutex_lock(&iommu->lock);
> +
> +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
> +
> +    if (!elem) {
> +        elem = g_malloc0(sizeof(RemoteIommuElem));
> +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
> +    }
> +
> +    if (!elem->mr) {
> +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
> +        memory_region_set_size(elem->mr, UINT64_MAX);
> +        address_space_init(&elem->as, elem->mr, NULL);
> +    }
> +
> +    qemu_mutex_unlock(&iommu->lock);
> +
> +    return &elem->as;
> +}

A few comments that can be added to this file to explain the design:

- Each vfio-user server handles one PCIDevice on a PCIBus. There is one
  RemoteIommu per PCIBus, so the RemoteIommu tracks multiple PCIDevices
  by maintaining a ->elem_by_devfn mapping.

- memory_region_init_iommu() is not used because vfio-user MemoryRegions
  will be added to the elem->mr container instead. This is more natural
  than implementing the IOMMUMemoryRegionClass APIs since vfio-user
  provides something that is close to a full-fledged MemoryRegion and
  not like an IOMMU mapping.

- When a device is hot unplugged, the elem->mr reference is dropped so
  all vfio-user MemoryRegions associated with this vfio-user server are
  destroyed.

> +static void remote_iommu_finalize(Object *obj)
> +{
> +    RemoteIommu *iommu = REMOTE_IOMMU(obj);
> +
> +    qemu_mutex_destroy(&iommu->lock);
> +
> +    if (iommu->elem_by_devfn) {

->init() and ->finalize() are a pair, so I don't think ->finalize() will
ever be called with a NULL ->elem_by_devfn.

If ->elem_by_devfn can be NULL then there would probably need to be a
check around qemu_mutex_destroy(&iommu->lock) too.

> +        g_hash_table_destroy(iommu->elem_by_devfn);
> +        iommu->elem_by_devfn = NULL;
> +    }
> +}

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-20 11:15   ` Jag Raman
@ 2022-04-25  9:38     ` Stefan Hajnoczi
  2022-04-25 17:30       ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25  9:38 UTC (permalink / raw)
  To: Jag Raman
  Cc: eduardo, Elena Ufimtseva, Thomas Huth, John Johnson,
	Daniel P. Berrangé,
	Beraldo Leal, John Levon, Michael S. Tsirkin, Markus Armbruster,
	Juan Quintela, Philippe Mathieu-Daudé,
	qemu-devel, Igor Mammedov, Thanos Makatos, Kanth Ghatraju,
	Marc-André Lureau, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert

[-- Attachment #1: Type: text/plain, Size: 4334 bytes --]

On Wed, Apr 20, 2022 at 11:15:16AM +0000, Jag Raman wrote:
> 
> 
> > On Apr 19, 2022, at 4:45 PM, Jag Raman <jag.raman@oracle.com> wrote:
> > 
> > Assign separate address space for each device in the remote processes.
> > 
> > Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> > Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> > Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> > ---
> > include/hw/remote/iommu.h |  40 +++++++++++++
> > hw/remote/iommu.c         | 114 ++++++++++++++++++++++++++++++++++++++
> > hw/remote/machine.c       |  13 ++++-
> > MAINTAINERS               |   2 +
> > hw/remote/meson.build     |   1 +
> > 5 files changed, 169 insertions(+), 1 deletion(-)
> > create mode 100644 include/hw/remote/iommu.h
> > create mode 100644 hw/remote/iommu.c
> > 
> > diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
> > new file mode 100644
> > index 0000000000..33b68a8f4b
> > --- /dev/null
> > +++ b/include/hw/remote/iommu.h
> > @@ -0,0 +1,40 @@
> > +/**
> > + * Copyright © 2022 Oracle and/or its affiliates.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef REMOTE_IOMMU_H
> > +#define REMOTE_IOMMU_H
> > +
> > +#include "hw/pci/pci_bus.h"
> > +#include "hw/pci/pci.h"
> > +
> > +#ifndef INT2VOIDP
> > +#define INT2VOIDP(i) (void *)(uintptr_t)(i)
> > +#endif
> > +
> > +typedef struct RemoteIommuElem {
> > +    MemoryRegion *mr;
> > +
> > +    AddressSpace as;
> > +} RemoteIommuElem;
> > +
> > +#define TYPE_REMOTE_IOMMU "x-remote-iommu"
> > +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
> > +
> > +struct RemoteIommu {
> > +    Object parent;
> > +
> > +    GHashTable *elem_by_devfn;
> > +
> > +    QemuMutex lock;
> > +};
> > +
> > +void remote_iommu_setup(PCIBus *pci_bus);
> > +
> > +void remote_iommu_unplug_dev(PCIDevice *pci_dev);
> > +
> > +#endif
> > diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
> > new file mode 100644
> > index 0000000000..16c6b0834e
> > --- /dev/null
> > +++ b/hw/remote/iommu.c
> > @@ -0,0 +1,114 @@
> > +/**
> > + * IOMMU for remote device
> > + *
> > + * Copyright © 2022 Oracle and/or its affiliates.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu-common.h"
> > +
> > +#include "hw/remote/iommu.h"
> > +#include "hw/pci/pci_bus.h"
> > +#include "hw/pci/pci.h"
> > +#include "exec/memory.h"
> > +#include "exec/address-spaces.h"
> > +#include "trace.h"
> > +
> > +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
> > +                                              void *opaque, int devfn)
> > +{
> > +    RemoteIommu *iommu = opaque;
> > +    RemoteIommuElem *elem = NULL;
> > +
> > +    qemu_mutex_lock(&iommu->lock);
> > +
> > +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
> > +
> > +    if (!elem) {
> > +        elem = g_malloc0(sizeof(RemoteIommuElem));
> > +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
> > +    }
> > +
> > +    if (!elem->mr) {
> > +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
> > +        memory_region_set_size(elem->mr, UINT64_MAX);
> > +        address_space_init(&elem->as, elem->mr, NULL);
> 
> Hi,
> 
> I’d like to add a note here.
> 
> We tried to add "elem->mr” as a child of PCIDevice. That way, when PCIDevice is
> unplugged, the child is also finalized.

Do you mean via a memory_region_init()-family function where a parent
object is given? Or do you mean by adding a QOM child property?

> However, there was some issue with hotplug. During the hotplug, there’s a window
> during initialization where we couldn’t lookup the PCIDevice by “devfn”.
> 
> do_pci_register_device() -> pci_init_bus_master() -> pci_device_iommu_address_space()
> happens before do_pci_register_device() -> “bus->devices[devfn] = pci_dev”. As such,
> pci_find_device() doesn’t work at this time.

I don't follow. What calls pci_find_device()?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 13/17] vfio-user: handle DMA mappings
  2022-04-19 20:44 ` [PATCH v8 13/17] vfio-user: handle DMA mappings Jagannathan Raman
@ 2022-04-25  9:56   ` Stefan Hajnoczi
  2022-04-25 17:34     ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25  9:56 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 782 bytes --]

On Tue, Apr 19, 2022 at 04:44:18PM -0400, Jagannathan Raman wrote:
> +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
> +{
> +    VfuObject *o = vfu_get_private(vfu_ctx);
> +    AddressSpace *dma_as = NULL;
> +    MemoryRegion *mr = NULL;
> +    ram_addr_t offset;
> +
> +    mr = memory_region_from_host(info->vaddr, &offset);
> +    if (!mr) {
> +        return;
> +    }
> +
> +    dma_as = pci_device_iommu_address_space(o->pci_dev);
> +
> +    memory_region_del_subregion(dma_as->root, mr);
> +
> +    object_unparent((OBJECT(mr)));

Where is obj->parent set?

If it is not set then this call is a nop and mr is not freed:

  void object_unparent(Object *obj)
  {
      if (obj->parent) {
          object_property_del_child(obj->parent, obj);
      }
  }

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 14/17] vfio-user: handle PCI BAR accesses
  2022-04-19 20:44 ` [PATCH v8 14/17] vfio-user: handle PCI BAR accesses Jagannathan Raman
@ 2022-04-25 10:05   ` Stefan Hajnoczi
  2022-04-25 17:36     ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25 10:05 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 1798 bytes --]

On Tue, Apr 19, 2022 at 04:44:19PM -0400, Jagannathan Raman wrote:
> +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar,
> +                                hwaddr bar_offset, char * const buf,
> +                                hwaddr len, const bool is_write)
> +{
> +    MemoryRegionSection section = { 0 };
> +    uint8_t *ptr = (uint8_t *)buf;
> +    MemoryRegion *section_mr = NULL;
> +    uint64_t section_size;
> +    hwaddr section_offset;
> +    hwaddr size = 0;
> +
> +    while (len) {
> +        section = memory_region_find(pci_dev->io_regions[pci_bar].memory,
> +                                     bar_offset, len);
> +
> +        if (!section.mr) {
> +            warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset);
> +            return size;
> +        }
> +
> +        section_mr = section.mr;
> +        section_offset = section.offset_within_region;
> +        section_size = int128_get64(section.size);
> +
> +        if (is_write && section_mr->readonly) {
> +            warn_report("vfu: attempting to write to readonly region in "
> +                        "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]",
> +                        pci_bar, bar_offset,
> +                        (bar_offset + section_size));
> +            memory_region_unref(section_mr);
> +            return size;
> +        }
> +
> +        if (vfu_object_mr_rw(section_mr, ptr, section_offset,
> +                             section_size, is_write)) {
> +            warn_report("vfu: failed to %s "
> +                        "[0x%"PRIx64" - 0x%"PRIx64"] in bar %d",
> +                        is_write ? "write to" : "read from", bar_offset,
> +                        (bar_offset + section_size), pci_bar);
> +            return size;

Missing memory_region_unref(section_mr).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 15/17] vfio-user: handle device interrupts
  2022-04-19 20:44 ` [PATCH v8 15/17] vfio-user: handle device interrupts Jagannathan Raman
@ 2022-04-25 10:27   ` Stefan Hajnoczi
  2022-04-25 17:40     ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25 10:27 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

On Tue, Apr 19, 2022 at 04:44:20PM -0400, Jagannathan Raman wrote:
> +static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
> +                                             unsigned int vector)
> +{
> +    MSIMessage msg;
> +
> +    msg.address = 0;
> +    msg.data = vector;
> +
> +    return msg;
> +}
> +
> +static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
> +{
> +    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
> +
> +    vfu_irq_trigger(vfu_ctx, msg.data);
> +}

Why did you switch to vfu_object_msi_prepare_msg() +
vfu_object_msi_trigger() in this revision?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 16/17] vfio-user: handle reset of remote device
  2022-04-19 20:44 ` [PATCH v8 16/17] vfio-user: handle reset of remote device Jagannathan Raman
@ 2022-04-25 10:27   ` Stefan Hajnoczi
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25 10:27 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 450 bytes --]

On Tue, Apr 19, 2022 at 04:44:21PM -0400, Jagannathan Raman wrote:
> Adds handler to reset a remote device
> 
> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> ---
>  hw/remote/vfio-user-obj.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/17] vfio-user server in QEMU
  2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
                   ` (16 preceding siblings ...)
  2022-04-19 20:44 ` [PATCH v8 17/17] vfio-user: avocado tests for vfio-user Jagannathan Raman
@ 2022-04-25 10:32 ` Stefan Hajnoczi
  2022-04-25 17:40   ` Jag Raman
  17 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-25 10:32 UTC (permalink / raw)
  To: Jagannathan Raman
  Cc: eduardo, elena.ufimtseva, thuth, john.g.johnson, berrange, bleal,
	john.levon, mst, armbru, quintela, f4bug, qemu-devel,
	thanos.makatos, kanth.ghatraju, marcandre.lureau, pbonzini,
	eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 408 bytes --]

On Tue, Apr 19, 2022 at 04:44:05PM -0400, Jagannathan Raman wrote:
> This is v8 of the server side changes to enable vfio-user in QEMU.
> 
> Thank you very much for reviewing the last revision of this series!

I posted some minor comments. I hope the next revision or the one after
it will be merged because the code will benefit from being in-tree where
more people can easily try it out.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-25  9:31   ` Stefan Hajnoczi
@ 2022-04-25 17:26     ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:26 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju,
	Marc-André Lureau, Paolo Bonzini, eblake, dgilbert



> On Apr 25, 2022, at 5:31 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, Apr 19, 2022 at 04:44:17PM -0400, Jagannathan Raman wrote:
>> +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
>> +                                              void *opaque, int devfn)
>> +{
>> +    RemoteIommu *iommu = opaque;
>> +    RemoteIommuElem *elem = NULL;
>> +
>> +    qemu_mutex_lock(&iommu->lock);
>> +
>> +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
>> +
>> +    if (!elem) {
>> +        elem = g_malloc0(sizeof(RemoteIommuElem));
>> +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
>> +    }
>> +
>> +    if (!elem->mr) {
>> +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
>> +        memory_region_set_size(elem->mr, UINT64_MAX);
>> +        address_space_init(&elem->as, elem->mr, NULL);
>> +    }
>> +
>> +    qemu_mutex_unlock(&iommu->lock);
>> +
>> +    return &elem->as;
>> +}
> 
> A few comments that can be added to this file to explain the design:
> 
> - Each vfio-user server handles one PCIDevice on a PCIBus. There is one
>  RemoteIommu per PCIBus, so the RemoteIommu tracks multiple PCIDevices
>  by maintaining a ->elem_by_devfn mapping.
> 
> - memory_region_init_iommu() is not used because vfio-user MemoryRegions
>  will be added to the elem->mr container instead. This is more natural
>  than implementing the IOMMUMemoryRegionClass APIs since vfio-user
>  provides something that is close to a full-fledged MemoryRegion and
>  not like an IOMMU mapping.
> 
> - When a device is hot unplugged, the elem->mr reference is dropped so
>  all vfio-user MemoryRegions associated with this vfio-user server are
>  destroyed.

OK, will add this comment.

> 
>> +static void remote_iommu_finalize(Object *obj)
>> +{
>> +    RemoteIommu *iommu = REMOTE_IOMMU(obj);
>> +
>> +    qemu_mutex_destroy(&iommu->lock);
>> +
>> +    if (iommu->elem_by_devfn) {
> 
> ->init() and ->finalize() are a pair, so I don't think ->finalize() will
> ever be called with a NULL ->elem_by_devfn.

OK, got it.

Thanks!
--
Jag

> 
> If ->elem_by_devfn can be NULL then there would probably need to be a
> check around qemu_mutex_destroy(&iommu->lock) too.
> 
>> +        g_hash_table_destroy(iommu->elem_by_devfn);
>> +        iommu->elem_by_devfn = NULL;
>> +    }
>> +}



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-25  9:38     ` Stefan Hajnoczi
@ 2022-04-25 17:30       ` Jag Raman
  2022-04-28  9:47         ` Stefan Hajnoczi
  0 siblings, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, Thomas Huth, John Johnson,
	Daniel P. Berrangé,
	Beraldo Leal, John Levon, Michael S. Tsirkin, Markus Armbruster,
	Juan Quintela, Philippe Mathieu-Daudé,
	qemu-devel, Igor Mammedov, Thanos Makatos, Kanth Ghatraju,
	Marc-André Lureau, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert



> On Apr 25, 2022, at 5:38 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Wed, Apr 20, 2022 at 11:15:16AM +0000, Jag Raman wrote:
>> 
>> 
>>> On Apr 19, 2022, at 4:45 PM, Jag Raman <jag.raman@oracle.com> wrote:
>>> 
>>> Assign separate address space for each device in the remote processes.
>>> 
>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>>> ---
>>> include/hw/remote/iommu.h |  40 +++++++++++++
>>> hw/remote/iommu.c         | 114 ++++++++++++++++++++++++++++++++++++++
>>> hw/remote/machine.c       |  13 ++++-
>>> MAINTAINERS               |   2 +
>>> hw/remote/meson.build     |   1 +
>>> 5 files changed, 169 insertions(+), 1 deletion(-)
>>> create mode 100644 include/hw/remote/iommu.h
>>> create mode 100644 hw/remote/iommu.c
>>> 
>>> diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
>>> new file mode 100644
>>> index 0000000000..33b68a8f4b
>>> --- /dev/null
>>> +++ b/include/hw/remote/iommu.h
>>> @@ -0,0 +1,40 @@
>>> +/**
>>> + * Copyright © 2022 Oracle and/or its affiliates.
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#ifndef REMOTE_IOMMU_H
>>> +#define REMOTE_IOMMU_H
>>> +
>>> +#include "hw/pci/pci_bus.h"
>>> +#include "hw/pci/pci.h"
>>> +
>>> +#ifndef INT2VOIDP
>>> +#define INT2VOIDP(i) (void *)(uintptr_t)(i)
>>> +#endif
>>> +
>>> +typedef struct RemoteIommuElem {
>>> +    MemoryRegion *mr;
>>> +
>>> +    AddressSpace as;
>>> +} RemoteIommuElem;
>>> +
>>> +#define TYPE_REMOTE_IOMMU "x-remote-iommu"
>>> +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
>>> +
>>> +struct RemoteIommu {
>>> +    Object parent;
>>> +
>>> +    GHashTable *elem_by_devfn;
>>> +
>>> +    QemuMutex lock;
>>> +};
>>> +
>>> +void remote_iommu_setup(PCIBus *pci_bus);
>>> +
>>> +void remote_iommu_unplug_dev(PCIDevice *pci_dev);
>>> +
>>> +#endif
>>> diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
>>> new file mode 100644
>>> index 0000000000..16c6b0834e
>>> --- /dev/null
>>> +++ b/hw/remote/iommu.c
>>> @@ -0,0 +1,114 @@
>>> +/**
>>> + * IOMMU for remote device
>>> + *
>>> + * Copyright © 2022 Oracle and/or its affiliates.
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qemu-common.h"
>>> +
>>> +#include "hw/remote/iommu.h"
>>> +#include "hw/pci/pci_bus.h"
>>> +#include "hw/pci/pci.h"
>>> +#include "exec/memory.h"
>>> +#include "exec/address-spaces.h"
>>> +#include "trace.h"
>>> +
>>> +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
>>> +                                              void *opaque, int devfn)
>>> +{
>>> +    RemoteIommu *iommu = opaque;
>>> +    RemoteIommuElem *elem = NULL;
>>> +
>>> +    qemu_mutex_lock(&iommu->lock);
>>> +
>>> +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
>>> +
>>> +    if (!elem) {
>>> +        elem = g_malloc0(sizeof(RemoteIommuElem));
>>> +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
>>> +    }
>>> +
>>> +    if (!elem->mr) {
>>> +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
>>> +        memory_region_set_size(elem->mr, UINT64_MAX);
>>> +        address_space_init(&elem->as, elem->mr, NULL);
>> 
>> Hi,
>> 
>> I’d like to add a note here.
>> 
>> We tried to add "elem->mr” as a child of PCIDevice. That way, when PCIDevice is
>> unplugged, the child is also finalized.
> 
> Do you mean via a memory_region_init()-family function where a parent
> object is given? Or do you mean by adding a QOM child property?

I mean by adding “elem->mr” as a QOM child property of PCIDevice.

> 
>> However, there was some issue with hotplug. During the hotplug, there’s a window
>> during initialization where we couldn’t lookup the PCIDevice by “devfn”.
>> 
>> do_pci_register_device() -> pci_init_bus_master() -> pci_device_iommu_address_space()
>> happens before do_pci_register_device() -> “bus->devices[devfn] = pci_dev”. As such,
>> pci_find_device() doesn’t work at this time.
> 
> I don't follow. What calls pci_find_device()?

To add the MemoryRegion as a child of PCIDevice, remote_iommu_find_add_as()
would need to lookup the PCIDevice using devfn. The function that looks up
PCIDevice by devfn is pci_find_device().

The above note explains why we didn’t lookup the PCIDevice using pci_find_device()
and then adding the MemoryRegion as its child.

Thank you!
--
Jag

> 
> Stefan


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 13/17] vfio-user: handle DMA mappings
  2022-04-25  9:56   ` Stefan Hajnoczi
@ 2022-04-25 17:34     ` Jag Raman
  2022-04-26 19:53       ` Jag Raman
  0 siblings, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:34 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	Paolo Bonzini, eblake, dgilbert



> On Apr 25, 2022, at 5:56 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, Apr 19, 2022 at 04:44:18PM -0400, Jagannathan Raman wrote:
>> +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
>> +{
>> +    VfuObject *o = vfu_get_private(vfu_ctx);
>> +    AddressSpace *dma_as = NULL;
>> +    MemoryRegion *mr = NULL;
>> +    ram_addr_t offset;
>> +
>> +    mr = memory_region_from_host(info->vaddr, &offset);
>> +    if (!mr) {
>> +        return;
>> +    }
>> +
>> +    dma_as = pci_device_iommu_address_space(o->pci_dev);
>> +
>> +    memory_region_del_subregion(dma_as->root, mr);
>> +
>> +    object_unparent((OBJECT(mr)));
> 
> Where is obj->parent set?

Yeah, it should be object_unref().

Thank you!
--
Jag

> 
> If it is not set then this call is a nop and mr is not freed:
> 
>  void object_unparent(Object *obj)
>  {
>      if (obj->parent) {
>          object_property_del_child(obj->parent, obj);
>      }
>  }



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 14/17] vfio-user: handle PCI BAR accesses
  2022-04-25 10:05   ` Stefan Hajnoczi
@ 2022-04-25 17:36     ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	pbonzini, eblake, dgilbert



> On Apr 25, 2022, at 6:05 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, Apr 19, 2022 at 04:44:19PM -0400, Jagannathan Raman wrote:
>> +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar,
>> +                                hwaddr bar_offset, char * const buf,
>> +                                hwaddr len, const bool is_write)
>> +{
>> +    MemoryRegionSection section = { 0 };
>> +    uint8_t *ptr = (uint8_t *)buf;
>> +    MemoryRegion *section_mr = NULL;
>> +    uint64_t section_size;
>> +    hwaddr section_offset;
>> +    hwaddr size = 0;
>> +
>> +    while (len) {
>> +        section = memory_region_find(pci_dev->io_regions[pci_bar].memory,
>> +                                     bar_offset, len);
>> +
>> +        if (!section.mr) {
>> +            warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset);
>> +            return size;
>> +        }
>> +
>> +        section_mr = section.mr;
>> +        section_offset = section.offset_within_region;
>> +        section_size = int128_get64(section.size);
>> +
>> +        if (is_write && section_mr->readonly) {
>> +            warn_report("vfu: attempting to write to readonly region in "
>> +                        "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]",
>> +                        pci_bar, bar_offset,
>> +                        (bar_offset + section_size));
>> +            memory_region_unref(section_mr);
>> +            return size;
>> +        }
>> +
>> +        if (vfu_object_mr_rw(section_mr, ptr, section_offset,
>> +                             section_size, is_write)) {
>> +            warn_report("vfu: failed to %s "
>> +                        "[0x%"PRIx64" - 0x%"PRIx64"] in bar %d",
>> +                        is_write ? "write to" : "read from", bar_offset,
>> +                        (bar_offset + section_size), pci_bar);
>> +            return size;
> 
> Missing memory_region_unref(section_mr).

Ah got it.

Thanks!
--
Jag



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 15/17] vfio-user: handle device interrupts
  2022-04-25 10:27   ` Stefan Hajnoczi
@ 2022-04-25 17:40     ` Jag Raman
  2022-04-28  9:54       ` Stefan Hajnoczi
  0 siblings, 1 reply; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:40 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	pbonzini, eblake, dgilbert



> On Apr 25, 2022, at 6:27 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, Apr 19, 2022 at 04:44:20PM -0400, Jagannathan Raman wrote:
>> +static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
>> +                                             unsigned int vector)
>> +{
>> +    MSIMessage msg;
>> +
>> +    msg.address = 0;
>> +    msg.data = vector;
>> +
>> +    return msg;
>> +}
>> +
>> +static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
>> +{
>> +    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
>> +
>> +    vfu_irq_trigger(vfu_ctx, msg.data);
>> +}
> 
> Why did you switch to vfu_object_msi_prepare_msg() +
> vfu_object_msi_trigger() in this revision?

We previously did not do this switch because the server didn’t get updates
to the MSIx table & PBA.

The latest client version (which is not part of this series) forwards accesses
to the MSIx table & PBA over to the server. It also reads the PBA set by the
server. These change make it possible for the server to make this switch.

Thank you!
--
Jag

> 
> Stefan


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/17] vfio-user server in QEMU
  2022-04-25 10:32 ` [PATCH v8 00/17] vfio-user server in QEMU Stefan Hajnoczi
@ 2022-04-25 17:40   ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-25 17:40 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	Paolo Bonzini, eblake, dgilbert



> On Apr 25, 2022, at 6:32 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Tue, Apr 19, 2022 at 04:44:05PM -0400, Jagannathan Raman wrote:
>> This is v8 of the server side changes to enable vfio-user in QEMU.
>> 
>> Thank you very much for reviewing the last revision of this series!
> 
> I posted some minor comments. I hope the next revision or the one after
> it will be merged because the code will benefit from being in-tree where
> more people can easily try it out.

Thank you very much!

Will send the next revision out shortly!

--
Jag

> 
> Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 13/17] vfio-user: handle DMA mappings
  2022-04-25 17:34     ` Jag Raman
@ 2022-04-26 19:53       ` Jag Raman
  0 siblings, 0 replies; 46+ messages in thread
From: Jag Raman @ 2022-04-26 19:53 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	Paolo Bonzini, eblake, dgilbert



> On Apr 25, 2022, at 1:34 PM, Jag Raman <jag.raman@oracle.com> wrote:
> 
> 
> 
>> On Apr 25, 2022, at 5:56 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>> 
>> On Tue, Apr 19, 2022 at 04:44:18PM -0400, Jagannathan Raman wrote:
>>> +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
>>> +{
>>> + VfuObject *o = vfu_get_private(vfu_ctx);
>>> + AddressSpace *dma_as = NULL;
>>> + MemoryRegion *mr = NULL;
>>> + ram_addr_t offset;
>>> +
>>> + mr = memory_region_from_host(info->vaddr, &offset);
>>> + if (!mr) {
>>> + return;
>>> + }
>>> +
>>> + dma_as = pci_device_iommu_address_space(o->pci_dev);
>>> +
>>> + memory_region_del_subregion(dma_as->root, mr);
>>> +
>>> + object_unparent((OBJECT(mr)));
>> 
>> Where is obj->parent set?
> 
> Yeah, it should be object_unref().

I think object_unparent() is the appropriate way to finalize the
MemoryRegion in this case.

‘mr’ is an owner-less MemoryRegion created by dma_register(). owner-less
MemoryRegions initialized using memory_region_init* functions are children
of the “/unattached” object.

The parent is set by dma_register() -> memory_region_init_ram_ptr() ->
memory_region_init() -> memory_region_do_init() ->
object_property_add_child() -> object_property_try_add_child().

Thank you!
--
Jag

> 
> Thank you!
> --
> Jag
> 
>> 
>> If it is not set then this call is a nop and mr is not freed:
>> 
>> void object_unparent(Object *obj)
>> {
>> if (obj->parent) {
>> object_property_del_child(obj->parent, obj);
>> }
>> }


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 12/17] vfio-user: IOMMU support for remote device
  2022-04-25 17:30       ` Jag Raman
@ 2022-04-28  9:47         ` Stefan Hajnoczi
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-28  9:47 UTC (permalink / raw)
  To: Jag Raman
  Cc: eduardo, Elena Ufimtseva, Thomas Huth, John Johnson,
	Daniel P. Berrangé,
	Beraldo Leal, John Levon, Michael S. Tsirkin, Markus Armbruster,
	Juan Quintela, Philippe Mathieu-Daudé,
	qemu-devel, Igor Mammedov, Thanos Makatos, Kanth Ghatraju,
	Marc-André Lureau, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert

[-- Attachment #1: Type: text/plain, Size: 5608 bytes --]

On Mon, Apr 25, 2022 at 05:30:19PM +0000, Jag Raman wrote:
> 
> 
> > On Apr 25, 2022, at 5:38 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > 
> > On Wed, Apr 20, 2022 at 11:15:16AM +0000, Jag Raman wrote:
> >> 
> >> 
> >>> On Apr 19, 2022, at 4:45 PM, Jag Raman <jag.raman@oracle.com> wrote:
> >>> 
> >>> Assign separate address space for each device in the remote processes.
> >>> 
> >>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> >>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
> >>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
> >>> ---
> >>> include/hw/remote/iommu.h |  40 +++++++++++++
> >>> hw/remote/iommu.c         | 114 ++++++++++++++++++++++++++++++++++++++
> >>> hw/remote/machine.c       |  13 ++++-
> >>> MAINTAINERS               |   2 +
> >>> hw/remote/meson.build     |   1 +
> >>> 5 files changed, 169 insertions(+), 1 deletion(-)
> >>> create mode 100644 include/hw/remote/iommu.h
> >>> create mode 100644 hw/remote/iommu.c
> >>> 
> >>> diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h
> >>> new file mode 100644
> >>> index 0000000000..33b68a8f4b
> >>> --- /dev/null
> >>> +++ b/include/hw/remote/iommu.h
> >>> @@ -0,0 +1,40 @@
> >>> +/**
> >>> + * Copyright © 2022 Oracle and/or its affiliates.
> >>> + *
> >>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> >>> + * See the COPYING file in the top-level directory.
> >>> + *
> >>> + */
> >>> +
> >>> +#ifndef REMOTE_IOMMU_H
> >>> +#define REMOTE_IOMMU_H
> >>> +
> >>> +#include "hw/pci/pci_bus.h"
> >>> +#include "hw/pci/pci.h"
> >>> +
> >>> +#ifndef INT2VOIDP
> >>> +#define INT2VOIDP(i) (void *)(uintptr_t)(i)
> >>> +#endif
> >>> +
> >>> +typedef struct RemoteIommuElem {
> >>> +    MemoryRegion *mr;
> >>> +
> >>> +    AddressSpace as;
> >>> +} RemoteIommuElem;
> >>> +
> >>> +#define TYPE_REMOTE_IOMMU "x-remote-iommu"
> >>> +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU)
> >>> +
> >>> +struct RemoteIommu {
> >>> +    Object parent;
> >>> +
> >>> +    GHashTable *elem_by_devfn;
> >>> +
> >>> +    QemuMutex lock;
> >>> +};
> >>> +
> >>> +void remote_iommu_setup(PCIBus *pci_bus);
> >>> +
> >>> +void remote_iommu_unplug_dev(PCIDevice *pci_dev);
> >>> +
> >>> +#endif
> >>> diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c
> >>> new file mode 100644
> >>> index 0000000000..16c6b0834e
> >>> --- /dev/null
> >>> +++ b/hw/remote/iommu.c
> >>> @@ -0,0 +1,114 @@
> >>> +/**
> >>> + * IOMMU for remote device
> >>> + *
> >>> + * Copyright © 2022 Oracle and/or its affiliates.
> >>> + *
> >>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> >>> + * See the COPYING file in the top-level directory.
> >>> + *
> >>> + */
> >>> +
> >>> +#include "qemu/osdep.h"
> >>> +#include "qemu-common.h"
> >>> +
> >>> +#include "hw/remote/iommu.h"
> >>> +#include "hw/pci/pci_bus.h"
> >>> +#include "hw/pci/pci.h"
> >>> +#include "exec/memory.h"
> >>> +#include "exec/address-spaces.h"
> >>> +#include "trace.h"
> >>> +
> >>> +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus,
> >>> +                                              void *opaque, int devfn)
> >>> +{
> >>> +    RemoteIommu *iommu = opaque;
> >>> +    RemoteIommuElem *elem = NULL;
> >>> +
> >>> +    qemu_mutex_lock(&iommu->lock);
> >>> +
> >>> +    elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn));
> >>> +
> >>> +    if (!elem) {
> >>> +        elem = g_malloc0(sizeof(RemoteIommuElem));
> >>> +        g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem);
> >>> +    }
> >>> +
> >>> +    if (!elem->mr) {
> >>> +        elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION));
> >>> +        memory_region_set_size(elem->mr, UINT64_MAX);
> >>> +        address_space_init(&elem->as, elem->mr, NULL);
> >> 
> >> Hi,
> >> 
> >> I’d like to add a note here.
> >> 
> >> We tried to add "elem->mr” as a child of PCIDevice. That way, when PCIDevice is
> >> unplugged, the child is also finalized.
> > 
> > Do you mean via a memory_region_init()-family function where a parent
> > object is given? Or do you mean by adding a QOM child property?
> 
> I mean by adding “elem->mr” as a QOM child property of PCIDevice.
> 
> > 
> >> However, there was some issue with hotplug. During the hotplug, there’s a window
> >> during initialization where we couldn’t lookup the PCIDevice by “devfn”.
> >> 
> >> do_pci_register_device() -> pci_init_bus_master() -> pci_device_iommu_address_space()
> >> happens before do_pci_register_device() -> “bus->devices[devfn] = pci_dev”. As such,
> >> pci_find_device() doesn’t work at this time.
> > 
> > I don't follow. What calls pci_find_device()?
> 
> To add the MemoryRegion as a child of PCIDevice, remote_iommu_find_add_as()
> would need to lookup the PCIDevice using devfn. The function that looks up
> PCIDevice by devfn is pci_find_device().
> 
> The above note explains why we didn’t lookup the PCIDevice using pci_find_device()
> and then adding the MemoryRegion as its child.

If I understand correctly you're saying it's not possible to use
memory_region_init(&elem->mr, OBJECT(pci_dev), ...) inside
remote_iommu_find_add_as() because there is no way to find the
PCIDevice?

It would be nice to automatically clean up the memory region but the
AddressSpace needs to be destroyed too and there isn't an automatic way
of doing that, so the approach in this patch is fine.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 15/17] vfio-user: handle device interrupts
  2022-04-25 17:40     ` Jag Raman
@ 2022-04-28  9:54       ` Stefan Hajnoczi
  2022-05-05 16:22         ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Hajnoczi @ 2022-04-28  9:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eduardo, Elena Ufimtseva, thuth, John Johnson, berrange, bleal,
	john.levon, Michael S. Tsirkin, armbru, quintela,
	Philippe Mathieu-Daudé,
	qemu-devel, thanos.makatos, Kanth Ghatraju, marcandre.lureau,
	pbonzini, Jag Raman, eblake, dgilbert

[-- Attachment #1: Type: text/plain, Size: 1577 bytes --]

On Mon, Apr 25, 2022 at 05:40:01PM +0000, Jag Raman wrote:
> > On Apr 25, 2022, at 6:27 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > 
> > On Tue, Apr 19, 2022 at 04:44:20PM -0400, Jagannathan Raman wrote:
> >> +static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
> >> +                                             unsigned int vector)
> >> +{
> >> +    MSIMessage msg;
> >> +
> >> +    msg.address = 0;
> >> +    msg.data = vector;
> >> +
> >> +    return msg;
> >> +}
> >> +
> >> +static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
> >> +{
> >> +    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
> >> +
> >> +    vfu_irq_trigger(vfu_ctx, msg.data);
> >> +}
> > 
> > Why did you switch to vfu_object_msi_prepare_msg() +
> > vfu_object_msi_trigger() in this revision?
> 
> We previously did not do this switch because the server didn’t get updates
> to the MSIx table & PBA.
> 
> The latest client version (which is not part of this series) forwards accesses
> to the MSIx table & PBA over to the server. It also reads the PBA set by the
> server. These change make it possible for the server to make this switch.

Interesting. That's different from kernel VFIO. Before vfio-user commits
to a new approach it would be worth checking with Alex that he agrees
with the design.

I remember sending an email asking about why VFIO MSI-X PBA does not
offer the full semantics described in the PCIe spec but didn't get a
response from Alex (Message-Id:
YkMWp0lUJAHhivJA@stefanha-x1.localdomain).

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 15/17] vfio-user: handle device interrupts
  2022-04-28  9:54       ` Stefan Hajnoczi
@ 2022-05-05 16:22         ` Alex Williamson
  0 siblings, 0 replies; 46+ messages in thread
From: Alex Williamson @ 2022-05-05 16:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S. Tsirkin, Philippe Mathieu-Daudé,
	pbonzini, marcandre.lureau, thuth, bleal, berrange, eduardo,
	marcel.apfelbaum, eblake, armbru, quintela, dgilbert, john.levon,
	thanos.makatos, Elena Ufimtseva, John Johnson, Kanth Ghatraju,
	Jag Raman

On Thu, 28 Apr 2022 10:54:04 +0100
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Mon, Apr 25, 2022 at 05:40:01PM +0000, Jag Raman wrote:
> > > On Apr 25, 2022, at 6:27 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > 
> > > On Tue, Apr 19, 2022 at 04:44:20PM -0400, Jagannathan Raman wrote:  
> > >> +static MSIMessage vfu_object_msi_prepare_msg(PCIDevice *pci_dev,
> > >> +                                             unsigned int vector)
> > >> +{
> > >> +    MSIMessage msg;
> > >> +
> > >> +    msg.address = 0;
> > >> +    msg.data = vector;
> > >> +
> > >> +    return msg;
> > >> +}
> > >> +
> > >> +static void vfu_object_msi_trigger(PCIDevice *pci_dev, MSIMessage msg)
> > >> +{
> > >> +    vfu_ctx_t *vfu_ctx = pci_dev->irq_opaque;
> > >> +
> > >> +    vfu_irq_trigger(vfu_ctx, msg.data);
> > >> +}  
> > > 
> > > Why did you switch to vfu_object_msi_prepare_msg() +
> > > vfu_object_msi_trigger() in this revision?  
> > 
> > We previously did not do this switch because the server didn’t get updates
> > to the MSIx table & PBA.
> > 
> > The latest client version (which is not part of this series) forwards accesses
> > to the MSIx table & PBA over to the server. It also reads the PBA set by the
> > server. These change make it possible for the server to make this switch.  
> 
> Interesting. That's different from kernel VFIO. Before vfio-user commits
> to a new approach it would be worth checking with Alex that he agrees
> with the design.
> 
> I remember sending an email asking about why VFIO MSI-X PBA does not
> offer the full semantics described in the PCIe spec but didn't get a
> response from Alex (Message-Id:
> YkMWp0lUJAHhivJA@stefanha-x1.localdomain).

IIUC, the question is why we redirect the MSI-X interrupt from the KVM
irqfd to be handled in QEMU when the vector is masked.  This is largely
to work around the fact that we haven't had a means to implement mask
and unmask in the kernel, therefore we leave the vector enabled and
only enable the emulated PBA if a masked vector fires.  This works
because nobody really cares about the PBA, nor operates in a mode where
vectors are masked and the PBA is polled.  Drivers that understand the
device likely have better places to poll for service requests than the
PBA.

Ideally, masking a vector would make use of the existing mask and
unmask uAPI via the SET_IRQS ioctl, but we haven't been able to
implement this due to lack of internal kernel APIs to support it.  We
may have those interfaces now, but lacking bandwidth, I haven't checked
recently and we seem to be getting by ok as is.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2022-05-05 16:48 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-19 20:44 [PATCH v8 00/17] vfio-user server in QEMU Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 02/17] qdev: unplug blocker for devices Jagannathan Raman
2022-04-21 14:55   ` Markus Armbruster
2022-04-21 17:49     ` Jag Raman
2022-04-22  5:18       ` Markus Armbruster
2022-04-22 14:18         ` Jag Raman
2022-04-19 20:44 ` [PATCH v8 03/17] remote/machine: add HotplugHandler for remote machine Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 04/17] remote/machine: add vfio-user property Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 05/17] configure: require cmake 3.19 or newer Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 06/17] vfio-user: build library Jagannathan Raman
2022-04-25  8:22   ` Stefan Hajnoczi
2022-04-19 20:44 ` [PATCH v8 07/17] vfio-user: define vfio-user-server object Jagannathan Raman
2022-04-25  8:27   ` Stefan Hajnoczi
2022-04-19 20:44 ` [PATCH v8 08/17] vfio-user: instantiate vfio-user context Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 09/17] vfio-user: find and init PCI device Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 10/17] vfio-user: run vfio-user context Jagannathan Raman
2022-04-21 14:59   ` Markus Armbruster
2022-04-21 17:52     ` Jag Raman
2022-04-22  5:14       ` Markus Armbruster
2022-04-22 14:18         ` Jag Raman
2022-04-19 20:44 ` [PATCH v8 11/17] vfio-user: handle PCI config space accesses Jagannathan Raman
2022-04-19 20:44 ` [PATCH v8 12/17] vfio-user: IOMMU support for remote device Jagannathan Raman
2022-04-20 11:15   ` Jag Raman
2022-04-25  9:38     ` Stefan Hajnoczi
2022-04-25 17:30       ` Jag Raman
2022-04-28  9:47         ` Stefan Hajnoczi
2022-04-25  9:31   ` Stefan Hajnoczi
2022-04-25 17:26     ` Jag Raman
2022-04-19 20:44 ` [PATCH v8 13/17] vfio-user: handle DMA mappings Jagannathan Raman
2022-04-25  9:56   ` Stefan Hajnoczi
2022-04-25 17:34     ` Jag Raman
2022-04-26 19:53       ` Jag Raman
2022-04-19 20:44 ` [PATCH v8 14/17] vfio-user: handle PCI BAR accesses Jagannathan Raman
2022-04-25 10:05   ` Stefan Hajnoczi
2022-04-25 17:36     ` Jag Raman
2022-04-19 20:44 ` [PATCH v8 15/17] vfio-user: handle device interrupts Jagannathan Raman
2022-04-25 10:27   ` Stefan Hajnoczi
2022-04-25 17:40     ` Jag Raman
2022-04-28  9:54       ` Stefan Hajnoczi
2022-05-05 16:22         ` Alex Williamson
2022-04-19 20:44 ` [PATCH v8 16/17] vfio-user: handle reset of remote device Jagannathan Raman
2022-04-25 10:27   ` Stefan Hajnoczi
2022-04-19 20:44 ` [PATCH v8 17/17] vfio-user: avocado tests for vfio-user Jagannathan Raman
2022-04-25 10:32 ` [PATCH v8 00/17] vfio-user server in QEMU Stefan Hajnoczi
2022-04-25 17:40   ` Jag Raman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.