All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
@ 2023-05-26 23:15 Alex Williamson
  2023-05-26 23:15 ` [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps Alex Williamson
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Alex Williamson @ 2023-05-26 23:15 UTC (permalink / raw)
  To: robin, mst, marcel.apfelbaum; +Cc: Alex Williamson, qemu-devel, clg

This RFC proposes to allow a vfio-pci device to manipulate the PCI
Express capability of an associated root port to enable Atomic Op
completer support as equivalent to host capabilities.  This would
dynamically change capability bits in the config space of the root
port on realize and exit of the vfio-pci device under the following
condiations:

 - The vfio-pci device is directly connected to the root port and
   the root port implements a v2 PCIe capability, thereby supporting
   the DEVCAP2 register.

 - The vfio-pci device is exposed as a single-function device.

 - Atomic completer support is not otherwise reported on the root port.

 - The vfio-pci device reports the VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP
   capability with a non-zero set of supported atomic completer widths.

This proposal aims to avoid complications of reporting Atomic Ops
Routing, which can easily escalate into invalid P2P paths.  We also
require a specific VM configuration to avoid dependencies between
devices which may be sourced from dissimilar host paths.

While it's not exactly standard practice to modify root port device
capabilities runtime, it also does not seem to be precluded by the PCIe
Spec (6.0.1).  The Atomic Op completion bits of the DEVCAP2 register
are defined as Read-only:

7.4 Configuration Register Types
 Read-only - Register bits are read-only and cannot be altered by software.
             Where explicitly defined, these bits are used to reflect changing
             hardware state, and as a result bit values can be observed to
             change at run time. Register bit default values and bits that
             cannot change value at run time, are permitted to be hard-coded,
             initialized by system/device firmware, or initialized by hardware
             mechanisms such as pin strapping or nonvolatile storage.
             Initialization by system firmware is permitted only for system-
             integrated devices. If the optional feature that would Set the
             bits is not implemented, the bits must be hardwired to Zero.

Here "altered by software" is relative to guest writes to the config
space register, whereas in this implementation we're acting as hardware
and the bits are changing to reflect a change in runtime capabilities.
The spec does include a HwInit register type which would restrict the
value from changing at runtime outside of resets.  Therefore while it
would not be advised to update these bits arbitrarily, it does seem safe
and compatible with guest software to update the value on device attach
and detach.

Note that of the Linux in-kernel drivers that make use of Atomic Ops,
it's not common for the driver to test Atomic Ops support of the device
itself.  Support is assumed, therefore it's fruitless to provide masking
of support at the device rather than the root port.

Also, by allowing this dynamic support, enabling Atomic Ops becomes
transparent to VM management tools.  There is no requirement to
designate specific Atomic Ops capabilities to a root port and impose a
burden on other userspace utilities.

Feedback welcome.  Thanks,

Alex

v2:
 - Don't require cold-plug device, modify RP bits around realize/exit

Alex Williamson (4):
  linux-headers: Update for vfio capability reporting AtomicOps
  vfio: Implement a common device info helper
  pcie: Add a PCIe capability version helper
  vfio/pci: Enable AtomicOps completers on root ports

 hw/pci/pcie.c                 |  7 ++++
 hw/s390x/s390-pci-vfio.c      | 37 +++--------------
 hw/vfio/common.c              | 46 ++++++++++++++++-----
 hw/vfio/pci.c                 | 78 +++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h                 |  1 +
 include/hw/pci/pcie.h         |  1 +
 include/hw/vfio/vfio-common.h |  1 +
 linux-headers/linux/vfio.h    | 14 +++++++
 8 files changed, 142 insertions(+), 43 deletions(-)

-- 
2.39.2



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
@ 2023-05-26 23:15 ` Alex Williamson
  2023-05-30 13:19   ` Cédric Le Goater
  2023-05-26 23:15 ` [RFC PATCH v2 2/4] vfio: Implement a common device info helper Alex Williamson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2023-05-26 23:15 UTC (permalink / raw)
  To: robin, mst, marcel.apfelbaum; +Cc: Alex Williamson, qemu-devel, clg

This is a partial linux-headers update for illustrative and testing
purposes only, NOT FOR COMMIT.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 linux-headers/linux/vfio.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4a534edbdcba..443a8851e156 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -240,6 +240,20 @@ struct vfio_device_info {
 #define VFIO_DEVICE_INFO_CAP_ZPCI_UTIL		3
 #define VFIO_DEVICE_INFO_CAP_ZPCI_PFIP		4
 
+/*
+ * The following VFIO_DEVICE_INFO capability reports support for PCIe AtomicOp
+ * completion to the root bus with supported widths provided via flags.
+ */
+#define VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP	5
+struct vfio_device_info_cap_pci_atomic_comp {
+	struct vfio_info_cap_header header;
+	__u32 flags;
+#define VFIO_PCI_ATOMIC_COMP32	(1 << 0)
+#define VFIO_PCI_ATOMIC_COMP64	(1 << 1)
+#define VFIO_PCI_ATOMIC_COMP128	(1 << 2)
+	__u32 reserved;
+};
+
 /**
  * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
  *				       struct vfio_region_info)
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 2/4] vfio: Implement a common device info helper
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
  2023-05-26 23:15 ` [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps Alex Williamson
@ 2023-05-26 23:15 ` Alex Williamson
  2023-05-31 10:18   ` Robin Voetter
  2023-05-26 23:15 ` [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper Alex Williamson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2023-05-26 23:15 UTC (permalink / raw)
  To: robin, mst, marcel.apfelbaum
  Cc: Alex Williamson, qemu-devel, clg, Philippe Mathieu-Daudé

A common helper implementing the realloc algorithm for handling
capabilities.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/s390x/s390-pci-vfio.c      | 37 ++++------------------------
 hw/vfio/common.c              | 46 ++++++++++++++++++++++++++---------
 include/hw/vfio/vfio-common.h |  1 +
 3 files changed, 41 insertions(+), 43 deletions(-)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index f51190d4662f..66e8ec3bdef9 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -289,38 +289,11 @@ static void s390_pci_read_pfip(S390PCIBusDevice *pbdev,
     memcpy(pbdev->zpci_fn.pfip, cap->pfip, CLP_PFIP_NR_SEGMENTS);
 }
 
-static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev,
-                                                uint32_t argsz)
+static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev);
 {
-    struct vfio_device_info *info = g_malloc0(argsz);
-    VFIOPCIDevice *vfio_pci;
-    int fd;
+    VFIOPCIDevice *vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
 
-    vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
-    fd = vfio_pci->vbasedev.fd;
-
-    /*
-     * If the specified argsz is not large enough to contain all capabilities
-     * it will be updated upon return from the ioctl.  Retry until we have
-     * a big enough buffer to hold the entire capability chain.  On error,
-     * just exit and rely on CLP defaults.
-     */
-retry:
-    info->argsz = argsz;
-
-    if (ioctl(fd, VFIO_DEVICE_GET_INFO, info)) {
-        trace_s390_pci_clp_dev_info(vfio_pci->vbasedev.name);
-        g_free(info);
-        return NULL;
-    }
-
-    if (info->argsz > argsz) {
-        argsz = info->argsz;
-        info = g_realloc(info, argsz);
-        goto retry;
-    }
-
-    return info;
+    return vfio_get_device_info(vfio_pci->vbasedev.fd);
 }
 
 /*
@@ -335,7 +308,7 @@ bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh)
 
     assert(fh);
 
-    info = get_device_info(pbdev, sizeof(*info));
+    info = get_device_info(pbdev);
     if (!info) {
         return false;
     }
@@ -356,7 +329,7 @@ void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
 {
     g_autofree struct vfio_device_info *info = NULL;
 
-    info = get_device_info(pbdev, sizeof(*info));
+    info = get_device_info(pbdev);
     if (!info) {
         return;
     }
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 78358ede2764..ed142296e9fe 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -2843,11 +2843,35 @@ void vfio_put_group(VFIOGroup *group)
     }
 }
 
+struct vfio_device_info *vfio_get_device_info(int fd)
+{
+    struct vfio_device_info *info;
+    uint32_t argsz = sizeof(*info);
+
+    info = g_malloc0(argsz);
+
+retry:
+    info->argsz = argsz;
+
+    if (ioctl(fd, VFIO_DEVICE_GET_INFO, info)) {
+        g_free(info);
+        return NULL;
+    }
+
+    if (info->argsz > argsz) {
+        argsz = info->argsz;
+        info = g_realloc(info, argsz);
+        goto retry;
+    }
+
+    return info;
+}
+
 int vfio_get_device(VFIOGroup *group, const char *name,
                     VFIODevice *vbasedev, Error **errp)
 {
-    struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
-    int ret, fd;
+    g_autofree struct vfio_device_info *info = NULL;
+    int fd;
 
     fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
     if (fd < 0) {
@@ -2859,11 +2883,11 @@ int vfio_get_device(VFIOGroup *group, const char *name,
         return fd;
     }
 
-    ret = ioctl(fd, VFIO_DEVICE_GET_INFO, &dev_info);
-    if (ret) {
+    info = vfio_get_device_info(fd);
+    if (!info) {
         error_setg_errno(errp, errno, "error getting device info");
         close(fd);
-        return ret;
+        return -1;
     }
 
     /*
@@ -2891,14 +2915,14 @@ int vfio_get_device(VFIOGroup *group, const char *name,
     vbasedev->group = group;
     QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
 
-    vbasedev->num_irqs = dev_info.num_irqs;
-    vbasedev->num_regions = dev_info.num_regions;
-    vbasedev->flags = dev_info.flags;
+    vbasedev->num_irqs = info->num_irqs;
+    vbasedev->num_regions = info->num_regions;
+    vbasedev->flags = info->flags;
+
+    trace_vfio_get_device(name, info->flags, info->num_regions, info->num_irqs);
 
-    trace_vfio_get_device(name, dev_info.flags, dev_info.num_regions,
-                          dev_info.num_irqs);
+    vbasedev->reset_works = !!(info->flags & VFIO_DEVICE_FLAGS_RESET);
 
-    vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
     return 0;
 }
 
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index eed244f25f34..a8dcda592c08 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -212,6 +212,7 @@ void vfio_region_finalize(VFIORegion *region);
 void vfio_reset_handler(void *opaque);
 VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
 void vfio_put_group(VFIOGroup *group);
+struct vfio_device_info *vfio_get_device_info(int fd);
 int vfio_get_device(VFIOGroup *group, const char *name,
                     VFIODevice *vbasedev, Error **errp);
 
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
  2023-05-26 23:15 ` [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps Alex Williamson
  2023-05-26 23:15 ` [RFC PATCH v2 2/4] vfio: Implement a common device info helper Alex Williamson
@ 2023-05-26 23:15 ` Alex Williamson
  2023-05-30 12:30   ` Cédric Le Goater
  2023-05-31 22:02   ` Robin Voetter
  2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2023-05-26 23:15 UTC (permalink / raw)
  To: robin, mst, marcel.apfelbaum; +Cc: Alex Williamson, qemu-devel, clg

Report the PCIe capability version for a device

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/pci/pcie.c         | 7 +++++++
 include/hw/pci/pcie.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index b8c24cf45f7e..b7f107ed8dd4 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -274,6 +274,13 @@ uint8_t pcie_cap_get_type(const PCIDevice *dev)
             PCI_EXP_FLAGS_TYPE) >> PCI_EXP_FLAGS_TYPE_SHIFT;
 }
 
+uint8_t pcie_cap_get_version(const PCIDevice *dev)
+{
+    uint32_t pos = dev->exp.exp_cap;
+    assert(pos > 0);
+    return pci_get_word(dev->config + pos + PCI_EXP_FLAGS) & PCI_EXP_FLAGS_VERS;
+}
+
 /* MSI/MSI-X */
 /* pci express interrupt message number */
 /* 7.8.2 PCI Express Capabilities Register: Interrupt Message Number */
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index 3cc2b159570f..51ab57bc3c50 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -93,6 +93,7 @@ void pcie_cap_exit(PCIDevice *dev);
 int pcie_endpoint_cap_v1_init(PCIDevice *dev, uint8_t offset);
 void pcie_cap_v1_exit(PCIDevice *dev);
 uint8_t pcie_cap_get_type(const PCIDevice *dev);
+uint8_t pcie_cap_get_version(const PCIDevice *dev);
 void pcie_cap_flags_set_vector(PCIDevice *dev, uint8_t vector);
 uint8_t pcie_cap_flags_get_vector(PCIDevice *dev);
 
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
                   ` (2 preceding siblings ...)
  2023-05-26 23:15 ` [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper Alex Williamson
@ 2023-05-26 23:15 ` Alex Williamson
  2023-05-30 13:36   ` Cédric Le Goater
                     ` (2 more replies)
  2023-05-31 21:55 ` [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Robin Voetter
  2023-05-31 22:31 ` Philippe Mathieu-Daudé
  5 siblings, 3 replies; 19+ messages in thread
From: Alex Williamson @ 2023-05-26 23:15 UTC (permalink / raw)
  To: robin, mst, marcel.apfelbaum; +Cc: Alex Williamson, qemu-devel, clg

Dynamically enable Atomic Ops completer support around realize/exit of
vfio-pci devices reporting host support for these accesses and adhering
to a minimal configuration standard.  While the Atomic Ops completer
bits in the root port device capabilities2 register are read-only, the
PCIe spec does allow RO bits to change to reflect hardware state.  We
take advantage of that here around the realize and exit functions of
the vfio-pci device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 hw/vfio/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h |  1 +
 2 files changed, 79 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index bf27a3990564..d8a0fd595560 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1826,6 +1826,81 @@ static void vfio_add_emulated_long(VFIOPCIDevice *vdev, int pos,
     vfio_set_long_bits(vdev->emulated_config_bits + pos, mask, mask);
 }
 
+static void vfio_pci_enable_rp_atomics(VFIOPCIDevice *vdev)
+{
+    struct vfio_device_info_cap_pci_atomic_comp *cap;
+    g_autofree struct vfio_device_info *info = NULL;
+    PCIBus *bus = pci_get_bus(&vdev->pdev);
+    PCIDevice *parent = bus->parent_dev;
+    struct vfio_info_cap_header *hdr;
+    uint32_t mask = 0;
+    uint8_t *pos;
+
+    /*
+     * PCIe Atomic Ops completer support is only added automatically for single
+     * function devices downstream of a root port supporting DEVCAP2.  Support
+     * is added during realize and, if added, removed during device exit.  The
+     * single function requirement avoids conflicting requirements should a
+     * slot be composed of multiple devices with differing capabilities.
+     */
+    if (pci_bus_is_root(bus) || !parent || !parent->exp.exp_cap ||
+        pcie_cap_get_type(parent) != PCI_EXP_TYPE_ROOT_PORT ||
+        pcie_cap_get_version(parent) != PCI_EXP_FLAGS_VER2 ||
+        vdev->pdev.devfn ||
+        vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
+        return;
+    }
+
+    pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
+
+    /* Abort if there'a already an Atomic Ops configuration on the root port */
+    if (pci_get_long(pos) & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+                             PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
+                             PCI_EXP_DEVCAP2_ATOMIC_COMP128)) {
+        return;
+    }
+
+    info = vfio_get_device_info(vdev->vbasedev.fd);
+    if (!info) {
+        return;
+    }
+
+    hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP);
+    if (!hdr) {
+        return;
+    }
+
+    cap = (void *)hdr;
+    if (cap->flags & VFIO_PCI_ATOMIC_COMP32) {
+        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP32;
+    }
+    if (cap->flags & VFIO_PCI_ATOMIC_COMP64) {
+        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP64;
+    }
+    if (cap->flags & VFIO_PCI_ATOMIC_COMP128) {
+        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP128;
+    }
+
+    if (!mask) {
+        return;
+    }
+
+    pci_long_test_and_set_mask(pos, mask);
+    vdev->clear_parent_atomics_on_exit = true;
+}
+
+static void vfio_pci_disable_rp_atomics(VFIOPCIDevice *vdev)
+{
+    if (vdev->clear_parent_atomics_on_exit) {
+        PCIDevice *parent = pci_get_bus(&vdev->pdev)->parent_dev;
+        uint8_t *pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
+
+        pci_long_test_and_clear_mask(pos, PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+                                          PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
+                                          PCI_EXP_DEVCAP2_ATOMIC_COMP128);
+    }
+}
+
 static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
                                Error **errp)
 {
@@ -1929,6 +2004,8 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
                            QEMU_PCI_EXP_LNKCAP_MLS(QEMU_PCI_EXP_LNK_2_5GT), ~0);
             vfio_add_emulated_word(vdev, pos + PCI_EXP_LNKCTL, 0, ~0);
         }
+
+        vfio_pci_enable_rp_atomics(vdev);
     }
 
     /*
@@ -3265,6 +3342,7 @@ static void vfio_exitfn(PCIDevice *pdev)
         timer_free(vdev->intx.mmap_timer);
     }
     vfio_teardown_msi(vdev);
+    vfio_pci_disable_rp_atomics(vdev);
     vfio_bars_exit(vdev);
     vfio_migration_exit(&vdev->vbasedev);
 }
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 2674476d6c77..a2771b9ff3cc 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -174,6 +174,7 @@ struct VFIOPCIDevice {
     bool no_vfio_ioeventfd;
     bool enable_ramfb;
     bool defer_kvm_irq_routing;
+    bool clear_parent_atomics_on_exit;
     VFIODisplay *dpy;
     Notifier irqchip_change_notifier;
 };
-- 
2.39.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper
  2023-05-26 23:15 ` [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper Alex Williamson
@ 2023-05-30 12:30   ` Cédric Le Goater
  2023-05-31 22:02   ` Robin Voetter
  1 sibling, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2023-05-30 12:30 UTC (permalink / raw)
  To: Alex Williamson, robin, mst, marcel.apfelbaum; +Cc: qemu-devel

On 5/27/23 01:15, Alex Williamson wrote:
> Report the PCIe capability version for a device
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.

> ---
>   hw/pci/pcie.c         | 7 +++++++
>   include/hw/pci/pcie.h | 1 +
>   2 files changed, 8 insertions(+)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index b8c24cf45f7e..b7f107ed8dd4 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -274,6 +274,13 @@ uint8_t pcie_cap_get_type(const PCIDevice *dev)
>               PCI_EXP_FLAGS_TYPE) >> PCI_EXP_FLAGS_TYPE_SHIFT;
>   }
>   
> +uint8_t pcie_cap_get_version(const PCIDevice *dev)
> +{
> +    uint32_t pos = dev->exp.exp_cap;
> +    assert(pos > 0);
> +    return pci_get_word(dev->config + pos + PCI_EXP_FLAGS) & PCI_EXP_FLAGS_VERS;
> +}
> +
>   /* MSI/MSI-X */
>   /* pci express interrupt message number */
>   /* 7.8.2 PCI Express Capabilities Register: Interrupt Message Number */
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index 3cc2b159570f..51ab57bc3c50 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -93,6 +93,7 @@ void pcie_cap_exit(PCIDevice *dev);
>   int pcie_endpoint_cap_v1_init(PCIDevice *dev, uint8_t offset);
>   void pcie_cap_v1_exit(PCIDevice *dev);
>   uint8_t pcie_cap_get_type(const PCIDevice *dev);
> +uint8_t pcie_cap_get_version(const PCIDevice *dev);
>   void pcie_cap_flags_set_vector(PCIDevice *dev, uint8_t vector);
>   uint8_t pcie_cap_flags_get_vector(PCIDevice *dev);
>   



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps
  2023-05-26 23:15 ` [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps Alex Williamson
@ 2023-05-30 13:19   ` Cédric Le Goater
  0 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2023-05-30 13:19 UTC (permalink / raw)
  To: Alex Williamson, robin, mst, marcel.apfelbaum; +Cc: qemu-devel

On 5/27/23 01:15, Alex Williamson wrote:
> This is a partial linux-headers update for illustrative and testing
> purposes only, NOT FOR COMMIT.

Are you planing such an update for v6.5 ?

Thanks,

C.

> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>   linux-headers/linux/vfio.h | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4a534edbdcba..443a8851e156 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -240,6 +240,20 @@ struct vfio_device_info {
>   #define VFIO_DEVICE_INFO_CAP_ZPCI_UTIL		3
>   #define VFIO_DEVICE_INFO_CAP_ZPCI_PFIP		4
>   
> +/*
> + * The following VFIO_DEVICE_INFO capability reports support for PCIe AtomicOp
> + * completion to the root bus with supported widths provided via flags.
> + */
> +#define VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP	5
> +struct vfio_device_info_cap_pci_atomic_comp {
> +	struct vfio_info_cap_header header;
> +	__u32 flags;
> +#define VFIO_PCI_ATOMIC_COMP32	(1 << 0)
> +#define VFIO_PCI_ATOMIC_COMP64	(1 << 1)
> +#define VFIO_PCI_ATOMIC_COMP128	(1 << 2)
> +	__u32 reserved;
> +};
> +
>   /**
>    * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
>    *				       struct vfio_region_info)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports
  2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
@ 2023-05-30 13:36   ` Cédric Le Goater
  2023-05-31 22:03   ` Robin Voetter
  2023-05-31 22:28   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 19+ messages in thread
From: Cédric Le Goater @ 2023-05-30 13:36 UTC (permalink / raw)
  To: Alex Williamson, robin, mst, marcel.apfelbaum; +Cc: qemu-devel

On 5/27/23 01:15, Alex Williamson wrote:
> Dynamically enable Atomic Ops completer support around realize/exit of
> vfio-pci devices reporting host support for these accesses and adhering
> to a minimal configuration standard.  While the Atomic Ops completer
> bits in the root port device capabilities2 register are read-only, the
> PCIe spec does allow RO bits to change to reflect hardware state.  We
> take advantage of that here around the realize and exit functions of
> the vfio-pci device.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

LGTM. I am not sure about the single function restriction, may be that's
worth a warning ?

Thanks,

C.


> ---
>   hw/vfio/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
>   hw/vfio/pci.h |  1 +
>   2 files changed, 79 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index bf27a3990564..d8a0fd595560 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1826,6 +1826,81 @@ static void vfio_add_emulated_long(VFIOPCIDevice *vdev, int pos,
>       vfio_set_long_bits(vdev->emulated_config_bits + pos, mask, mask);
>   }
>   
> +static void vfio_pci_enable_rp_atomics(VFIOPCIDevice *vdev)
> +{
> +    struct vfio_device_info_cap_pci_atomic_comp *cap;
> +    g_autofree struct vfio_device_info *info = NULL;
> +    PCIBus *bus = pci_get_bus(&vdev->pdev);
> +    PCIDevice *parent = bus->parent_dev;
> +    struct vfio_info_cap_header *hdr;
> +    uint32_t mask = 0;
> +    uint8_t *pos;
> +
> +    /*
> +     * PCIe Atomic Ops completer support is only added automatically for single
> +     * function devices downstream of a root port supporting DEVCAP2.  Support
> +     * is added during realize and, if added, removed during device exit.  The
> +     * single function requirement avoids conflicting requirements should a
> +     * slot be composed of multiple devices with differing capabilities.
> +     */
> +    if (pci_bus_is_root(bus) || !parent || !parent->exp.exp_cap ||
> +        pcie_cap_get_type(parent) != PCI_EXP_TYPE_ROOT_PORT ||
> +        pcie_cap_get_version(parent) != PCI_EXP_FLAGS_VER2 ||
> +        vdev->pdev.devfn ||
> +        vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
> +        return;
> +    }
> +
> +    pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
> +
> +    /* Abort if there'a already an Atomic Ops configuration on the root port */
> +    if (pci_get_long(pos) & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                             PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
> +                             PCI_EXP_DEVCAP2_ATOMIC_COMP128)) {
> +        return;
> +    }
> +
> +    info = vfio_get_device_info(vdev->vbasedev.fd);
> +    if (!info) {
> +        return;
> +    }
> +
> +    hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP);
> +    if (!hdr) {
> +        return;
> +    }
> +
> +    cap = (void *)hdr;
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP32) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP32;
> +    }
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP64) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP64;
> +    }
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP128) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP128;
> +    }
> +
> +    if (!mask) {
> +        return;
> +    }
> +
> +    pci_long_test_and_set_mask(pos, mask);
> +    vdev->clear_parent_atomics_on_exit = true;
> +}
> +
> +static void vfio_pci_disable_rp_atomics(VFIOPCIDevice *vdev)
> +{
> +    if (vdev->clear_parent_atomics_on_exit) {
> +        PCIDevice *parent = pci_get_bus(&vdev->pdev)->parent_dev;
> +        uint8_t *pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
> +
> +        pci_long_test_and_clear_mask(pos, PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                                          PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
> +                                          PCI_EXP_DEVCAP2_ATOMIC_COMP128);
> +    }
> +}
> +
>   static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
>                                  Error **errp)
>   {
> @@ -1929,6 +2004,8 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size,
>                              QEMU_PCI_EXP_LNKCAP_MLS(QEMU_PCI_EXP_LNK_2_5GT), ~0);
>               vfio_add_emulated_word(vdev, pos + PCI_EXP_LNKCTL, 0, ~0);
>           }
> +
> +        vfio_pci_enable_rp_atomics(vdev);
>       }
>   
>       /*
> @@ -3265,6 +3342,7 @@ static void vfio_exitfn(PCIDevice *pdev)
>           timer_free(vdev->intx.mmap_timer);
>       }
>       vfio_teardown_msi(vdev);
> +    vfio_pci_disable_rp_atomics(vdev);
>       vfio_bars_exit(vdev);
>       vfio_migration_exit(&vdev->vbasedev);
>   }
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index 2674476d6c77..a2771b9ff3cc 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -174,6 +174,7 @@ struct VFIOPCIDevice {
>       bool no_vfio_ioeventfd;
>       bool enable_ramfb;
>       bool defer_kvm_irq_routing;
> +    bool clear_parent_atomics_on_exit;
>       VFIODisplay *dpy;
>       Notifier irqchip_change_notifier;
>   };



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/4] vfio: Implement a common device info helper
  2023-05-26 23:15 ` [RFC PATCH v2 2/4] vfio: Implement a common device info helper Alex Williamson
@ 2023-05-31 10:18   ` Robin Voetter
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Voetter @ 2023-05-31 10:18 UTC (permalink / raw)
  To: Alex Williamson, mst, marcel.apfelbaum
  Cc: qemu-devel, clg, Philippe Mathieu-Daudé

On 5/27/23 01:15, Alex Williamson wrote:
> -static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev,
> -                                                uint32_t argsz)
> +static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev);
>  {

There is an extraneous semicolon here behind the function declaration
that should be removed.

Kind regards,

Robin Voetter, Stream HPC


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
                   ` (3 preceding siblings ...)
  2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
@ 2023-05-31 21:55 ` Robin Voetter
  2023-05-31 22:24   ` Alex Williamson
  2023-05-31 22:31 ` Philippe Mathieu-Daudé
  5 siblings, 1 reply; 19+ messages in thread
From: Robin Voetter @ 2023-05-31 21:55 UTC (permalink / raw)
  To: Alex Williamson, mst, marcel.apfelbaum; +Cc: qemu-devel, clg

Hi Alex,

Thanks for taking the time to implement support for Atomic Op completer
support properly :). I have tested out these patches and the kernel
patch, and apart from a minor issue with patch 2 everything works fine;
ROCm programs that use device->host atomic operations work properly.

Something that I have been thinking about, are there any implications
involved with enabling this feature automatically with no possibility of
turning it off? I have no use case for that, though, and I cant really
think of a reason other than preventing the guest from finding out
hardware details about the host.

Thanks,

Robin Voetter, Stream HPC


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper
  2023-05-26 23:15 ` [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper Alex Williamson
  2023-05-30 12:30   ` Cédric Le Goater
@ 2023-05-31 22:02   ` Robin Voetter
  2023-05-31 22:19     ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 19+ messages in thread
From: Robin Voetter @ 2023-05-31 22:02 UTC (permalink / raw)
  To: Alex Williamson, mst, marcel.apfelbaum; +Cc: qemu-devel, clg

On 5/27/23 01:15, Alex Williamson wrote:
> Report the PCIe capability version for a device
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>Reviewed-by: Robin Voetter <robin@streamhpc.com>
Tested-by: Robin Voetter <robin@streamhpc.com>

Kind regards,

Robin Voetter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports
  2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
  2023-05-30 13:36   ` Cédric Le Goater
@ 2023-05-31 22:03   ` Robin Voetter
  2023-05-31 22:28   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 19+ messages in thread
From: Robin Voetter @ 2023-05-31 22:03 UTC (permalink / raw)
  To: Alex Williamson, mst, marcel.apfelbaum; +Cc: qemu-devel, clg



On 5/27/23 01:15, Alex Williamson wrote:
> Dynamically enable Atomic Ops completer support around realize/exit of
> vfio-pci devices reporting host support for these accesses and adhering
> to a minimal configuration standard.  While the Atomic Ops completer
> bits in the root port device capabilities2 register are read-only, the
> PCIe spec does allow RO bits to change to reflect hardware state.  We
> take advantage of that here around the realize and exit functions of
> the vfio-pci device.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Robin Voetter <robin@streamhpc.com>
Tested-by: Robin Voetter <robin@streamhpc.com>

Kind regards,

Robin Voetter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper
  2023-05-31 22:02   ` Robin Voetter
@ 2023-05-31 22:19     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-05-31 22:19 UTC (permalink / raw)
  To: Robin Voetter, Alex Williamson, mst, marcel.apfelbaum; +Cc: qemu-devel, clg

On 1/6/23 00:02, Robin Voetter wrote:
> On 5/27/23 01:15, Alex Williamson wrote:
>> Report the PCIe capability version for a device
>>
>> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Reviewed-by: Robin Voetter <robin@streamhpc.com>
Tested-by: Robin Voetter <robin@streamhpc.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> 
> Kind regards,
> 
> Robin Voetter
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-31 21:55 ` [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Robin Voetter
@ 2023-05-31 22:24   ` Alex Williamson
  2023-05-31 22:29     ` Philippe Mathieu-Daudé
  2023-06-01  8:15     ` Robin Voetter
  0 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2023-05-31 22:24 UTC (permalink / raw)
  To: Robin Voetter; +Cc: mst, marcel.apfelbaum, qemu-devel, clg

On Wed, 31 May 2023 23:55:41 +0200
Robin Voetter <robin@streamhpc.com> wrote:

> Hi Alex,
> 
> Thanks for taking the time to implement support for Atomic Op completer
> support properly :). I have tested out these patches and the kernel
> patch, and apart from a minor issue with patch 2 everything works fine;

Yes, Cedric noted the extra semicolon as well, I'm about to post that
patch as non-RFC since it has no dependencies on anything else.

> ROCm programs that use device->host atomic operations work properly.

Great, thanks for testing!

> Something that I have been thinking about, are there any implications
> involved with enabling this feature automatically with no possibility of
> turning it off? I have no use case for that, though, and I cant really
> think of a reason other than preventing the guest from finding out
> hardware details about the host.

Not sure I follow, are you suggesting that Atomic Ops completion
support is proactively enabled in the VM to match the host, regardless
of attached assigned devices?  An obvious problem there is migration.
If I start a VM on a host with Atomic Ops support and want to migrate
to a host without Atomic Ops support, config space of the root ports is
now different and the migration fails.  QEMU would also require
elevated privileges to read config space to determine the host support,
and then what does it do if only some of the PCIe hierarchy supports
Atomic Ops.

Policy decisions like that are generally left to management tools, so
there would always be a means to enable or disable the feature.  In
fact, that's specifically why I test that the Atomic Op completer bits
are unset in the root port before changing them so that this automatic
enablement could live alongside a command line option to statically
enable some bits.

That does however remind me that it is often good with these sorts of
"clever" automatic features to have an opt-out, so I'll likely add an
x-no-rp-atomics device option in the next version to provide that.
Thanks,

Alex



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports
  2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
  2023-05-30 13:36   ` Cédric Le Goater
  2023-05-31 22:03   ` Robin Voetter
@ 2023-05-31 22:28   ` Philippe Mathieu-Daudé
  2 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-05-31 22:28 UTC (permalink / raw)
  To: Alex Williamson, robin, mst, marcel.apfelbaum; +Cc: qemu-devel, clg

On 27/5/23 01:15, Alex Williamson wrote:
> Dynamically enable Atomic Ops completer support around realize/exit of
> vfio-pci devices reporting host support for these accesses and adhering
> to a minimal configuration standard.  While the Atomic Ops completer
> bits in the root port device capabilities2 register are read-only, the
> PCIe spec does allow RO bits to change to reflect hardware state.  We
> take advantage of that here around the realize and exit functions of
> the vfio-pci device.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>   hw/vfio/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
>   hw/vfio/pci.h |  1 +
>   2 files changed, 79 insertions(+)


> +static void vfio_pci_enable_rp_atomics(VFIOPCIDevice *vdev)
> +{
> +    struct vfio_device_info_cap_pci_atomic_comp *cap;
> +    g_autofree struct vfio_device_info *info = NULL;
> +    PCIBus *bus = pci_get_bus(&vdev->pdev);
> +    PCIDevice *parent = bus->parent_dev;
> +    struct vfio_info_cap_header *hdr;
> +    uint32_t mask = 0;
> +    uint8_t *pos;
> +
> +    /*
> +     * PCIe Atomic Ops completer support is only added automatically for single
> +     * function devices downstream of a root port supporting DEVCAP2.  Support
> +     * is added during realize and, if added, removed during device exit.  The
> +     * single function requirement avoids conflicting requirements should a
> +     * slot be composed of multiple devices with differing capabilities.
> +     */
> +    if (pci_bus_is_root(bus) || !parent || !parent->exp.exp_cap ||
> +        pcie_cap_get_type(parent) != PCI_EXP_TYPE_ROOT_PORT ||
> +        pcie_cap_get_version(parent) != PCI_EXP_FLAGS_VER2 ||
> +        vdev->pdev.devfn ||
> +        vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
> +        return;
> +    }
> +
> +    pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
> +
> +    /* Abort if there'a already an Atomic Ops configuration on the root port */

Optional here: trace event logging pci_get_long(pos).

> +    if (pci_get_long(pos) & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                             PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
> +                             PCI_EXP_DEVCAP2_ATOMIC_COMP128)) {
> +        return;
> +    }
> +
> +    info = vfio_get_device_info(vdev->vbasedev.fd);
> +    if (!info) {
> +        return;
> +    }
> +
> +    hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_PCI_ATOMIC_COMP);
> +    if (!hdr) {
> +        return;
> +    }
> +
> +    cap = (void *)hdr;
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP32) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP32;
> +    }
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP64) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP64;
> +    }
> +    if (cap->flags & VFIO_PCI_ATOMIC_COMP128) {
> +        mask |= PCI_EXP_DEVCAP2_ATOMIC_COMP128;
> +    }
> +
> +    if (!mask) {
> +        return;
> +    }

Similarly optional, trace event logging (cap->flags, mask).

> +
> +    pci_long_test_and_set_mask(pos, mask);
> +    vdev->clear_parent_atomics_on_exit = true;
> +}
> +
> +static void vfio_pci_disable_rp_atomics(VFIOPCIDevice *vdev)
> +{
> +    if (vdev->clear_parent_atomics_on_exit) {
> +        PCIDevice *parent = pci_get_bus(&vdev->pdev)->parent_dev;
> +        uint8_t *pos = parent->config + parent->exp.exp_cap + PCI_EXP_DEVCAP2;
> +
> +        pci_long_test_and_clear_mask(pos, PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                                          PCI_EXP_DEVCAP2_ATOMIC_COMP64 |
> +                                          PCI_EXP_DEVCAP2_ATOMIC_COMP128);
> +    }
> +}

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-31 22:24   ` Alex Williamson
@ 2023-05-31 22:29     ` Philippe Mathieu-Daudé
  2023-06-02 14:02       ` Philippe Mathieu-Daudé
  2023-06-01  8:15     ` Robin Voetter
  1 sibling, 1 reply; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-05-31 22:29 UTC (permalink / raw)
  To: Alex Williamson, Robin Voetter; +Cc: mst, marcel.apfelbaum, qemu-devel, clg

On 1/6/23 00:24, Alex Williamson wrote:
> On Wed, 31 May 2023 23:55:41 +0200
> Robin Voetter <robin@streamhpc.com> wrote:
> 
>> Hi Alex,
>>
>> Thanks for taking the time to implement support for Atomic Op completer
>> support properly :). I have tested out these patches and the kernel
>> patch, and apart from a minor issue with patch 2 everything works fine;
> 
> Yes, Cedric noted the extra semicolon as well, I'm about to post that
> patch as non-RFC since it has no dependencies on anything else.


> Policy decisions like that are generally left to management tools, so
> there would always be a means to enable or disable the feature.  In
> fact, that's specifically why I test that the Atomic Op completer bits
> are unset in the root port before changing them so that this automatic
> enablement could live alongside a command line option to statically
> enable some bits.
> 
> That does however remind me that it is often good with these sorts of
> "clever" automatic features to have an opt-out, so I'll likely add an
> x-no-rp-atomics device option in the next version to provide that.

I was just going to suggest that, just in case :)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
                   ` (4 preceding siblings ...)
  2023-05-31 21:55 ` [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Robin Voetter
@ 2023-05-31 22:31 ` Philippe Mathieu-Daudé
  5 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-05-31 22:31 UTC (permalink / raw)
  To: Alex Williamson, robin, mst, marcel.apfelbaum; +Cc: qemu-devel, clg

On 27/5/23 01:15, Alex Williamson wrote:
> This RFC proposes to allow a vfio-pci device to manipulate the PCI
> Express capability of an associated root port to enable Atomic Op
> completer support as equivalent to host capabilities.  This would
[...]

> While it's not exactly standard practice to modify root port device
> capabilities runtime, it also does not seem to be precluded by the PCIe
> Spec (6.0.1).  The Atomic Op completion bits of the DEVCAP2 register
> are defined as Read-only:
> 
> 7.4 Configuration Register Types
>   Read-only - Register bits are read-only and cannot be altered by software.
>               Where explicitly defined, these bits are used to reflect changing
>               hardware state, and as a result bit values can be observed to
>               change at run time. Register bit default values and bits that
>               cannot change value at run time, are permitted to be hard-coded,
>               initialized by system/device firmware, or initialized by hardware
>               mechanisms such as pin strapping or nonvolatile storage.
>               Initialization by system firmware is permitted only for system-
>               integrated devices. If the optional feature that would Set the
>               bits is not implemented, the bits must be hardwired to Zero.
> 
> Here "altered by software" is relative to guest writes to the config
> space register, whereas in this implementation we're acting as hardware
> and the bits are changing to reflect a change in runtime capabilities.
> The spec does include a HwInit register type which would restrict the
> value from changing at runtime outside of resets.  Therefore while it
> would not be advised to update these bits arbitrarily, it does seem safe
> and compatible with guest software to update the value on device attach
> and detach.

 From my previous (short) PCIe experience, this is also my understanding.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-31 22:24   ` Alex Williamson
  2023-05-31 22:29     ` Philippe Mathieu-Daudé
@ 2023-06-01  8:15     ` Robin Voetter
  1 sibling, 0 replies; 19+ messages in thread
From: Robin Voetter @ 2023-06-01  8:15 UTC (permalink / raw)
  To: Alex Williamson; +Cc: mst, marcel.apfelbaum, qemu-devel, clg



On 6/1/23 00:24, Alex Williamson wrote:
> On Wed, 31 May 2023 23:55:41 +0200
> Robin Voetter <robin@streamhpc.com> wrote:
>> Something that I have been thinking about, are there any implications
>> involved with enabling this feature automatically with no possibility of
>> turning it off? I have no use case for that, though, and I cant really
>> think of a reason other than preventing the guest from finding out
>> hardware details about the host.
> 
> Not sure I follow, are you suggesting that Atomic Ops completion
> support is proactively enabled in the VM to match the host, regardless
> of attached assigned devices?  An obvious problem there is migration.
> If I start a VM on a host with Atomic Ops support and want to migrate
> to a host without Atomic Ops support, config space of the root ports is
> now different and the migration fails.  QEMU would also require
> elevated privileges to read config space to determine the host support,
> and then what does it do if only some of the PCIe hierarchy supports
> Atomic Ops.
> 
> Policy decisions like that are generally left to management tools, so
> there would always be a means to enable or disable the feature.  In
> fact, that's specifically why I test that the Atomic Op completer bits
> are unset in the root port before changing them so that this automatic
> enablement could live alongside a command line option to statically
> enable some bits.
> 
> That does however remind me that it is often good with these sorts of
> "clever" automatic features to have an opt-out, so I'll likely add an
> x-no-rp-atomics device option in the next version to provide that.

Yes, something like that that is exactly what I was thinking about.

Thanks,

Robin Voetter


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support
  2023-05-31 22:29     ` Philippe Mathieu-Daudé
@ 2023-06-02 14:02       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 19+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-06-02 14:02 UTC (permalink / raw)
  To: Markus Armbruster, Marc-André Lureau
  Cc: mst, Robin Voetter, Alex Williamson, marcel.apfelbaum, qemu-devel, clg

Hi Markus, Marc-André,

> On 1/6/23 00:24, Alex Williamson wrote:
>> Policy decisions like that are generally left to management tools, so
>> there would always be a means to enable or disable the feature.  In
>> fact, that's specifically why I test that the Atomic Op completer bits
>> are unset in the root port before changing them so that this automatic
>> enablement could live alongside a command line option to statically
>> enable some bits.
>>
>> That does however remind me that it is often good with these sorts of
>> "clever" automatic features to have an opt-out, so I'll likely add an
>> x-no-rp-atomics device option in the next version to provide that.

Still thinking about this series I remembered since commit a3c45b3e62
("qapi: New special feature flag "unstable"") the "x-" prefix is
obsolete (superseded) in QAPI generated code.

I wonder about device properties:

$ git grep -F '"x-' hw | wc -l
      130

$ git grep -F '"x-' hw/vfio/pci.c
hw/vfio/pci.c:2987:        error_setg(errp, "x-balloon-allowed only 
potentially compatible "
hw/vfio/pci.c:3335: 
DEFINE_PROP_ON_OFF_AUTO("x-pre-copy-dirty-page-tracking", VFIOPCIDevice,
hw/vfio/pci.c:3342:    DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", 
VFIOPCIDevice,
hw/vfio/pci.c:3344:    DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
hw/vfio/pci.c:3346:    DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features,
hw/vfio/pci.c:3348:    DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, 
features,
hw/vfio/pci.c:3350:    DEFINE_PROP_BOOL("x-enable-migration", VFIOPCIDevice,
hw/vfio/pci.c:3352:    DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, 
vbasedev.no_mmap, false),
hw/vfio/pci.c:3353:    DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice,
hw/vfio/pci.c:3355:    DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, 
no_kvm_intx, false),
hw/vfio/pci.c:3356:    DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, 
no_kvm_msi, false),
hw/vfio/pci.c:3357:    DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, 
no_kvm_msix, false),
hw/vfio/pci.c:3358:    DEFINE_PROP_BOOL("x-no-geforce-quirks", 
VFIOPCIDevice,
hw/vfio/pci.c:3360:    DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", 
VFIOPCIDevice, no_kvm_ioeventfd,
hw/vfio/pci.c:3362:    DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", 
VFIOPCIDevice, no_vfio_ioeventfd,
hw/vfio/pci.c:3364:    DEFINE_PROP_UINT32("x-pci-vendor-id", 
VFIOPCIDevice, vendor_id, PCI_ANY_ID),
hw/vfio/pci.c:3365:    DEFINE_PROP_UINT32("x-pci-device-id", 
VFIOPCIDevice, device_id, PCI_ANY_ID),
hw/vfio/pci.c:3366:    DEFINE_PROP_UINT32("x-pci-sub-vendor-id", 
VFIOPCIDevice,
hw/vfio/pci.c:3368:    DEFINE_PROP_UINT32("x-pci-sub-device-id", 
VFIOPCIDevice,
hw/vfio/pci.c:3370:    DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, 
igd_gms, 0),
hw/vfio/pci.c:3371: 
DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice,
hw/vfio/pci.c:3374:    DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", 
VFIOPCIDevice, msix_relo,

Is there a plan to use something similar for QOM properties?
Is it OK to keep using the "x-" prefix there?

Thanks,

Phil.


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-06-02 14:03 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-26 23:15 [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Alex Williamson
2023-05-26 23:15 ` [RFC PATCH v2 1/4] linux-headers: Update for vfio capability reporting AtomicOps Alex Williamson
2023-05-30 13:19   ` Cédric Le Goater
2023-05-26 23:15 ` [RFC PATCH v2 2/4] vfio: Implement a common device info helper Alex Williamson
2023-05-31 10:18   ` Robin Voetter
2023-05-26 23:15 ` [RFC PATCH v2 3/4] pcie: Add a PCIe capability version helper Alex Williamson
2023-05-30 12:30   ` Cédric Le Goater
2023-05-31 22:02   ` Robin Voetter
2023-05-31 22:19     ` Philippe Mathieu-Daudé
2023-05-26 23:15 ` [RFC PATCH v2 4/4] vfio/pci: Enable AtomicOps completers on root ports Alex Williamson
2023-05-30 13:36   ` Cédric Le Goater
2023-05-31 22:03   ` Robin Voetter
2023-05-31 22:28   ` Philippe Mathieu-Daudé
2023-05-31 21:55 ` [RFC PATCH v2 0/4] vfio/pci: Atomic Ops completer support Robin Voetter
2023-05-31 22:24   ` Alex Williamson
2023-05-31 22:29     ` Philippe Mathieu-Daudé
2023-06-02 14:02       ` Philippe Mathieu-Daudé
2023-06-01  8:15     ` Robin Voetter
2023-05-31 22:31 ` Philippe Mathieu-Daudé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.